- Published on
Top Tech Tweets - May 16, 2026
- Authors

- Name
- geeknotes
Tech Daily: Morning Briefing, May 16, 2026
Today's top tech conversations are led by @kimmonismus, whose post 'So OpenAI literally kill*d man...' drew the highest engagement. Trending themes across the top stories include billion, model, scientific, and OpenAI. The community is actively discussing recent developments in AI, engineering practices, and startup strategies.
1. kimmonismus (Group Score: 327.2 | Individual: 58.6)
Cluster: 13 tweets | Engagement: 2406 (Avg: 345) | Type: Tech
So OpenAI literally kill*d many fintech startups today
OpenAI launched a personal finance feature in ChatGPT for Pro users in the US.
You connect your bank accounts via Plaid, get a spending dashboard, and can ask GPT-5.5 questions grounded in your actual transaction data - balances, spending patterns, subscriptions, investments.
It can't see full account numbers or move money. Intuit integration is coming for things like tax estimates and credit card applications.
Financial memories store context like savings goals across conversations.
Plus users get it later, free tier eventually. They built an internal benchmark with 50+ finance professionals and say GPT-5.5 Thinking scores 79/100, GPT-5.5 Pro 82.5/100 on complex personal finance tasks.
QT @ChatGPTapp: A preview for Pro users: a new personal finance experience in ChatGPT.
Pro users in the U.S. can securely connect financial accounts, see where their money is going, and ask questions based on the information they choose to connect.
Your full financial picture, now in ChatGPT. https://t.co/NjbJqOqFRi
See 12 related tweets
- @shiri_shh: OpenAI has been shipping like a startup again, @sama and the team are everywhere on the timeline.
...
- @TeksEdge: 🚀 OpenAI just launched Personal Finance in ChatGPT. Here's how it compares to Anthropic Finance.
💬 ...
- @VaibhavSisinty: OpenAI just mass murdered every personal finance startup in a single launch. 🤯
ChatGPT now connects...
- @TFTC21: OpenAI is partnering with Plaid to let ChatGPT users connect their bank accounts, credit cards, inve...
- @testingcatalog: OPENAI 🔥: A new Personal Finance feature is rolling out to Pro ChatGPT users in the US.
This featu...
2. WesRoth (Group Score: 210.5 | Individual: 35.0)
Cluster: 7 tweets | Engagement: 17 (Avg: 26) | Type: Tech
xAI has launched an early beta of Grok Build, an agentic command-line interface for coding, app building, and workflow automation.
It is currently available to SuperGrok Heavy subscribers, with xAI using the beta period to gather feedback and improve both the model and the product.
QT @xai: An early beta of Grok Build, an agentic CLI for coding, building apps, and automating workflows is now available for SuperGrok Heavy subscribers.
Through this early beta, we will improve the model and product based on your feedback.
Try it at https://t.co/bpTHpjivWD https://t.co/Rlg4qMLkrv
See 6 related tweets
- @tetsuoai: 40 Grok Build agents tearing through C code in parallel. All supervised by DAD.
DAD is a lightweigh...
- @Gorden_Sun: Grok Build has been released: an agent CLI used in the terminal, available only to Heavy subscribers. Nobody is going to use it, right? After all, there aren't many Heavy subscribers.
QT @xai: An early beta ...
- @elonmusk: Go in with expectations that Grok Build is still beta, but improving almost every day
QT @myrhex:...
- @XFreeze: MASSIVE UPDATE FOR xAI DEVELOPERS
xAI is officially retiring several older models today Deprecated ...
- @Cointelegraph: 🚨 TODAY: Elon Musk's xAI unveils Grok Build, an agentic CLI for coding, building apps, and automatin...
3. MTSlive (Group Score: 193.6 | Individual: 40.1)
Cluster: 8 tweets | Engagement: 585 (Avg: 100) | Type: Tech
SITUATION DETECTED: OpenAI has announced a major reorg to unify ChatGPT and Codex.
Greg Brockman is officially taking over all OpenAI products. Head of Codex Thibault Sottiaux moves to lead core product and platform, and Head of ChatGPT Nick Turley takes on enterprise products.
See 7 related tweets
- @lennysan: I'm actually surprised how rarely these big reorgs at top AI companies happen, given their pace. The...
- @wallstengine: OPENAI FOLDS CHATGPT, CODEX AND API INTO ONE PRODUCT TEAM
WIRED reports Greg Brockman is now offici...
- @peard33: RT @ZeffMax: Scoop: OpenAI announced another major reorg on Friday, as part of its effort to unify C...
- @mark_k: News: @OpenAI is reorganizing around one very clear direction: ChatGPT and Codex are moving closer t...
- @Techmeme: OpenAI memo: Greg Brockman says he will lead product strategy as part of a reorg, folding ChatGPT, C...
4. jxnlco (Group Score: 179.7 | Individual: 46.5)
Cluster: 9 tweets | Engagement: 3866 (Avg: 189) | Type: Tech
RT @OpenAI: You've been asking for this one...
Now in preview: Codex in the ChatGPT mobile app.
Start new work, review outputs, steer execution, and approve next steps, all from the ChatGPT mobile app. Codex will keep running on your laptop, Mac mini, or devbox. https://t.co/9i2Jckjt9z
See 8 related tweets
- @mark_k: Codex Mobile is here.
@OpenAI built a remote control for your Codex sessions right into the ChatGPT...
- @AlexFinn: New Codex mobile app is awesome
I truly believe it now makes the iPad the best vibe coding device o...
- @WesRoth: OpenAI is bringing Codex into the ChatGPT mobile app in preview on iOS and Android.
Users can start...
- @ChatGPTapp: Touch grass and leave your laptop—Codex is now on your phone.
Now in preview on the ChatGPT mobile ...
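- @Gorden_Sun: The mobile version of Codex is live, built into the ChatGPT app, and it can remotely control Codex on your computer. Codex has been on a roll lately! The ChatGPT subscription is great value!
QT @OpenAI: You've been as...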
5. weezerOSINT (Group Score: 116.9 | Individual: 33.8)
Cluster: 5 tweets | Engagement: 21 (Avg: 382) | Type: Tech
What would actually be good for the world is publicly funded open models, mandatory safety standards for US labs, affordable AI access for developing nations, real international governance. they appear nowhere in this document, because none of those things make Anthropic money.
QT @AnthropicAI: We've published a paper that explains our views on AI competition between the US and China.
The US and democratic allies hold the lead in frontier AI today. Read more on what it’ll take to keep that lead: https://t.co/TgJBeodWYK
See 4 related tweets
- @NielsRogge: Still incredible funny to me to see how much of the current Claude models are built on top of open-s...
- @WesRoth: Anthropic published a new paper arguing that the U.S. and democratic allies currently lead China in ...
- @MatthewBerman: I wonder why Anthropic thinks 2028 is the year we need to solidify our lead against China🤔 https://t...
- @theobearman: https://t.co/cL54CAqbs0\n\nQT @AnthropicAI: We've published a paper that explains our views on AI co...
6. TeksEdge (Group Score: 116.7 | Individual: 42.3)
Cluster: 3 tweets | Engagement: 102 (Avg: 21) | Type: Tech
🚨 New Model Alert: Intern-S2-Preview (35B) from @intern_lm is an absolute monster in scientific & multimodal tasks! 🧪🔬
🤯 According to benchmarks, it beats Qwen3.6-35B and Gemma4-26B across most scientific benchmarks and leads in several general tasks. 📊
⛈️ Another Localmaxxing win! Could it be the next medium model winner and steal the thunder from Qwen3.6-35B and Gemma4-26B?
Standout wins:
🏆 MicroVQA: 66.22
🏆 MolecularIQ: 57.26
🏆 SGI-Bench (Scientific Agent): 52.52
🏆 MMLU Pro: 88.00
🏆 MMMU Pro: 76.88
🏆 IMO-Bench: 84.00
🏆 MathVision: 83.36
🏆 HMMT-2026: 87.31
🏆 PinchBench (Coding Agent): 88.22
⚡ It delivers performance comparable to the trillion-parameter Intern-S1-Pro on many scientific tasks while being far more efficient.
🧬 First open-source model with material crystal structure generation capabilities + strong scientific agent performance.
💾 Note: The FP8 version is 38.5GB → best suited for dual RTX 3090/4090/5090 setups. Perfect for Strix Halo or Mac Studio 64GB.
✅ Fully open source (Apache 2.0) ✅ Supports thinking mode + tool calling ✅ Available on vLLM & SGLang
🌟A major win for open-source scientific AI!
QT @intern_lm: 🥳Introducing Intern-S2-Preview, an efficient 35B scientific multimodal foundation model.
1⃣Delivers performance comparable to the trillion-scale Intern-S1-Pro on core scientific tasks.
2⃣The first open-source model with material crystal structure generation capabilities and strong general capabilities.
3⃣Significantly stronger scientific agent capabilities on multiple benchmarks.
4⃣Improves MTP acceptance rate and token generation speed via shared-weight MTP + KL loss.
5⃣CoT compression shortens responses while preserving strong reasoning, improving both performance and efficiency.
🥰Now supported by vLLM (@vllm_project) and SGLang (@lmsysorg), with more ecosystem integrations on the way.
🤗Model: @huggingface https://t.co/dHXpP56xWk @ModelScope2022 https://t.co/zjfW2B0fWq
🤗GitHub: https://t.co/ImW2TzgxRh
🤗Try it now at: https://t.co/OpebPDIv5x
See 2 related tweets
- @vllm_project: 🎉 Day-0 vLLM support for Intern-S2-Preview!
Congrats to the @intern_lm team — an open-source scient...
- @lmsysorg: 🚀 Day-0 SGLang support is live for Intern-S2-Preview! This is a 35B scientific multimodal foundation...
7. aakashgupta (Group Score: 106.0 | Individual: 36.7)
Cluster: 4 tweets | Engagement: 70 (Avg: 54) | Type: Tech
This is wild. Anthropic is raising at a $900 billion valuation.
The company was worth $380 billion three months ago.
In September 2025, Anthropic's valuation was $380 billion. Now investors are fighting to get in at $852 billion in March.
The revenue explains the frenzy.
Anthropic's annualized run rate was $9 billion. In February 2026, $19 billion. April, $40 billion. Salesforce took 20 years to reach $30 billion in annual revenue. Anthropic did it in under three.
One product drove the acceleration. Claude Code, their AI coding tool, hit $2.5 billion. A single developer tool producing more revenue than most public SaaS companies.
The enterprise numbers confirm this isn't hype-driven consumer growth. Over 1,000 business customers now spend more than $1 million per year. That number was 500 in February. It doubled in less than two months. Eight of the Fortune 10 are paying clients.
Gross margins went from 38% a year ago to over 70% today. The bear case for AI companies has always been that compute costs eat the business. Anthropic's margins are moving in the wrong direction for the bears.
Google has committed $30 billion more tied to performance targets. Amazon committed $20 billion more. The company also signed a compute deal with SpaceX to use excess capacity from xAI's Colossus cluster.
An IPO is reportedly planned for October.
A company founded in 2021 by people who left OpenAI over safety disagreements is about to be worth a trillion dollars.
See 3 related tweets
- @MTSlive: DAILY SITUATION RECAP:
Anthropic is raising at a $900B valuation. The investment will be co-led by Drago...
- @WSJ: Anthropic is raising more than $30 billion as the AI giant seeks funding ahead of its widely expecte...
- @WSJTech: Anthropic is raising more than $30 billion as the AI giant seeks funding ahead of its widely expecte...
8. dr_cintas (Group Score: 96.8 | Individual: 31.3)
Cluster: 4 tweets | Engagement: 139 (Avg: 107) | Type: Tech
Now everyone can make 60-minute films from ONE prompt 🤯
Higgsfield Supercomputer looks insane.
A cloud-native AI agent that unifies every model, tool, and creative workflow into one system.
- Type one prompt - specify your length and type (cartoon or cinematic)
- Supercomputer picks the workflow
- It deploys a swarm of agents
- Every sub-task routes across the right frontier LLM (Opus 4.7, GPT-5.5 Pro, Gemini 3.1 Pro) and the right video model (Seedance, Veo, Kling)
- A finished cartoon or movie lands
A full film studio running inside one agent. Wild.
QT @higgsfield_ai: Higgsfield just released Supercomputer.
A cloud-native AI agent that unifies every model, tool, and creative workflow into one system.
It can research, write, design, generate video, and ship campaigns end-to-end. https://t.co/oh1nQVlKjG
See 3 related tweets
- @AngryTomtweets: AI has gone too far... a full film studio running inside one agent.
this is what one prompt looks l...
- @dr_cintas: RT @dr_cintas: Now everyone can make 60-minute films from ONE prompt 🤯
Higgsfield Supercomputer loo...
- @heyrobinai: RT @AngryTomtweets: AI has gone too far... a full film studio running inside one agent.
this is wha...
9. seraleev (Group Score: 96.4 | Individual: 34.9)
Cluster: 3 tweets | Engagement: 64 (Avg: 52) | Type: Tech
To reach $100K/month, you only need to master one ad channel.
Don’t spread yourself thin.
Pick one. Master it. Add a second only after you hit a plateau.
I started with Google Ads, now adding Apple Search Ads.
P.S. If I were starting today, I’d pick ASA: full attribution, easy to start, fast results. TikTok didn’t work for me. Tried making videos for 1.5 months, burned out, picked channels where I’m comfortable.
QT @seraleev: A simple recipe for growth (after 5 years on mobile development):
- Find your scale and what you’re done with
Ship a lot. 10, 12 apps if that’s what it takes.
Clarity comes from building, not planning.
You only learn what works by shipping.
- Pick one paid channel. Go deep.
Start with what you can afford.
Test slowly. Stay patient.
Master one channel before you touch the next.
- Reinvest. Every month. No exceptions.
A big chunk of revenue goes back into growth.
No toys. No shortcuts.
Reinvestment outlasts motivation every time.
No magic formula.
No growth hacks.
Just shipping, patience, and putting money back in month after month.
See 2 related tweets
- @seraleev: I don’t like relying on luck.
So to find what would actually get traction, I launched a 10-app chal...
- @seraleev: A simple recipe for growth (after 5 years on mobile development):
- Find your scale and what you’r...
10. PrimeIntellect (Group Score: 93.6 | Individual: 30.1)
Cluster: 4 tweets | Engagement: 248 (Avg: 45) | Type: Tech
RT @PrimeIntellect: Automating AI research is the next major step in AI
We let Claude Code (Opus 4.7) and Codex (GPT 5.5) run autonomously on the nanoGPT speedrun optimizer track using our idle compute. ~10k runs, ~14k H200 hours
Opus now holds the record at 2930 steps vs the 2990 human baseline https://t.co/B1aYxlbKMP
See 3 related tweets
- @scaling01: brutal Claude mog https://t.co/yUXk5YplV8
QT @PrimeIntellect: Automating AI research is the next ...
- @scaling01: now imagine how brutal the mog is with Mythos
this is a slight update against OpenAI pulling ahead ...
- @RobertHaisfield: My guess is that the difference between Claude and GPT here:
- Claude thought big about what could u...
11. badlogicgames (Group Score: 93.6 | Individual: 61.1)
Cluster: 2 tweets | Engagement: 1106 (Avg: 80) | Type: Tech
RT @mitchellh: I strongly believe there are entire companies right now under heavy AI psychosis and it's impossible to have rational conversations about it with them. I can't name any specific people because they include personal friends I deeply respect, but I worry about how this plays out.
I lived through the great MTBF vs MTTR (mean-time-between-failure vs. mean-time-to-recovery) reckoning of infrastructure during the transition to cloud and cloud automation. All those arguments are rearing their ugly heads again but now it's... the whole software development industry (maybe the whole world, really).
It's frightening, because the psychosis folks operate under an almost absolute "MTTR is all you need" mentality: "it's fine to ship bugs because the agents will fix them so quickly and at a scale humans can't do!" We learned in infrastructure that MTTR is great but you can't yeet resilient systems entirely.
The main issue is I don't even know how to bring this up to people I know personally, because bringing this topic up leads to immediate dismissals like "no no, it has full test coverage" or "bug reports are going down" or something, which just don't paint the whole picture.
We already learned this lesson once in infrastructure: you can automate yourself into a very resilient catastrophe machine. Systems can appear healthy by local metrics while globally becoming incomprehensible. Bug reports can go down while latent risk explodes. Test coverage can rise while semantic understanding falls. Changes happen so fast that nobody notices the underlying architecture decaying.
I worry.
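The MTBF vs MTTR trade-off in this thread can be made concrete with the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). The following is a minimal sketch, not something from the thread: the function names and every number are hypothetical. It shows two systems with identical availability where one fails ten times as often, which is the kind of latent risk a single healthy-looking metric can hide.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def failures_per_year(mtbf_hours: float, mttr_hours: float) -> float:
    """Expected number of failure/repair cycles in one year of wall-clock time."""
    hours_per_year = 24 * 365
    return hours_per_year / (mtbf_hours + mttr_hours)


# Hypothetical systems, chosen so that availability is identical.
rare_slow = (1000.0, 1.0)   # fails rarely, recovers slowly
often_fast = (100.0, 0.1)   # fails often, recovers quickly

if __name__ == "__main__":
    for name, (mtbf, mttr) in {"rare_slow": rare_slow, "often_fast": often_fast}.items():
        print(name, round(availability(mtbf, mttr), 6), round(failures_per_year(mtbf, mttr), 2))
```

Availability alone cannot tell these two systems apart; the failure count can, which is why a pure "MTTR is all you need" view leaves part of the picture out.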
See 1 related tweets
- @zeeg: There is a lack of grounding with many folks. HOWEVER, counter point.
Most strong leaders arent ver...
12. wallstengine (Group Score: 92.4 | Individual: 44.4)
Cluster: 3 tweets | Engagement: 765 (Avg: 129) | Type: Tech
Bill Ackman says Pershing Square made $MSFT “a core holding” after it began building the position in February. He says Microsoft “offers compelling value.” https://t.co/xbbKftww2s
See 2 related tweets
- @StockMKTNewz: Bill Ackman just said that his hedge fund Pershing Square has built a large holding in
Microsoft $...
- @CNBC: Ackman's Pershing Square takes stake in Microsoft, citing 'compelling' valuation https://t.co/jctXYP...
13. natolambert (Group Score: 92.4 | Individual: 22.4)
Cluster: 5 tweets | Engagement: 35 (Avg: 206) | Type: Tech
I've been re-reading a lot of Bill's foundational blogs on open-source business strategy, so I was so happy that he wrote an updated version on it and wrt AI. Must read.
QT @bgurley: A new @bgurley blog post!
I have been thinking about how sophisticated executives are using open source in super creative ways. Started writing this three years ago. Excited to finish it up and publish it! And with the new @p3institute brand.
See 4 related tweets
- @xeophon: amazing post and great timing w.r.t. ant's post yesterday
we must build open ai to not get locked i...
- @AndrewCurran_: From the article: https://t.co/ABsgZRjHww
QT @bgurley: A new @bgurley blog post!
I have been thi...
- @zephyr_z9: Very good
QT @bgurley: A new @bgurley blog post!
I have been thinking about how sophisticated ex...
- @stanfordnlp: RT @bgurley: A new @bgurley blog post!
I have been thinking about how sophisticated executives are ...
14. BrianRoemmele (Group Score: 92.2 | Individual: 52.8)
Cluster: 2 tweets | Engagement: 6494 (Avg: 491) | Type: Tech
RT @elonmusk: Critique of the 𝕏 algorithm is welcome.
There will be monthly updates of the latest algorithm to GitHub with release notes.
As reminder, you can always choose no algorithm via the Following tab.
See 1 related tweets
- @Kyrannio: RT @elonmusk: The latest 𝕏 algorithm has been published to GitHub https://t.co/ZCOm51uxmh...
15. MatthewBerman (Group Score: 90.1 | Individual: 28.2)
Cluster: 4 tweets | Engagement: 115 (Avg: 312) | Type: Tech
This is absolute BS and an attempted regulatory capture by Anthropic. The knowledge behind CBRN attacks is already online, where do you think the models learned it from??
“Compounding the problem, labs in China often release dual-use capable models as open-weight. Once a model is open-weight, safeguards that do exist can be removed, making the model available to any state or non-state actor to use for malicious purposes, including the cyber and CBRN misuse those safeguards were built to prevent.”
QT @MatthewBerman: Anthropic: Chinese AI is a threat.
They've correctly identified the risks, including cheap Chinese AI capturing American businesses even when it's less capable.
But they completely blundered the solution: zero mention of an American open source strategy. In fact, they actively campaign AGAINST open source. 🤦♂️
Full breakdown of their paper from today:
See 3 related tweets
- @teortaxesTex: Would be funny if the US is willing to give the Chinese state compute for frontier AGI on the mode...
- @TheAhmadOsman: RT @MatthewBerman: This is absolute BS and an attempted regulatory capture by Anthropic. The knowled...
- @MatthewBerman: The US just cleared NVIDIA to sell H200 chips to China. Exactly the opposite of what Anthropic said ...
16. ibuildthecloud (Group Score: 85.4 | Individual: 28.0)
Cluster: 5 tweets | Engagement: 4 (Avg: 17) | Type: Tech
I have to say it's actually pretty good. I'd recommend it. But I personally hate it. But you probably won't.
QT @burkeholland: GitHub just released a technical preview of the "GitHub Copilot App" - a new agentic development tool that brings agents and @GitHub together in one pane of glass.
I've been using this daily for a few weeks. I think you're going to love it. https://t.co/1q8RIkRyZS
See 4 related tweets
- @burkeholland: RT @burkeholland: GitHub just released a technical preview of the "GitHub Copilot App" - a new agent...
- @ibuildthecloud: I highly recommend the @GitHubCopilot app. I'm serious. It's the best coding desktop app I've seen y...
- @pierceboggan: RT @lichinlin: Such a fun ride helping build the GitHub Copilot app ✨
We got to explore so many det...
- @pierceboggan: RT @BradGroux: Hello, gorgeous! I'm loving the GitHub Copilot app so far. I'm always impressed with ...
17. Shipper_now (Group Score: 82.7 | Individual: 20.5)
Cluster: 5 tweets | Engagement: 76 (Avg: 47) | Type: Tech
BREAKING: Today, we pulled the plug on hiring.
I just watched my Mac become a SWE, designer, marketer, security and operations guy, all at once.
AI has already replaced us... https://t.co/dlAQDBt7hN
See 4 related tweets
- @chhddavid: Billions of jobs will just disappear and 99% people have NO idea. https://t.co/F4xgeXxRid
QT @shi...
- @chddaniel: this is terrifying
QT @shipper_now: BREAKING: Today, we pulled the plug on hiring.
I just watche...
- @chddaniel: Every single 9-5 worker is already replaced. https://t.co/BZ03sU3gMD
QT @shipper_now: BREAKING: T...
- @chhddavid: It's so over. https://t.co/sVScZdPfn5
QT @shipper_now: BREAKING: Today, we pulled the plug on hir...
18. tengyanAI (Group Score: 77.3 | Individual: 34.0)
Cluster: 3 tweets | Engagement: 28 (Avg: 45) | Type: Tech
yes, and it is not just about competing with Huawei.
if NVIDIA is not deeply on the ground in China, it risks developing a massive blind spot in the fastest-moving AI market in the world.
China is not sitting around waiting for US export policy to decide its AI future. models, apps, inference workloads, domestic chips, cloud deployments, robotics and AI-native consumer products are all moving extremely fast there.
the US still anchors the frontier narrative, but that framing is getting too narrow.
AI demand is global. AI deployment is global. The next important product patterns may not start in Silicon Valley.
for NVIDIA, China is not only a revenue pool. it is market intelligence.
lose visibility there, and you do not just lose sales. you lose the ability to see where AI is actually going.
QT @benitoz: A year ago, I called Nvidia the literal bargaining chip in US-China trade.
Today the thesis is no longer mine alone.
@AnthropicAI just published "2028: Two scenarios for global AI leadership" arguing compute is the entire game. Close the smuggling loopholes, kill distillation attacks, lock in a 12-24 month US lead.
Same day, Reuters: US Commerce approved H200 sales to Alibaba, Tencent, ByteDance, https://t.co/jSpIwzQyat. Up to 75,000 chips each. Lenovo and Foxconn cleared as distributors.
Jensen is in China this week trying to convert paper approvals into actual deliveries.
The math:
China was 13% of Nvidia revenue ($17B in FY25).
Jensen at GTC: "$50B opportunity in 2025 alone, growing 50% annually."
To CNBC in October: "a couple hundred billion by the end of the decade."
That's the prize. But the prize is not just the revenue. It's the CUDA lock-in.
On Dwarkesh in April, Jensen said the moat is not the silicon. The moat is the install base. Every cloud. Every robot. Every developer trained on CUDA. If China spins up on Huawei Ascend, that is a parallel stack that compounds against Nvidia forever.
Concede the second largest compute market, you concede the ecosystem.
This is why Jensen got visibly agitated when Dwarkesh pushed. "You are not talking to somebody who woke up a loser."
It is also why Beijing is slow-walking the H200 orders. They understand the same thing in reverse. Every CUDA developer is a Huawei customer they will never get back.
The bargaining chip is leverage in both directions.
Anthropic's policy paper today is the US position. Jensen's posture is the corporate position. Beijing's go-slow is the Chinese position. All three agree on one thing: whoever owns the compute, owns the future.
Memory Wars. Co-Design. The Reasoning Tax. All downstream of this.
Compute is the unit of national power in the AI era.
$NVDA
See 2 related tweets
- @rohanpaul_ai: RT @rohanpaul_ai: Anthropic drops a paper on the US-China AI race
They believe the US and its allie...
- @peterwildeford: RT @MTSlive: We asked @peterwildeford about the probability that China overtakes the US in frontier ...
19. Hamburgerai (Group Score: 72.0 | Individual: 24.5)
Cluster: 4 tweets | Engagement: 0 (Avg: 2) | Type: Tech
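In the age of AI agents, people are starting to build workstations for AI
Hand out tasks from a hotel room over Spring Festival, then review dozens of commits the next morning. The scene sounds like a productivity story, but what is really interesting about this episode of the podcast AI 炼金术 (AI Alchemy) is that it pushes people from the role of executor to the role of environment builder.
The speaker, Xu Wenhao (徐文浩), has a background as a CTO, founder, and hands-on AI product builder; Renxin (人心) probes tools, organizations, and industry opportunities from the perspective of an investor and serial entrepreneur. The two talk about Claude Code, OpenClaw, non-technical colleagues, and SaaS security. On the surface the conversation is scattered, but underneath it is all one thing.
Work has started to move back a layer.
Xu Wenhao gives an eye-catching number: he thinks he has become "more than three to five times" more productive and is still pushing toward "a hundred times." These figures come from the podcast itself; they are better read as a signal from the field than extrapolated into an industry benchmark.
The real point lies behind the numbers.
What matters more is why he got faster. He did not say he suddenly writes more code. He said most of his time goes into building infrastructure so that AI can work safely, continuously, and in a way that can be reviewed and accepted.
Agents are not short on intelligence; they are short on workstations
The thing most worth remembering from this episode goes beyond the names Claude Code or OpenClaw. Tools will change and models will change, but this judgment holds: for an agent to do real work, it first needs a working environment.
Xu Wenhao puts it bluntly: "AI is already smart enough, but if you haven't given it the interfaces, the permissions, the documentation, or the context, it can't get anything done."
That sentence brings a lot of AI discussion back down to earth. We often blame model capability, but in real work many tasks fail at the door: the agent cannot log in to the system, cannot read the docs, cannot see CI logs, cannot access the project management tool, and does not know the team's conventions.
Why does this matter?
Because the main battlefield for productivity gains has moved beyond pure text generation. The gains come when AI can connect to GitHub, see failing tests, read bugs in Feishu (飞书), open a browser and take screenshots, and write results back to the project management tool.
The word he uses is harness, which can be understood as stirrups, reins, and tack. Cavalry is strong because of the rider's skill, but also because that equipment lets the force be released reliably. Agents are the same.
That is how his Spring Festival routine became: assign tasks from the hotel at night, come back the next morning to review. The person steps away from the keyboard and the tasks keep running.
Truly letting go means first putting accidents in a cage
Xu Wenhao's approach sounds aggressive. He turns on dangerously skip permissions so that Claude Code asks for authorization less often and works on its own for hours at a stretch.
But this aggressiveness comes with guardrails.
There are several layers of cage around it. The first layer is the dev container, which can run locally or in the cloud. If the agent wrecks the environment inside the sandbox, you just rebuild it. The second layer is worktrees and branches. The third layer is the PR: even the CTO himself cannot merge directly into the main branch.
Further out, there are automated tests and an independent AI review. When the coding agent finishes and submits, a separate review agent reads the changes and decides whether to approve. Writing and reviewing do not share the same context, which at least reduces self-justification.
He compresses development into three actions:
Review the plan, let go, verify.
This is where the problem lies.
Many people only learn the "let go" part and miss the steps before and after it. The planning stage is where the human and the AI align on requirements, so the AI does not spend hours building in the wrong direction. The verification stage means looking at the commits, the tests, the review, and the high-risk parts.
The core of this method is to make errors happen where they can be rolled back, reviewed, and isolated.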
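The layered "cage" described above (sandbox, branch, PR, automated tests, independent review) can be read as a set of merge gates. The sketch below is illustrative only and is not Xu Wenhao's actual setup: the `Change` record, its field names, and the gate messages are all hypothetical. It shows a change becoming mergeable only when every gate holds and the reviewer is a different agent from the author.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical record of one agent-produced change."""
    in_sandbox: bool          # produced inside a dev container
    on_feature_branch: bool   # not committed directly to main
    has_pull_request: bool    # goes through a PR, no direct merge
    tests_passed: bool        # automated tests are green
    reviewer_approved: bool   # the review agent approved the diff
    author_agent: str
    review_agent: str


def blocking_reasons(change: Change) -> list[str]:
    """Return every gate the change fails; an empty list means it may merge."""
    reasons = []
    if not change.in_sandbox:
        reasons.append("not built in a sandbox")
    if not change.on_feature_branch:
        reasons.append("committed directly to main")
    if not change.has_pull_request:
        reasons.append("no pull request")
    if not change.tests_passed:
        reasons.append("automated tests failed")
    if not change.reviewer_approved:
        reasons.append("review not approved")
    if change.author_agent == change.review_agent:
        reasons.append("author and reviewer are the same agent")
    return reasons


def can_merge(change: Change) -> bool:
    return not blocking_reasons(change)
```

Returning the full list of failed gates, instead of stopping at the first one, keeps the human verification step short: one look shows everything that still blocks the merge.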
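When people step back, judgment becomes more valuable
After people step back, their value does not disappear; it just moves. People used to write the code; now they decide task boundaries, commit granularity, test strategy, and acceptance criteria.
One example illustrates this well. Xu Wenhao had the AI add automated tests, and during testing it found a bug in the original code. The AI could fix it on the spot, or it could first flag the bug case and fix it on a separate branch once the tests were complete.
He prefers the second option, because "every commit should solve only one problem." This is a test of engineering taste.
The same logic shows up in interaction testing. Renxin described a children's word-recognition shooting game: a child shouts a word at the iPad, and the plane fires once the word is recognized. The truly hard parts here are speech recognition, polyphonic characters, fuzzy matching, and the child's frustration.
Xu Wenhao's solution is plain: have the AI first operate the web page the way a person would and record the process as markdown, then turn that record into end-to-end tests. For the voice scenario, you can also use TTS to generate two thousand audio clips from the word list and run a round of automated recognition first.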
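As a rough illustration of the automated recognition round, the sketch below scores a batch of (expected word, recognizer output) pairs with fuzzy matching. Everything here is an assumption made for illustration: the podcast does not describe an implementation, the function names and the 0.8 threshold are hypothetical, and plain string similarity from Python's standard `difflib` stands in for whatever matching a real game would use (it does not model polyphonic characters). Generating the TTS audio and running the speech recognizer are out of scope; the input is just their text results.

```python
from difflib import SequenceMatcher


def is_match(expected: str, recognized: str, threshold: float = 0.8) -> bool:
    """Fuzzy match: accept the recognizer output if it is similar enough."""
    ratio = SequenceMatcher(None, expected.lower(), recognized.lower()).ratio()
    return ratio >= threshold


def recognition_accuracy(pairs: list[tuple[str, str]], threshold: float = 0.8) -> float:
    """Share of (expected, recognized) pairs that pass the fuzzy match."""
    if not pairs:
        return 0.0
    hits = sum(is_match(expected, recognized, threshold) for expected, recognized in pairs)
    return hits / len(pairs)


if __name__ == "__main__":
    # Hypothetical results from one automated run over generated audio clips.
    results = [("apple", "apple"), ("apple", "aple"), ("apple", "banana")]
    print(recognition_accuracy(results))
```

Running such a batch before any child touches the game gives a baseline for how strict the threshold can be without causing avoidable frustration.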
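One step further.
This shows that the new human job is shifting from "telling the AI how to write every step" to breaking a fuzzy world into small pieces that can be tested, reviewed, and committed.
Organizations will first gain a batch of virtual employees
There is a nice turn midway through the episode. Renxin says that everyone is now like a CEO: the job is attracting investment, recruiting talented people, and creating a working environment.
The metaphor maps onto concrete actions. Xu Wenhao is already operating this way. He gives the agent its own company accounts: a separate email, a separate GitHub account, a separate computer, and ordinary developer permissions. It can slowly operate a browser, take screenshots at night, run workflows, and write reports, but it cannot touch production.
This is the real problem.
Once AI can operate a computer on someone's behalf, a company has to answer a pile of questions that did not exist before: Whose identity does it log in with? Can it see verification codes? Can it read email? Can it touch access keys? When something goes wrong, whose process failed?
So non-coding scenarios turn out to matter a lot. The agent can collect bugs from Feishu groups, deduplicate them, and write them into the project management tool; it can also look up meeting notes, read code to answer questions about internal implementation, organize expense reports, and generate illustrated tutorials.
This is also where the OpenClaw tricks get interesting. Xu Wenhao requires it to spawn a sub-agent for any long task so that the main process stays responsive. In other words, a single agent starts to have its own org structure: the front desk takes tasks and the back office dispatches the work.
Once non-technical colleagues start using Claude Code too, the organizational change becomes more visible. Besides writing business code, technical colleagues also have to help other roles set up the command line, authorization, SaaS tokens, and tutorials.
Build boats, not towers
The line in the second half with the most investment flavor is "build boats, not towers." When the water rises, towers get flooded and boats become more valuable.
In the agent context, the meaning is concrete. Most legacy software is designed around people: a person signs up, a person links a credit card, a person copies a token, a person clicks buttons, a person judges permissions. For an agent to discover a service, trial it, pay for it, call it, and audit it on its own, the whole chain is still rough.
Xu Wenhao mentions a scenario: a company offers an automated SEO service. Can an agent find it, start a trial, test it a few times with its own cases, pay automatically once it finds the service effective, and then keep using it?
Adding one more chat box does not solve this. It requires discovery protocols, payment permissions, trial sandboxes, usage limits, log auditing, and security tiers.
Security also gets rewritten. Letting AI read email is tempting, but email may contain verification codes. Once a verification code is sent to a third-party API, it has effectively been handed over. Which information internal company models, local models, and external models may read needs new tiering rules.
Xu Wenhao sums up this shift in one sentence: "Your job has moved back a step. You shouldn't be doing the work; you should be shaping a good working environment for AI."
The statement holds at the individual level and at the industry level too.
Finally: to judge an AI opportunity, first look at whom it serves
The test this episode gave me is simple: when you see an AI opportunity, don't start by asking which model it uses. Ask first whether it serves people or serves agents.
Serving people certainly still offers opportunities, but the bigger change may come from the latter. The more agents there are, the more they need workstations, accounts, permissions, budgets, tests, payments, audits, and memory. Whoever turns these into infrastructure is building boats.
If a product merely wraps the old interface in a layer of AI, it may just be making the tower taller. When the water rises, a taller tower does not necessarily help.
What is really worth watching is whether it lets agents complete a task more safely, more cheaply, and more automatically.
That is also what is most grounded about this episode. It does not present the future as magic. It presents the future as a seating chart, a permissions table, a test pipeline, and a person's willingness to admit that they should no longer be doing so much of the work themselves.
Original link: https://t.co/ABuxSqI8QG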
See 3 related tweets
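- @GitHub_Daily: Found another handy Skill that can turn any content into a podcast, slide deck, mind map, and other formats with a single sentence.
It supports more than 15 content sources, including WeChat official accounts, Xiaoyuzhou podcasts, YouTube videos, PDFs, and e-books.
It can also automatically detect and try to...
- @Hamburgerai: Claude Code is pushing engineers from scribes toward authors
After Boris Cherney joined Anthropic, his first PR was rejected.
Not because the code was wrong, and not because the implementation had problems, but because that change...
- @yetone: crdt + coding agent = ♾️
QT @leon7hao: Codex now has an official mobile client, but @lody_ai assumed from day one that the official one would come.
Our positioning for Lody...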
20. HarryStebbings (Group Score: 69.1 | Individual: 34.7)
Cluster: 2 tweets | Engagement: 104 (Avg: 145) | Type: Tech
The Venture Capital Investment of the Decade: Cerebras
"Foundation (@vassallo) did real venture, not just capital deployment.
They found the founder early, built the relationship and incubated the company from scratch.
It was not obvious, not trendy and required conviction before the market saw it.
That is what defines a true investment of the decade." @jasonlk
QT @HarryStebbings: WTF is going on? Anthropic and Elon. Cerebras IPO. Ramp at $40BN.
I sat down with @jasonlk & @rodriscoll to discuss the deal, along with the biggest news in tech this week:
- Anthropic Buys Compute From Elon & Commits $200BN to Google
- Cerebras IPO: The Breakdown
- Ramp's $40BN Latest Valuation
- Hubspot Tanks, Monday Rockets: WTF is Happening in Public Markets?
My notes below:
Foundation Made the Investment of the Decade with Cerebras
Jason argues that Foundation’s success with Cerebras is a masterclass in “actual venture capital” because they did not just muscle into a hot round. They incubated the company in 2016, when the category did not even make sense. By playing the long game, finding a brilliant founder, seeding the idea, and holding roughly 9% ownership through a $40B+ IPO, they proved that the biggest returns still come from doing the hard work before a deal becomes obvious.
What Founders Have to Understand Is That to Win, You Have to Mentally Be Changed Forever
There is a fundamental breakpoint around the four-to-five-year mark when a founder’s brain is permanently rewired by the intensity of the journey. Jason notes that winning at a high level requires a commitment to becoming a different person. The happy-go-lucky version of yourself from the early days is gone, replaced by someone who can often only relate to other founders who have survived similar maelstroms.
The Enemy of My Enemy Infrastructure Play
Anthropic’s partnership to use SpaceX’s Colossus 1 data center highlights a massive consolidation where the strongest players are hoovering up all available capacity on the planet. For Elon Musk, this move transitions xAI from a buyer of CapEx to a net seller of capacity, turning a potential money pit into a $5 billion annual revenue stream because Grok is not currently growing at the same pace as leading-edge models.
The Crackdown on Shadow Cap Tables
Anthropic is enforcing board approval for all secondary sales to reclaim cap table control and call out "bad actors". Rory warns that side contracts for "economic rights" are legally fragile; because the company has no obligation to honor unapproved transfers, many investors face "messy" losses at the IPO.
Model vs. Application: The Vertical SaaS Death Zone
The industry is debating if horizontal models will consume the application layer or if vertical workflows will remain independent. Jason predicts a "terminal state of decay" for legacy marketing tools because agents have no need for manual templates. Once a model can perform an application’s core function directly within a prompt, that software becomes obsolete.
Token Maxing vs. The 100x Engineer
Despite massive growth forecasts, a "micro backlash" is growing against "token trash" generated by mediocre developers. Jason predicts a clampdown on wasteful agentic spend, where companies prioritize unlimited resources for elite "100x engineers" while restricting "web heads" who burn compute for minimal productivity gains.
(links below)
See 1 related tweets
- @HarryStebbings: Token Maxing vs. The 100x Engineer
"We don’t need as many tokens as we think.
People are running C...