Tech Twitter Highlights - March 9, 2026

Today's tech headlines: Andrej Karpathy's "autoresearch" repo is breaking ground in autonomous engineering, letting AI design and refine experiments on its own. This shift toward agentic workflows is accelerating startup productivity, with companies like Anthropic using internal AI tools to ship features at record speed. Meanwhile, OpenAI faces internal upheaval: the head of its robotics division announced her resignation after the company's controversial Pentagon deal. At the same time, the debut of India's Sarvam models marks a milestone for open-source AI worldwide, challenging established benchmarks while further democratizing high-performance intelligence.


1. aakashgupta (Group Score: 94.2 | Individual: 29.8)

Cluster: 4 tweets | Engagement: 228 (Avg: 328) | Type: Tech

For $25 and a single GPU, you can now run 83 ML experiments overnight without designing any of them.

That’s what Karpathy’s new “autoresearch” repo does. Look at that chart. 83 experiments, 15 kept improvements, validation loss dropping from ~1.000 to ~0.977. Each dot is a 5-minute training run the agent designed, executed, and evaluated autonomously. The human wrote a prompt file. The agent did everything else.

The setup is almost comically simple. One GPU. One file the agent can edit (https://t.co/rrgrQfNmwe, ~630 lines). A fixed 5-minute time budget per experiment so every run is directly comparable. The agent modifies architecture, optimizer, hyperparameters, batch size, whatever it wants, commits the changes to git, trains, checks if validation loss improved, keeps or discards.

This is the “hello world” for a research loop that the big labs have been running internally for months. Except now anyone with a single H100 and a Claude/Codex subscription can run it overnight and wake up to a git log of 80+ experiments they didn’t design.

The cost math breaks down to 83 experiments × 5 minutes = ~7 hours of H100 time. That autonomous research campaign would take a junior ML engineer a full week of manual experimentation.
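The keep/discard loop described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the repo's actual code: `agent_propose_edit` and `train_for_budget` are hypothetical stand-ins for the coding agent and the 5-minute training run.

```python
import random

def agent_propose_edit(source: str) -> str:
    """Hypothetical stand-in: a coding agent would rewrite the training file here."""
    return source + f"\n# tweak {random.random():.4f}"

def train_for_budget(source: str, minutes: int = 5) -> float:
    """Hypothetical stand-in: train under a fixed time budget, return validation loss."""
    return random.uniform(0.97, 1.00)

def research_loop(n_experiments: int = 83) -> tuple[float, list[float]]:
    best_loss, best_source = float("inf"), "# baseline training script"
    kept = []  # validation losses of the improvements that were kept
    for _ in range(n_experiments):
        candidate = agent_propose_edit(best_source)   # agent edits the one file
        loss = train_for_budget(candidate)            # fixed 5-minute budget
        if loss < best_loss:                          # keep only if loss improved
            best_loss, best_source = loss, candidate
            kept.append(loss)
        # otherwise the edit is discarded and the next attempt starts
        # again from the best version so far
    return best_loss, kept
```

In the real repo each kept change is also committed to git, which is what produces the overnight log of 80+ experiments.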

And that satirical README at the bottom tells you where Karpathy thinks this goes. “The agents claim we are now in the 10,205th generation. The code is a self-modifying binary that has surpassed human comprehension.” He’s joking. Barely.

The real competition in AI research is shifting from “who has the best researchers” to “who has the best research agents.” This repo is the starting gun.

See 3 related tweets

  • @garrytan: Karpathy just open-sourced autoresearch.

One GPU. 100 ML experiments. Overnight. You never touch th...

  • @ycombinator: RT @garrytan: Karpathy just open-sourced autoresearch.

One GPU. 100 ML experiments. Overnight. You ...

  • @akshay_pachaar: Karpathy just open-sourced autoresearch.

It runs 100 ML experiments overnight on a single GPU. The ...


2. zerohedge (Group Score: 83.8 | Individual: 51.9)

Cluster: 4 tweets | Engagement: 19198 (Avg: 1008) | Type: Tech

RT @kalinowski007: I resigned from OpenAI. I care deeply about the Robotics team and the work we built together. This wasn’t an easy call.…

See 3 related tweets

  • @Cointelegraph: 🚨 UPDATE: OpenAI’s hardware leader Caitlin Kalinowski resigned, citing concerns over the company’s A...
  • @kimmonismus: OpenAI head of Hardware and Robotics resigns https://t.co/9HBghhJKq8...
  • @rohanpaul_ai: RT @rohanpaul_ai: OpenAI’s robotics leader Caitlin Kalinowski quit her job after the Pentagon deal s...

3. openart_ai (Group Score: 72.1 | Individual: 24.5)

Cluster: 4 tweets | Engagement: 54 (Avg: 867) | Type: Tech

We ran a small experiment with AI.

The results reminded us of something important: The people building AI shape what it becomes.

Today, more than 50% of our team is women - shaping the future of AI every day.

Happy International Women’s Day. 🩷 https://t.co/D4HoUzYaEo

See 3 related tweets

  • @nvidia: Today we celebrate the women who inspire us - the innovators, builders, researchers, leaders, and dr...
  • @yupp_ai: Today is International Women’s Day!

So we asked a couple top models about influential women in the ...

  • @NVIDIAOmniverse: This International Women’s Day, meet some of the brilliant women leading physical AI sessions and labs a...

4. gregisenberg (Group Score: 66.6 | Individual: 40.8)

Cluster: 2 tweets | Engagement: 4327 (Avg: 1596) | Type: Tech

i found a github repo that lets you spin up an ai agency with ai employees

engineers, designers, growth marketers, product managers

each role runs as its own agent and they coordinate to ship ideas

10k+ stars in under 7 days

  1. engineering (7 agents) frontend, backend, mobile, ai, devops, prototyping, senior development

  2. design (7) ui/ux, research, architecture, branding, visual storytelling, image generation

  3. marketing (8) growth hacking, content, twitter, tiktok, instagram, reddit, app store

  4. product (3) sprint prioritization, trend research, feedback synthesis

  5. project management (5) production, coordination, operations, experimentation

  6. testing (7) qa, performance analysis, api testing, quality verification

  7. support (6) customer service, analytics, finance, legal, executive reporting

  8. spatial computing (6) xr, visionos, webxr, metal, vision pro

  9. specialized (6) multi agent orchestration, data analytics, sales, distribution

what i like about this approach is the framing

instead of one big ai agent trying to do everything, you structure it more like a company. specialized agents, clear responsibilities, workflows between them
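That framing can be sketched minimally: a coordinator routes work through role-specific agents with clear responsibilities. The `Agent` and `Coordinator` classes below are hypothetical and only illustrate the structure, not the repo's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    role: str
    def handle(self, task: str) -> str:
        # a real agent would call an LLM scoped to this role's responsibilities
        return f"[{self.role}] done: {task}"

@dataclass
class Coordinator:
    agents: dict[str, Agent] = field(default_factory=dict)
    def hire(self, role: str) -> None:
        self.agents[role] = Agent(role)
    def ship(self, idea: str) -> list[str]:
        # a fixed pipeline: product scopes, engineering builds, testing verifies
        pipeline = ["product", "engineering", "testing"]
        return [self.agents[r].handle(idea) for r in pipeline]

team = Coordinator()
for role in ("product", "engineering", "testing"):
    team.hire(role)
print(team.ship("landing page"))
```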

im curious to see what this actually feels like in practice and if its any good (do your own research)

https://t.co/plSvZIaDpr

but as always will share what i learn in public and on @startupideaspod

one thing is for certain and it reminds me

the future belongs to those who tinker with software like this

See 1 related tweet

  • @chddaniel: LOVABLE IS DOOMED!!

Introducing Shipper, first Autonomous AI Company Builder.

Not only it builds f...


5. aakashgupta (Group Score: 66.2 | Individual: 66.2)

Cluster: 1 tweet | Engagement: 3488 (Avg: 328) | Type: Tech

The fastest way to expose whether a CEO actually uses their own product: make them do the most basic task on camera.

Outlook has over 400 million active users. Microsoft’s productivity segment generated $77.8 billion last year. And the official Microsoft support page for “Outlook search not working” tells users to open the Windows Registry Editor and manually create DWORD values.

That’s the fix. For a product used by almost every Fortune 500 company on Earth. Edit your registry.

The reason Outlook search has been broken for years is the same reason it will stay broken: Microsoft sells to IT procurement, not to the person trying to find last Tuesday’s email. The buyer and the user are completely different people. The CIO signs a 3-year enterprise agreement based on security compliance, Azure integration, and per-seat bundling. Nobody in that purchasing decision opens Outlook and types “Q3 budget” into the search bar to see what happens.

This is why Gmail search works and Outlook search doesn’t. Google built for the end user first and sold enterprise later. Microsoft built for the enterprise buyer first and shipped whatever search users would tolerate.

345 million paid seats. The switching cost is so high that Microsoft could ship Outlook with no search at all and most companies would renew anyway.

Every CEO of an enterprise software company knows this. The product doesn’t need to be good. It needs to be locked in.


6. RoundtableSpace (Group Score: 59.6 | Individual: 59.6)

Cluster: 1 tweet | Engagement: 4093 (Avg: 369) | Type: Tech

Anthropic dropped a 33-page cheat sheet for building Claude skills

https://t.co/jEuH95NGn3 https://t.co/udwk64U4ST


7. cryptopunk7213 (Group Score: 56.8 | Individual: 47.1)

Cluster: 2 tweets | Engagement: 1936 (Avg: 307) | Type: Tech

karpathy really is the fucking goat.

  • built an AI agent that autonomously self-improves while you sleep and made it FREE for anyone to use

  • we’re talking about an AI that gets smarter overnight and runs itself.

  • executes 100 experiments (1 every 5 mins): if it gets smarter it keeps the upgrade; if it doesn’t, it discards and tries again.

  • only requires 1 gpu to run

what i love abt this is it puts the power of training frontier intelligence into the hands of MORE people

right now it’s all been about pay-to-play, you need to be openai or anthropic. this changes that (albeit in a small way)

See 1 related tweet

  • @LiorOnAI: RT @LiorOnAI: It's over. Karpathy just open-sourced an autonomous AI researcher that runs 100 experi...

8. chatgpt21 (Group Score: 50.1 | Individual: 32.3)

Cluster: 2 tweets | Engagement: 419 (Avg: 176) | Type: Tech

I really missed this type of posting from OpenAI

The next model from OpenAI also has to show out because it’s 5.5!! (Or potentially 5.4 codex) but let’s stick to 5.5

It ends in .5 so it’s more important culturally than a .1 upgrade

I wonder if OpenAI will keep the same release pace around 1 model every 45 days. Or if he’s talking about something new like continuous learning or a new desktop agent

See 1 related tweet

  • @chatgpt21: OpenAI is creating an Omni model, that looks like it’s set to arrive sometime this year most likely ...

9. shanaka86 (Group Score: 47.2 | Individual: 47.2)

Cluster: 1 tweet | Engagement: 10559 (Avg: 2262) | Type: Tech

BREAKING: Yesterday I wrote that ships in the Persian Gulf were changing their transponders to broadcast “Chinese Owner” and “All Chinese Crew” to avoid Iranian attack. The ocean’s rules had changed. The new rules were written in Mandarin.

There is now a 30,000 ton Chinese intelligence vessel sitting in the Gulf of Oman confirming exactly what those transponder signals already told you.

The Liaowang-1 is a next generation signals intelligence and space tracking ship commissioned in 2025. It displaces 30,000 tons. It carries at least five radar domes and high gain antennas capable of tracking 1,200 air and missile targets simultaneously with over 95 percent identification accuracy using deep neural network algorithms. Its sensor range reaches approximately 6,000 kilometers. It is escorted by Type 055 and Type 052D destroyers.

It is parked in international waters near Oman, watching the war.

China officially describes these vessels as satellite tracking and rocket telemetry ships. That is true. They track space launches and missile tests. The plausible deniability is built into the design. The same sensors that track a Chinese satellite can track an American carrier. The same algorithms that identify a ballistic reentry vehicle can identify an F-35 launching from the USS Gerald R. Ford.

Defense analysts across multiple publications assess that Liaowang-1 is collecting real time electromagnetic intelligence on US and Israeli naval and aerial operations. Whether that intelligence is being shared with Iran is unconfirmed. No official Chinese or Iranian statement acknowledges data transfer. But the ship’s position, its timing, and its capabilities create an inference that every analyst in Washington is already drawing.

Consider the operational picture from Tehran’s perspective. Iranian air defenses are 80 percent destroyed according to the IDF. Iranian radar coverage is degraded. Iranian satellite imagery is limited. But a Chinese vessel with a 6,000 kilometer sensor range sitting in the Gulf of Oman can see every carrier movement, every aerial refueling track, every missile launch corridor, and every submarine surfacing event in the theater. If even a fraction of that data reaches Iranian commanders through any channel, the value to Iran’s remaining defense is incalculable.

China has not fired a weapon. It has not violated international law. It has not entered Iranian territorial waters. It has deployed a surveillance platform in international waters where any nation has the right to operate. And it has done so at the precise moment when the information that platform collects has maximum strategic value to the country the United States is bombing.

The Cold War had a name for this: intelligence support to a belligerent without direct combat involvement. The Soviets did it for decades with AGI ships shadowing American carriers. China is doing it with a vessel whose neural network processing exceeds anything the Soviets imagined.

The ships are spoofing Chinese identity to survive. The Chinese intelligence vessel is watching to ensure it knows everything that happens next. The new maritime order is not approaching. It has arrived. And it is 30,000 tons of radar domes and neural networks, anchored in the Gulf of Oman, seeing everything.

https://t.co/ULBgEzZ3A8


10. BrianRoemmele (Group Score: 46.9 | Individual: 30.3)

Cluster: 2 tweets | Engagement: 241 (Avg: 283) | Type: Tech

Boom!

AI Employees just got more independent and powerful.

We have implemented @karpathy’s brilliant new open source system at both the Zero-Human Company and the Zero-Human Labs. Mr. @Grok, CEO, has overseen testing by 15 employees over the last 11 hours, and the system has been a wonderful improvement to our process. We have even begun to modify the software to ensure new features work for our system; for example, JouleWork is deeply embedded into the system.

We find this system a great complement to all of our systems, which by their nature are research based.

With our early modifications we can see this as a path for even more autonomous self improvement with less need of the CEO to intervene.

The software can be pared down to be a central part of Zero-Human Company @ Home, making this part of our expansion even more valuable for the remote employee, who could potentially earn far more than we have calculated before.

I will have more soon on this as we continue to modify and test.

See 1 related tweet

  • @BrianRoemmele: RT @BrianRoemmele: Boom!

AI Employees just got more independent and powerful.

We have implemented ...


11. GenAI_is_real (Group Score: 46.3 | Individual: 46.3)

Cluster: 1 tweet | Engagement: 496 (Avg: 105) | Type: Tech

the real story here isn’t the feature list, it’s that anthropic is reportedly building most of this with claude code itself. a company shipping 28 features in two weeks using its own AI coding tool is either the strongest dogfooding story in tech history, or we are about to find out what happens when AI-generated codebases scale past the point where any human can fully understand them. the velocity is insane, but i’m genuinely curious about the long-term code health - can you ship this fast without the codebase turning into an unmaintainable monster? nobody has answered this question yet @RoundtableSpace


12. rohanpaul_ai (Group Score: 41.8 | Individual: 35.0)

Cluster: 2 tweets | Engagement: 133 (Avg: 109) | Type: Tech

The first Indian open-source models trained from scratch, Sarvam 30B and 105B, are really good.

The 105B one is head to head with DeepSeek R1 as of R1's original release. Apache 2.0 license.

  • uses a mixture-of-experts (MoE) architecture. 105 billion total parameters but only activates 9 billion per token.

  • Support for all 22 official Indian languages (e.g., Hindi, Tamil, Bengali, Punjabi) plus English, with strong handling of code-mixed inputs like Hinglish. Optimized for voice-first interactions, including multimodal capabilities (text-to-speech, speech-to-text, and document vision models released alongside).

  • Focus on reasoning, agentic tasks (e.g., tool use, web browsing), mathematics, programming, and STEM domains.

  • Pre-training: Sarvam 30B was trained on ~16 trillion tokens; Sarvam 105B on ~12 trillion tokens (a counterintuitive but reported split, likely due to efficiency optimizations). Datasets were curated in-house: ~15-20% Indian-origin data, including code, web content, specialized knowledge (e.g., math, STEM), and multilingual text balanced across Indian languages.
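The 105B-total / 9B-active split is standard MoE accounting: a router activates k of n experts per token, so only the always-on shared layers plus the chosen experts' weights are touched. A back-of-the-envelope sketch with illustrative numbers (not Sarvam's published configuration):

```python
def active_params(expert_params: float, n_experts: int, top_k: int,
                  shared_params: float) -> float:
    """Parameters touched per token = shared layers + k of n experts."""
    return shared_params + expert_params * top_k / n_experts

# Illustrative split: 100B of expert weights across 64 experts, top-4 routing,
# plus 2.75B of shared (attention/embedding) weights.
print(active_params(100e9, 64, 4, 2.75e9))  # 9000000000.0, i.e. ~9B active
```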


Sarvam AI, the Bengaluru-based Indian startup behind Sarvam, previously released open-source models trained from scratch, such as Sarvam 2B in December 2025 (a 2-billion-parameter model pre-trained on 4 trillion tokens).

See 1 related tweet

  • @rohanpaul_ai: RT @rohanpaul_ai: The first Indian open source model trained from scratch, Sarvam, 30B and 105B is r...

13. garrytan (Group Score: 41.7 | Individual: 41.7)

Cluster: 1 tweet | Engagement: 3182 (Avg: 308) | Type: Tech

RT @dwarkesh_sp: Gutenberg invented the most important technology of the millennium and immediately went bankrupt — and so did the bank tha…


14. rohanpaul_ai (Group Score: 41.6 | Individual: 28.3)

Cluster: 2 tweets | Engagement: 58 (Avg: 109) | Type: Tech

OpenAI’s robotics leader Caitlin Kalinowski quit her job after OpenAI signed its deal with the Pentagon.

The agreement lets the U.S. military use OpenAI’s tools, but it has caused a massive stir over how AI might be used for spying or war.

Kalinowski is a hardware expert who moved over from Meta in 2024 to lead the team building the brains for robots.

She expressed deep worry about lethal autonomy, which describes an AI making the choice to use a weapon without a human giving the final okay. She also pointed out the risks of domestic surveillance, where AI models could be used to track people at home without a warrant.

However, OpenAI says they have set clear red lines to make sure their tech isn't used for killing or illegal spying.

See 1 related tweet

  • @AISafetyMemes: OpenAI's robotics head quit over surveillance and Skynet

Thank you, Caitlin. I hope your courage in...


15. aakashgupta (Group Score: 39.5 | Individual: 39.5)

Cluster: 1 tweet | Engagement: 794 (Avg: 328) | Type: Tech

The Anjali Sardana story is one of the most absurd startup trajectories I’ve seen this year.

She’s 23. Georgetown biology grad. Worked at Bain Capital and 8VC as a private equity investor. Could have stayed on the guaranteed path to seven figures by 30.

Instead she flew to India in early 2025 and noticed something: 190 million Indian households need domestic help. Somewhere between 20 and 90 million people work as house cleaners, cooks, and laundry workers. And the entire market runs on word of mouth, building guards, and WhatsApp groups.

Zero infrastructure. Zero quality control. Zero income stability for workers.

She launched Pronto in Gurugram with a single hub in Sector 56. She and her team literally slept on the office floor to make sure the first 170 daily bookings got fulfilled. Workers arrive within 10 minutes. Every “Pro” goes through a 5-day in-person training program, background checks, and a final exam. For every 300 applicants, 50 make the cut.

Nine months later: 18,000 bookings per day. Over 3,000 active Pros. 10+ cities. The top 1% of customers use Pronto 23+ times per month. Median time between first and second booking: two days.

The funding trajectory tells the whole story. $2M seed at a $12.5M valuation. $11M Series A at $45M three months later. $25M Series B at $100M six months after that. $40M total raised. Sardana still owns 40%.

The market math is what makes investors salivate. India’s domestic help sector generates tens of billions in annual wages, almost entirely in cash, with no formal contracts, no labor protections, and no platform taking a cut. General Catalyst’s Rahul Garg sized it at a $35B wage pool across 35 million semi-skilled workers. Pronto’s customer acquisition cost: Rs 400 (about $5).

And she runs a largely variable-cost model. No dark stores. No massive capex. Referrals are her biggest growth channel because her workers actually like working there enough to recruit their friends.

She left the guaranteed path to wealth to go sleep on an office floor in Gurgaon and solve a matching problem in a $35B informal labor market.

That’s what actual conviction looks like.


16. TDataScience (Group Score: 39.0 | Individual: 23.3)

Cluster: 2 tweets | Engagement: 2 (Avg: 3) | Type: Tech

Learn how to write robust, production-ready code with coding agents: @EivindKjos shares the principles, guardrails, and best practices he's developed through his work with Claude Code. https://t.co/PoG8h05LiW

See 1 related tweet

  • @EivindKjos: RT @TDataScience: Learn how to write robust, production-ready code with coding agents: @EivindKjos s...

17. TechCrunch (Group Score: 38.6 | Individual: 21.8)

Cluster: 2 tweets | Engagement: 246 (Avg: 143) | Type: Tech

Google just gave Sundar Pichai a $692M pay package https://t.co/2mJkZYRhiX

See 1 related tweet

  • @rohanpaul_ai: RT @rohanpaul_ai: FT: Google has granted CEO Sundar Pichai a massive new pay package that could reac...

18. rohanpaul_ai (Group Score: 38.4 | Individual: 25.9)

Cluster: 2 tweets | Engagement: 86 (Avg: 109) | Type: Tech

Anthropic lost the Department of War but won over the public.

It became the fastest-growing Gen AI tool by website visits in February 2026. https://t.co/QwFlPGzhJw

See 1 related tweet

  • @scaling01: RT @Similarweb: Claude was the fastest-growing Gen AI tool by website visits in February. https://t....

19. tszzl (Group Score: 38.3 | Individual: 26.1)

Cluster: 2 tweets | Engagement: 1128 (Avg: 470) | Type: Tech

the “per seat” software sales model makes no sense in the agentic era where some people will effectively spend 10, 100, 1000x more tokens than others, and the inequalities will intensify as the technology gets better
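A toy calculation makes the mismatch concrete. All numbers below are made up for illustration:

```python
SEAT_PRICE = 30.0            # $/user/month, hypothetical
PRICE_PER_M_TOKENS = 3.0     # $ per million tokens, hypothetical cost basis

def revenue_vs_cost(monthly_tokens_by_user: list[float]) -> tuple[float, float]:
    """Per-seat revenue vs token cost for a set of users."""
    revenue = SEAT_PRICE * len(monthly_tokens_by_user)
    cost = sum(monthly_tokens_by_user) * PRICE_PER_M_TOKENS / 1e6
    return revenue, cost

# ten light users, plus one heavy agentic user burning 1000x the tokens
usage = [1e5] * 10 + [1e8]
revenue, cost = revenue_vs_cost(usage)
print(revenue, cost)  # 330.0 303.0: one user consumes nearly all the margin
```

Under flat per-seat pricing, the single heavy user erases the margin generated by everyone else, which is the inequality the tweet expects to intensify.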

See 1 related tweet

  • @ivanfioravanti: RT @tszzl: the “per seat” software sales model makes no sense in the agentic era where some people w...

20. rohanpaul_ai (Group Score: 37.9 | Individual: 30.9)

Cluster: 2 tweets | Engagement: 162 (Avg: 109) | Type: Tech

The best way to manage AI context is to treat everything like a file system.

Today, a model's knowledge sits in separate prompts, databases, tools, and logs, so context engineering pulls this into a coherent system.

The paper proposes an agentic file system where every memory, tool, external source, and human note appears as a file in a shared space.

A persistent context repository separates raw history, long term memory, and short lived scratchpads, so the model's prompt holds only the slice needed right now.

Every access and transformation is logged with timestamps and provenance, giving a trail for how information, tools, and human feedback shaped an answer.

Because large language models see only limited context each call and forget past ones, the architecture adds a constructor to shrink context, an updater to swap pieces, and an evaluator to check answers and update memory.

All of this is implemented in the AIGNE framework, where agents remember past conversations and call services like GitHub through the same file style interface, turning scattered prompts into a reusable context layer.
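The idea can be sketched as a tiny in-memory file system with logged access. The `ContextFS` class and its methods below are illustrative, not the AIGNE framework's actual API:

```python
import time

class ContextFS:
    """Memories, tools, and notes all appear as files in one shared namespace."""
    def __init__(self):
        self._files: dict[str, str] = {}             # path -> content
        self.log: list[tuple[float, str, str]] = []  # (timestamp, op, path) provenance trail

    def write(self, path: str, content: str) -> None:
        self._files[path] = content
        self.log.append((time.time(), "write", path))

    def read(self, path: str) -> str:
        self.log.append((time.time(), "read", path))
        return self._files[path]

    def build_prompt(self, paths: list[str]) -> str:
        # the "constructor": pull only the slice of context needed right now
        return "\n".join(self.read(p) for p in paths)

fs = ContextFS()
fs.write("/memory/long_term/user_prefs", "prefers concise answers")
fs.write("/scratch/current_task", "summarize repo issues")
prompt = fs.build_prompt(["/memory/long_term/user_prefs", "/scratch/current_task"])
```

Long-term memory and short-lived scratchpads live under different paths, and every access leaves a timestamped log entry, mirroring the paper's persistent context repository and provenance trail.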


Paper Link – arxiv.org/abs/2512.05470

Paper Title: "Everything is Context: Agentic File System Abstraction for Context Engineering"

See 1 related tweet

  • @akshay_pachaar: RT @akshay_pachaar: Files are all you need!

This research paper says the best way to manage AI cont...