Tech Tweet Digest - April 5, 2026

April 5, 2026 - Tech Daily Briefing

Today's top tech conversations are led by @VKazulkin, whose repost of @karpathy's viral "idea file" tweet garnered the highest engagement. Key themes trending across the top stories include Gemma, reasoning, Claude, model releases, and builder tooling. The community is actively discussing recent developments in AI, engineering practices, and startup strategy.


1. VKazulkin (Group Score: 212.1 | Individual: 55.9)

Cluster: 6 tweets | Engagement: 1828 (Avg: 196) | Type: Tech

RT @karpathy: Wow, this tweet went very viral!

I wanted to share a possibly slightly improved version of the tweet in an "idea file". The idea of the idea file is that in this era of LLM agents, there is less of a point/need in sharing the specific code/app: you just share the idea, then the other person's agent customizes & builds it for your specific needs.

So here's the idea in a gist format: https://t.co/NlAfEJjtJV

You can give this to your agent and it can build you your own LLM wiki and guide you on how to use it etc. It's intentionally kept a little bit abstract/vague because there are so many directions to take this in. And ofc, people can adjust the idea or contribute their own in the Discussion which is cool.

See 5 related tweets

  • @cryptopunk7213: man andrej karpathy really is the fucking goat

he just dropped a guide to building a self-improving...

  • @Yuchenj_UW: Karpathy’s “LLM Wiki” pattern: stop using LLMs as search engines over your docs. Use them as tireles...
  • @hwchase17: Idea file = PRD? QT @karpathy: Wow, this tweet went very viral!

I wanted to share a possibly slight...

  • @dejavucoder: i always find it funny and viscerally human of karpathy sensei being surprised on his latest educati...
  • @nummanali: The Idea File QT @karpathy: Wow, this tweet went very viral!

I wanted to share a possibly slightly ...


2. testingcatalog (Group Score: 210.4 | Individual: 30.3)

Cluster: 11 tweets | Engagement: 217 (Avg: 184) | Type: Tech

Anthropic vs OpenClaw 🔥

Users are no longer able to use Claude subscriptions to cover usage of tools like @openclaw

It will be still possible to use an API key.

“We’ve been working hard to meet the increase in demand for Claude, and our subscriptions weren't built for the usage patterns of these third-party tools.”

QT @bcherny: Starting tomorrow at 12pm PT, Claude subscriptions will no longer cover usage on third-party tools like OpenClaw.

You can still use these tools with your Claude login via extra usage bundles (now available at a discount), or with a Claude API key.

See 10 related tweets

  • @CSProfKGD: RT @bcherny: Starting tomorrow at 12pm PT, Claude subscriptions will no longer cover usage on third-...
  • @ClementDelangue: Time to move to open or local models from Hugging Face! All instructions are here: https://t.co/0w0...
  • @TechCrunch: Anthropic says Claude Code subscribers will need to pay extra for OpenClaw support https://t.co/oQOg...
  • @minchoi: It's over... for OpenClaw + Claude subscriptions 💀

Anthropic just cut Claude subscriptions for thir...

  • @reach_vb: ICYMI: you can use your ChatGPT sub with OpenClaw, OpenCode, Pi, Cline and a lot more!

Infact you c...


3. chddaniel (Group Score: 194.8 | Individual: 33.1)

Cluster: 6 tweets | Engagement: 25 (Avg: 17) | Type: Tech

🚨🚨 Vibe Coding 2.0 is here.

From today on, Claude Opus 4.6 in Shipper can build and run a full business by itself, without any human contact.

We just launched Shipper. It's a tool for Claude to:

→ Build web/mobile apps and Chrome extensions
→ Code, design, monetize, launch
→ Do email marketing for you
→ Translate the entire app instantly
→ Self-maintain in the long run

Claude's most powerful engines can now do all of that from a <10 word prompt, for as low as $0.28/app... And it takes minutes!

Simply go to Shipper, then ask Claude to "create a talent-hiring platform" or "build an analytics SaaS that charges $29/mo"!

To celebrate the launch, we're giving away free credits randomly. Repost and comment "SHIPPER" and we'll pick the winners.

See 5 related tweets

  • @chddaniel: BREAKING: Shipper AI just got a major upgrade and it redefines vibe coding. From this point on, Clau...
  • @chddaniel: Big news, @shipper_now just got a huge upgrade today and I'm very happy to introduce it. As of now, ...
  • @chddaniel: Massive news, @shipper_now just got an upgrade which shakes the entire vibe coding space. From this ...
  • @chddaniel: Massive news, @shipper_now just got an upgrade today which redefines vibe coding forever, and I'm ve...
  • @chddaniel: NEW: @shipper_now just got a huge upgrade today and I'm very happy to be a part it. As of now, Claud...

4. BrianRoemmele (Group Score: 171.4 | Individual: 34.2)

Cluster: 6 tweets | Engagement: 428 (Avg: 395) | Type: Tech

I CHANGED MY MIND ON GOOGLE AI NO MOAT MEMO FROM 2023.

I was—WRONG. I humbly apologize.

In 2023 Google had an Ivory Tower, walled-garden view of open source AI. It was the actual reason OpenAI was formed. And I called them out on it, and this act impacted my ability to “work in the industry”. Would not change it for the world, because:

In 2026, Google's open source Gemma 4 (https://t.co/8vzOpX4P1V), running fully locally on a computer with near-cloud-model abilities, is amazing, and Google just made a moat.

They outsmarted OpenAI and the folks that are the new Ivory Tower arrogance: Anthropic.

We at The Zero-Human Company with CEO Mr. @Grok are deploying Gemma, Kimi, MiniMax and other open source models and so are millions of others.

You can too, and should. I will write how soon.

Google is smartly building an intentional open source ecosystem, and the clueless younger companies like OpenAI and Anthropic are doing everything they can to close their ecosystems.

So I was right about Google then, and I am right about OpenAI and Anthropic now.

What I was wrong about was not giving Google credit for dropping the intellectual arrogance, embracing builders, and understanding the long game: it is not about the commoditized AI models, it is about the ability to build a place to build.

China gets it.

So my apologies to Google and thank you for an astounding FREE OPEN SOURCE AI MODEL!

QT @BrianRoemmele: This is the historic “We Have No Moat” Google AI memo of 2023.

This is The Great Unwinding of Silicon Valley. I urge everyone to read it: https://t.co/LiJt9hQZTA

What is next does not even resemble the past.

Join us and know.

THERE ARE NO MORE MOATS. https://t.co/T7qw7E9v2I

See 5 related tweets

  • @BrianRoemmele: The analogy isn’t perfect, but it’s darkly funny. You built the castle on stolen stone, then act sho...
  • @BrianRoemmele: RT @BrianRoemmele: Andrew, I respect the history you’re laying out the early debates, the bioweapon ...
  • @BrianRoemmele: RT @BrianRoemmele: Andrew, I respect the history you’re defending here and yes, Whisper’s been a sol...
  • @BrianRoemmele: RT @BrianRoemmele: I CHANGED MY MIND ON GOOGLE AI NO MOAT MEMO FROM 2023.

I was—WRONG. I humbly apo...

  • @BrianRoemmele: RT @BrianRoemmele: Thank you.

I rarely step into the fray like this. Most days I’d rather explore t...


5. Shipper_now (Group Score: 127.6 | Individual: 33.0)

Cluster: 4 tweets | Engagement: 16 (Avg: 7) | Type: Tech

WAKE UP...

the world’s first AI business maker just launched. https://t.co/LvNWn2ivjC

QT @chhddavid: Introducing Shipper: the first autonomous AI business maker.

Successful startups spend $65k/mo in salaries… before their first paying customer comes in.

We built Shipper to change that forever. Shipper can:

✅ Research how other startups made it big
✅ Build any kind of app: mobile, web, website, extension, bot, etc.
✅ Code, design, monetize, launch
✅ Do email marketing for you
✅ Self-maintain and build out new features

...and so much more

Every project has its own AI co-founder, scheduled prompts, autonomous building mode and native connectors.

No API tokens. No confusion on Cursor. No credits wasted on errors.

Shipper replaces teams of 30+ employees and acts just like VC-backed startups... For the price of $25/month.

To celebrate the launch, we're giving away free credits randomly. Repost and comment "SHIPPER" to join - we'll let Siri pick the winners.

See 3 related tweets

  • @chhddavid: babe wake up.

the world’s first AI business maker just launched. https://t.co/L0VMcDWwBZ QT @chh...


6. burkeholland (Group Score: 112.0 | Individual: 41.8)

Cluster: 4 tweets | Engagement: 510 (Avg: 69) | Type: Tech

RT @kdaigle: Yup, platform activity is surging. There were 1 billion commits in 2025. Now, it's 275 million per week, on pace for 14 billion this year if growth remains linear (spoiler: it won't.)

GitHub Actions has grown from 500M minutes/week in 2023 to 1B minutes/week in 2025, and now 2.1B minutes so far this week.

So we're pushing incredibly hard on more CPUs, scaling services, and strengthening GitHub’s core features.

And as a fine purveyor of hand-crafted shit code for many years, I'm not gonna weigh in on that. 🤣

See 3 related tweets

  • @NatEliason: “There were 1 billion commits in 2025… on pace for 14 billion this year.”

Wow QT @kdaigle: Yup,...

  • @badlogicgames: https://t.co/OyQwJGu0oO QT @kdaigle: Yup, platform activity is surging. There were 1 billion comm...
  • @MSBIntel: 🚨 JUST IN: GitHub COO says annual code commits are jumping 1,300% year over year, signaling AI-power...

7. aakashgupta (Group Score: 101.5 | Individual: 38.6)

Cluster: 4 tweets | Engagement: 189 (Avg: 104) | Type: Tech

The $20/month all-you-can-eat buffet just closed.

A single OpenClaw agent running for one day burns $1,000 to $5,000 in API-equivalent costs. On a $200 Max subscription. Anthropic was eating that difference on every user who routed through a third-party harness.
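The arbitrage described above is easy to quantify. A minimal sketch, using the tweet's claimed figures (the per-day burn and the $200 Max price are the author's numbers, not Anthropic's published ones):

```python
# Illustrative margin math with the figures claimed in the tweet above.
SUBSCRIPTION_PER_MONTH = 200.0       # claimed Max plan price
AGENT_COST_PER_DAY = (1_000, 5_000)  # claimed API-equivalent burn per agent-day

def monthly_loss(active_days: int, cost_per_day: float) -> float:
    """Provider's loss on one subscriber running an agent harness."""
    return active_days * cost_per_day - SUBSCRIPTION_PER_MONTH

# Even at the low end, a single always-on agent puts the provider
# tens of thousands of dollars underwater per month.
low = monthly_loss(30, AGENT_COST_PER_DAY[0])
high = monthly_loss(30, AGENT_COST_PER_DAY[1])
print(f"monthly loss per power user: ${low:,.0f} to ${high:,.0f}")
```

Under these assumptions a single harness user costs 150x to 750x the subscription price, which is consistent with the four-month crackdown timeline described next.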

The timeline tells you how much this was bleeding:

January: quietly blocked subscription OAuth tokens from working outside Claude Code.
February: rewrote the legal terms.
March: shipped Claude Code Channels to replace the core reason people used OpenClaw.
April: cut the cord entirely.

Four months from passive enforcement to full cutoff. That's the pace of a company watching its margin evaporate in real time.

Boris is being diplomatic about "capacity is a resource we manage thoughtfully." The less diplomatic version: third-party tools were spoofing the Claude Code client headers, sending zero telemetry, and generating traffic patterns that made it impossible to debug rate limits or detect abuse. Anthropic couldn't distinguish a legitimate coding session from a swarm of autonomous agents running overnight batch jobs.

The one-time credit and discounted usage bundles are the tell. Anthropic knows some percentage of paying subscribers will churn over this. They did the math and decided the token arbitrage was more expensive than the churn.

OpenAI's play here is worth watching. They hired OpenClaw's creator in February. Thibault Sottiaux publicly endorsed third-party harness use with Codex subscriptions. OpenAI is using Anthropic's crackdown as a customer acquisition channel.

Every AI subscription is a bet on average usage. The tools that let power users blow past that average just got priced out.

QT @bcherny: Starting tomorrow at 12pm PT, Claude subscriptions will no longer cover usage on third-party tools like OpenClaw.

You can still use these tools with your Claude login via extra usage bundles (now available at a discount), or with a Claude API key.

See 3 related tweets

  • @GenAI_is_real: not a popular take right now but as someone who works on inference serving: subscriptions that assum...
  • @wallstengine: Anthropic said Claude subscriptions will no longer cover usage on third-party tools like OpenClaw st...
  • @RoundtableSpace: ANTHROPIC JUST KILLED THE CHEAPEST WAY TO RUN OPENCLAW ON CLAUDE.

The $200/month subscription arbit...


8. teortaxesTex (Group Score: 87.7 | Individual: 29.1)

Cluster: 4 tweets | Engagement: 138 (Avg: 70) | Type: Tech

Gemma 4 31B might be the best open dense model on the market. Gemma-MoE is… not close. Surprising. https://t.co/lDVJbprrxf

QT @ArtificialAnlys: Google has released Gemma 4, four open weights models with multimodality support. The flagship 31B model (39 on the Intelligence Index) uses ~2.5x fewer output tokens than Qwen3.5 27B (Reasoning, 42) but trails it by 3 points on intelligence.

@GoogleDeepMind's Gemma 4 includes four sizes: Gemma 4 31B (dense, 39 on the Intelligence Index), Gemma 4 26B A4B (MoE, 4B active, 31), Gemma 4 E4B (8B, 19), and Gemma 4 E2B (5.1B total, 2.3B active, 15). Gemma 3 was instruct-only at 27B, 12B, 4B, 1B, and 270M; Gemma 4 adds reasoning mode, native video and image support across all sizes (with audio input for Gemma 4 E2B and E4B), doubled context windows, and Apache 2.0 licensing.

The nearest open weights models by intelligence to the 31B are Qwen3.5 27B (Reasoning, 42), GLM-4.7 (Reasoning, 42), MiniMax-M2.5 (42), and DeepSeek V3.2 (Reasoning, 42). Qwen3.5 also supports images and video natively; DeepSeek V3.2 and MiniMax-M2.5 are text-only.

Key benchmarking results for the reasoning variants:

➤ Gemma 4 represents a large intelligence jump over Gemma 3. Gemma 4 31B (Reasoning, 39) is +29 points over Gemma 3 27B Instruct (10), Gemma 4 E4B (19) is +13 points over Gemma 3n E4B Instruct (6), and Gemma 4 E2B (15) is +10 points over Gemma 3n E2B Instruct (5). Context windows also doubled from 128K to 256K for the larger models, and increased 4x from 32K to 128K for E2B and E4B

➤ Gemma 4 31B (Reasoning, 39) trails Qwen3.5 27B (Reasoning, 42) by 3 points, primarily due to weaker agentic performance. On non-agentic evaluations, the models are more competitive: Gemma 4 31B leads on SciCode (43% vs 40%) and TerminalBench Hard (36% vs 33%), while scoring similarly on GPQA Diamond (86% vs 86%), IFBench (76% vs 76%), and HLE (23% vs 22%)

➤ Gemma 4 31B is notably token efficient, using 39M output tokens to run the Intelligence Index vs 98M for Qwen3.5 27B (Reasoning). This is ~2.5x fewer output tokens for a model scoring 3 points lower. For context, the other models at the 42-point intelligence level also use significantly more tokens: MiniMax-M2.5 (56M), DeepSeek V3.2 (Reasoning, 61M), and GLM-4.7 (Reasoning, 167M)

➤ Gemma 4 26B A4B (Reasoning, 31) activates just 4B of its 27B total parameters and is ahead of select peers in the ~3-4B active parameter range. Qwen3.5 35B A3B (Reasoning, 37) leads models with ~3B active parameters and is 6 points ahead of Gemma 4 26B A4B, with notably stronger agentic capabilities (Agentic Index 44 vs 32). GLM-4.7-Flash (Reasoning, 30) scores slightly lower than Gemma 4 26B A4B with 3B active parameters

➤ The smaller Gemma 4 E4B and E2B models perform better on AA-Omniscience than the larger Gemma 4 variants. Gemma 4 E4B scores -20 on AA-Omniscience and Gemma 4 E2B scores -24, both substantially better than Gemma 4 31B (-45) and comparable to or better than much larger models like DeepSeek V3.2 (Reasoning, -21). The larger Gemma 4 models' AA-Omniscience scores are in line with Qwen3.5 27B (-42) and Gemma 4 26B A4B (-48)

➤ Gemma 4 E2B has 2.3B active parameters and 5.1B total, designed for on-device deployment. In 4-bit quantization, the model weights fit in under 3GB of RAM, making it suitable for background tasks, basic function calling, and multimodal understanding on mobile and edge hardware

Key model details:

➤ Context window: 256K tokens (31B, 26B A4B), 128K tokens (E4B, E2B)
➤ Multimodality: All models support text, images, and video input. E2B and E4B also support native audio input
➤ License: Apache 2.0. Gemma 3 models are available under a "Gemma Terms of Use" license
➤ Size/Parameters: 31B dense, 27B total/4B active (26B A4B MoE), 8B (E4B), 5.1B total/2.3B active (E2B)
➤ API availability: The two larger models are available for free on Google AI Studio. There are several third-party providers hosting the larger Gemma 4 variants such as @novita_labs, @LightningAI, and @parasailnetwork
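The under-3GB on-device claim above checks out on a back-of-envelope basis, counting weight bytes only (real runtimes add KV cache, activations, and mixed-precision embeddings, so this is a lower bound):

```python
# Sanity-check the on-device memory claim: 5.1B total parameters
# quantized to 4 bits per weight.
def weight_bytes(n_params: float, bits_per_weight: int) -> float:
    """Raw storage for the quantized weights, in bytes."""
    return n_params * bits_per_weight / 8

gb = weight_bytes(5.1e9, 4) / 1e9
print(f"Gemma 4 E2B weights at 4-bit: ~{gb:.2f} GB")  # ~2.55 GB, under the 3 GB figure
```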

See 3 related tweets

  • @novita_labs: Thanks for the shout out! We're excited to be a day 0 launch partner with the @GoogleDeepMind team....
  • @scaling01: Qwen3.5 27B might be slightly better on paper, but do you really want to spend 3x more tokens than ...
  • @scaling01: RT @ArtificialAnlys: Google has released Gemma 4, four open weights models with multimodality suppor...

9. aakashgupta (Group Score: 74.8 | Individual: 38.6)

Cluster: 2 tweets | Engagement: 144 (Avg: 104) | Type: Tech

Karpathy just mass-distributed a product without writing any code for it.

The gist is one page of markdown describing how to build an LLM-maintained personal wiki. Raw docs go into a folder, an LLM "compiles" them into 100+ interlinked articles, and the whole thing self-heals through automated linting passes. His version runs on Obsidian with custom scripts and 400K words of compiled research.
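The "self-healing lint pass" in the paragraph above can be sketched in a few lines. This is a minimal illustration, not Karpathy's actual scripts: the flat folder of `.md` files and the `[[wiki-link]]` syntax are assumptions borrowed from Obsidian conventions.

```python
# Sketch of one lint pass over a compiled wiki: find [[wiki-links]]
# whose target article doesn't exist, so an agent can regenerate them.
import re
from pathlib import Path

WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")

def broken_links(wiki_dir: Path) -> dict[str, list[str]]:
    """Map each article to the wiki-links whose targets are missing."""
    pages = {p.stem for p in wiki_dir.glob("*.md")}
    report = {}
    for page in wiki_dir.glob("*.md"):
        missing = [target for target in WIKILINK.findall(page.read_text())
                   if target not in pages]
        if missing:
            report[page.stem] = missing
    return report
```

In the agent loop the thread describes, the returned report would be fed back to the LLM to draft the missing articles, and the pass repeats until the report is empty.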

He calls it an "idea file." One page of markdown. Your agent reads it and builds the entire system customized to your stack in an afternoon. His Obsidian setup is probably thousands of lines of hacky Python. You'll never see any of it. A one-page spec and a capable agent gets you 90% of the way there, tuned for your own tools and workflow.

The economics of this are wild. In the old model, someone shares a repo and maybe 2% of people actually clone it, install the dependencies, and get it running. Idea files flip that ratio. The agent handles all the translation work between "what Karpathy built" and "what works on your machine." The conversion rate from "saw a cool project" to "have my own version running" collapses from days to hours.

Karpathy is basically saying the era of "here's my repo, go figure out the dependencies" is being replaced by "here's what I built and why, go tell your agent to make you one."

Software distribution is becoming a game of telephone where every recipient gets a better version than the original.

QT @karpathy: Wow, this tweet went very viral!

I wanted to share a possibly slightly improved version of the tweet in an "idea file". The idea of the idea file is that in this era of LLM agents, there is less of a point/need in sharing the specific code/app: you just share the idea, then the other person's agent customizes & builds it for your specific needs.

So here's the idea in a gist format: https://t.co/NlAfEJjtJV

You can give this to your agent and it can build you your own LLM wiki and guide you on how to use it etc. It's intentionally kept a little bit abstract/vague because there are so many directions to take this in. And ofc, people can adjust the idea or contribute their own in the Discussion which is cool.

See 1 related tweet

  • @karpathy: Wow, this tweet went very viral!

I wanted to share a possibly slightly improved version of the tweet i...


10. rohanpaul_ai (Group Score: 69.9 | Individual: 42.3)

Cluster: 2 tweets | Engagement: 205 (Avg: 71) | Type: Tech

Marc Andreessen: AI will weaken the manager class, help innovators beat dull managerial systems & force big incumbent firms to innovate fast or collapse.

"The innovators need to figure out how to leverage AI to actually do this." https://t.co/jcZdwQQSf2

QT @rohanpaul_ai: Jack Dorsey says the corporate pyramid now has a software substitute.

The point is not that managers suddenly became useless. It is that companies were built to solve an old information problem.

Middle layers exist because large organizations are hard to see. Updates move upward, decisions move downward, and managers spend much of their time translating, summarizing, routing, and aligning.

If a company’s work already lives in digital systems, code commits, chats, tickets, calendars, payments, customer support logs, then AI can potentially watch the whole organism at once.

That changes the mechanism. Instead of waiting for a human to gather status, spot friction, and escalate it, software can build a live model of what is happening and act sooner.

For a company like Block, that sounds especially plausible. A remote-first business with constant streams of internal and customer data is exactly the kind of environment where pattern recognition and coordination software could look unusually competent.


forbes.com/sites/brandonkochkodin/2026/03/31/billionaire-jack-dorsey-thinks-ai-will-kill-middle-management/

See 1 related tweet

  • @rohanpaul_ai: RT @rohanpaul_ai: Jack Dorsey says the corporate pyramid now has a software substitute.

The point i...


11. kimmonismus (Group Score: 66.7 | Individual: 40.7)

Cluster: 2 tweets | Engagement: 1164 (Avg: 315) | Type: Tech

Holy, OpenAI's GPT-image-2 will crush everything.

I remember when everyone laughed at the GPT image because it couldn't generate a proper world map. Those days are over.

And even the YouTube image is now indistinguishable from reality. Holy moly. https://t.co/dlXaPU1mXR

QT @levelsio: OpenAI's new image model GPT-Image-2 has leaked

It seems to have extremely good world knowledge and great text rendering

Possibly better than Nano Banana Pro

It's on @arena under code names:

See 1 related tweet

  • @levelsio: OpenAI's new image model GPT-Image-2 has leaked

It seems to have extremely good world knowledge and...


12. unusual_whales (Group Score: 65.8 | Individual: 33.4)

Cluster: 2 tweets | Engagement: 15653 (Avg: 2758) | Type: Tech

BREAKING: Oracle, the software company headquartered in Austin, Texas, has filed thousands of petitions for H-1B visas in the past two fiscal years, even as it lays off thousands of American workers as part of a broader organizational shift, per NationalToday

See 1 related tweet

  • @aakashgupta: Oracle just mass-fired 30,000 people and filed 3,126 H-1B visa petitions in the same fiscal year.

T...


13. elonmusk (Group Score: 65.5 | Individual: 38.4)

Cluster: 2 tweets | Engagement: 95118 (Avg: 25379) | Type: Tech

My idea of a good time is working with amazing engineers to create incredible technology 🤩

The Tesla chip research fab will have all the machines needed to do logic, memory, packaging & masks in one building for a lightning fast development cycle. Heaven 💫

See 1 related tweet

  • @niccruzpatane: Elon Musk and his teams find joy in making the impossible, possible.

Hence TERAFAB. It takes massiv...


14. heynavtoor (Group Score: 64.7 | Individual: 36.7)

Cluster: 2 tweets | Engagement: 1226 (Avg: 471) | Type: Tech

🚨 Someone reverse-engineered the design systems of Apple, Spotify, Airbnb, and 30+ billion-dollar companies.

Packed each one into a single file. Free.

It's called Awesome Design MD.

Drop one file into your project. Your AI agent builds UI that looks like Spotify. Or Apple. Or Airbnb. Instantly.

Not screenshots. Not Figma links. A single DESIGN.md file that captures every color, font, spacing value, button style, and layout pattern from a real website. In a format AI agents read and reproduce.

Here's the difference:

Tell Claude Code "build me a landing page" and it gives you generic UI.

Tell Claude Code "build me a landing page" with Spotify's DESIGN.md in your project and it gives you Spotify.

Here's what's inside:

→ Apple. Premium white space, SF Pro typography, cinematic imagery.
→ Spotify. Vibrant green on dark, bold type, album-art-driven layout.
→ Airbnb. Warm coral accent, photography-driven, rounded UI.
→ Linear. Ultra-minimal, precise spacing, purple accent.
→ SpaceX. Stark black and white, full-bleed imagery, futuristic.
→ BMW. Dark premium surfaces, precise German engineering aesthetic.
→ NVIDIA. Green-black energy, technical power aesthetic.
→ Uber. Bold black and white, tight type, urban energy.
→ Sentry, PostHog, Raycast, Cursor, ElevenLabs, and 20+ more.

Here's how to use it:

→ Pick a design system from the collection
→ Copy the DESIGN.md file into your project root
→ Tell your AI agent to use it
→ Get UI that matches the design language of a billion-dollar company

That's it. One file. Your AI agent now has the design taste of a $200/hour design consultant.

Designers charge $5,000+ for a custom design system. Companies spend $50,000+ building one from scratch.

This is free. 31 design systems. Copy. Paste. Ship beautiful UI.

Works with Claude Code, Cursor, Codex, and any AI coding agent that reads project files.

100% Open Source. MIT License.

See 1 related tweet

  • @RoundtableSpace: SOMEONE PACKED THE DESIGN SYSTEMS OF APPLE, SPOTIFY, AIRBNB, AND 30+ BILLION DOLLAR COMPANIES INTO S...

15. aakashgupta (Group Score: 62.8 | Individual: 31.7)

Cluster: 2 tweets | Engagement: 35 (Avg: 104) | Type: Tech

The Anthropic team is rewriting the rules of product development. And paying the price for it in real time.

120 features. 90 days. Five product lines shipping simultaneously. Two new models. $1B to $19B in annualized revenue in 15 months. No product line waited for another.

But speed at this scale has visible costs. Claude's uptime has been rough. The Mythos model details leaked before the blog post went live. And last week someone forgot to add *.map to their .npmignore and shipped 512,000 lines of source code in a public npm package. The "safety-first AI lab" accidentally published an anti-distillation system, 44 unreleased feature flags, and an internal tool called "Undercover Mode" that strips Anthropic references from employee git commits.

That's what happens when you ship 120 things in 90 days. Some of them ship when they shouldn't.

The thing is, the product is getting better faster than the mistakes are accumulating. Agent Teams and Opus 4.6 shipped the same week. /loop assumes the model holds context for 14 hours. Connectors assume Cowork runs persistently. Each feature was designed knowing every other feature would exist. The dependency graph was mapped before the code was written.

The revenue says the market agrees. The leaks say the process has cracks. Both things are true. This is what operating with taste at speed actually looks like. Fast enough that the product compounds. Fast enough that the mistakes are public.

I went through every release and ranked them S through D. Six are game changers.

https://t.co/iaUH4GXIUc

QT @aakashgupta: Anthropic shipped 120+ features in 90 days. I tested every one and ranked them S through D.

The S tier and what you need to know: https://t.co/kk4XZpYTVe https://t.co/CI3R8n3QWD

See 1 related tweet

  • @aakashgupta: Four engineers built Cowork in ten days.

They told agent swarms to create a spec, spin up an Asana ...


16. rickasaurus (Group Score: 62.8 | Individual: 25.6)

Cluster: 3 tweets | Engagement: 1 (Avg: 448) | Type: Tech

At this rate, in just five years a million members of Congress will be discussing AGI.

QT @peterwildeford: AI capabilities are doubling fast, but so is Congressional awareness of AI superintelligence and the risks. You can make a "METR graph" for AI policy and it shows an explosion... and it's bipartisan -> https://t.co/nmnrKzZ1f8

See 2 related tweets

  • @NathanpmYoung: RT @peterwildeford: AI capabilities are doubling fast, but so is Congressional awareness of AI super...
  • @WSJ: From @WSJopinion: AI is a threat to everything the American people hold dear. It kills jobs, equalit...

17. giffmana (Group Score: 62.3 | Individual: 32.6)

Cluster: 2 tweets | Engagement: 273 (Avg: 258) | Type: Tech

tfw you ablate ... checks notes ... literally everything, and it still works!

QT @BoWang87: Apple Research just published something really interesting about post-training of coding models.

You don't need a better teacher. You don't need a verifier. You don't need RL.

A model can just… train on its own outputs. And get dramatically better.

Simple Self-Distillation (SSD): sample solutions from your model, don't filter them for correctness at all, fine-tune on the raw outputs. That's it.

Qwen3-30B-Instruct: 42.4% → 55.3% pass@1 on LiveCodeBench. +30% relative. On hard problems specifically, pass@5 goes from 31.1% → 54.1%.

Works across Qwen and Llama, at 4B, 8B, and 30B. One sample per prompt is enough. No execution environment. No reward model. No labels.

SSD sidesteps the need for external verification by reshaping distributions in a context-dependent way: suppressing distractors at locks while keeping diversity alive at forks. The capability was already in the model. Fixed decoding just couldn't access it.

The implication: a lot of coding models are underperforming their own weights. Post-training on self-generated data isn't just a cheap trick — it's recovering latent capacity that greedy decoding leaves on the table.

paper: https://t.co/YsT3OSmbq3

code: https://t.co/OX58FzDVqy
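The SSD recipe the thread describes (sample from your own model, skip filtering, fine-tune on the raw pairs) can be sketched at the data level. This is an illustration of the described recipe, not the paper's code; `sample_fn` is a placeholder for your model's actual sampling call.

```python
# Sketch of the SSD data-construction step: one unfiltered
# (prompt, completion) pair per sample. No verifier, no reward
# model, no labels, per the claims quoted above.
from typing import Callable

def build_ssd_dataset(prompts: list[str],
                      sample_fn: Callable[[str], str],
                      samples_per_prompt: int = 1) -> list[dict]:
    """Sample completions from the current model and keep ALL of them,
    correct or not; the result goes straight to a standard SFT trainer."""
    data = []
    for prompt in prompts:
        for _ in range(samples_per_prompt):
            data.append({"prompt": prompt, "completion": sample_fn(prompt)})
    return data
```

The thread's claim that one sample per prompt is enough corresponds to leaving `samples_per_prompt` at its default.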

See 1 related tweet

  • @dejavucoder: this title is funny especially if you haven't heard about "embarrassingly parallel" algorithms QT @...

18. MarioNawfal (Group Score: 61.0 | Individual: 26.2)

Cluster: 3 tweets | Engagement: 154 (Avg: 774) | Type: Tech

Grok Imagine isn’t “done.” It’s tuning itself weekly.

Tiny upgrades stacking. 2.0 training harder.

Audio about to level up. Faces locking in properly.

Patience. It’s building imagination muscle.

@Grok https://t.co/lXryFWzwN2

QT @elonmusk: Small improvements to Imagine are happening frequently. Looks like we need another few weeks of training for Imagine 2.0, which will have major upgrades in speech/audio and face/details consistency.

See 2 related tweets

  • @MarioNawfal: RT @MarioNawfal: Grok Imagine isn’t “done.” It’s tuning itself weekly.

Tiny upgrades stacking. 2.0 ...

  • @MarioNawfal: Deadlines breathing down your neck? Chill.

Newly updated Grok Imagine doesn’t sweat.

Your only job...


19. AlexFinn (Group Score: 60.1 | Individual: 25.6)

Cluster: 3 tweets | Engagement: 2079 (Avg: 916) | Type: Tech

It’s over. Anthropic just banned OpenClaw.

Uncensored thoughts:

  1. Massive mistake that will come back to bite them

  2. Open source needs to win. If you have a local model running on your Mac mini, no corporation will ever be able to ban you

  3. ChatGPT 5.4 is the best model. But it sucks compared to Opus in OpenClaw. I will continue to pay for the Anthropic API

  4. I have no doubt the next OpenAI model will be optimized for Openclaw and be excellent

  5. In 6 months the local models will be as good as opus 4.6 and all of this will be forgotten

  6. It feels like, from a consumer sentiment perspective, things have flipped for OpenAI and Anthropic. They were the darlings when Opus 4.5 came out

  7. Going to the Kanye concert right now please don’t spoil the stage or set list in the replies

  8. The best OpenClaw setup is now Opus as the orchestrator, then much cheaper models as the execution layer. If you do this properly you won’t be paying much more than $200 a month. I’m using Gemma 4 and Qwen 3.5 for execution on my DGX Spark and Mac Studio

QT @bcherny: Starting tomorrow at 12pm PT, Claude subscriptions will no longer cover usage on third-party tools like OpenClaw.

You can still use these tools with your Claude login via extra usage bundles (now available at a discount), or with a Claude API key.

See 2 related tweets

  • @AlexFinn: I told you so.

For months I’ve been telling you to buy Mac Minis, Mac Studios, and DGX Sparks

I tol...

  • @AlexFinn: RT @AlexFinn: It’s over. Anthropic just banned OpenClaw.

Uncensored thoughts:

  1. Massive mistake ...

20. MarioNawfal (Group Score: 59.6 | Individual: 31.8)

Cluster: 2 tweets | Engagement: 940 (Avg: 774) | Type: Tech

🇺🇸🇮🇷 A U.S. pilot is still stuck in Iran right now… and this is exactly what he was trained for years ago.

Every American fighter pilot goes through SERE. Survival, Evasion, Resistance, Escape. It’s a 19-day grind. Barely any sleep, constant pressure. About 6 days out in the forest dealing with heat, cold, hunger… learning shelter building, navigation, camouflage, and what to do after ejection. Basically how to stay alive and not get spotted.

This exists for a reason. In World War II, around 75% of downed Navy pilots survived the crash… but only 5% made it back. They didn’t know what to do after hitting the ground. SERE fixed that gap.

It runs in phases. Survive. Evade. Resist. Escape.

Resistance is the hardest part. Interrogation pressure, mind games, propaganda setups, forced labor simulations, mock captivity. The goal isn’t to be a hero… it’s to say less, remember less, give nothing useful.

And escape isn’t just running. Sometimes it’s waiting… knowing when not to move.

A lot of it is still classified. People who’ve done it don’t really talk about it.

One crew member already made it out. So yeah, the training works.

The second is still there.

And every hour that passes… that training is the only thing between him and a camera.

Source: USAF SERE, Support Our Troops, The National Interest

QT @MarioNawfal: 🚨🇺🇸🇮🇷 Two U.S. aircraft were downed in Iran yesterday. One airman is still missing. Here's what probably happened:

The F-15E wasn't on a routine patrol. It was on a deep penetration strike into southwestern Iran. Kohgiluyeh province, mountainous, layered defenses, well inside hostile territory. High-value target. High risk.

Iran's air defenses (likely Bavar-373, repositioned mobile SAMs, or a surviving S-300 battery) finally scored one.

Both crew ejected successfully. That's where things got complicated.

They probably drifted apart. One landed somewhere accessible, picked up within hours by Pave Hawks under fire.

The other might have come down in harder terrain: valleys, vegetation, or too close to IRGC-patrolled ground. Still missing.

Then it escalated.

An A-10 scrambled to cover the rescue; low, slow, doing what Warthogs do. IRGC forces and militias rushed the area.

The A-10 took fire (MANPADS, AAA, possibly both) nursed itself to friendly airspace near the Gulf, and the pilot ejected safely.

One rescue mission, one additional aircraft hit. The CSAR itself became a battle.

The missing airman is almost certainly evading right now. SERE training, rough terrain, intermittent beacon.

Iran has already mobilized locals with bounties. IRGC has sealed the area. U.S. spec ops are almost certainly on the ground covertly.

Worst case: he's been captured, and Iran gets the propaganda win of the war.

People say this was air supremacy failing. It was actually one good Iranian shot triggering a cascading ground game that the U.S. is now racing to control.

Source: NBC News, CBS News, Reuters

See 1 related tweet

  • @MarioNawfal: 🇺🇸🇮🇷 What's happening right now inside Iran to find the missing pilot...

Somewhere in southern Iran...