Published on: April 22, 2026
Tech Twitter Highlights (科技推特精选)
Author: geeknotes
Tech Daily Briefing, April 22, 2026
Today's top tech conversations are led by @markgurman, whose post on Apple's choice of John Ternus as CEO garnered the highest engagement. Key themes across the top stories include new AI models, model training, and agents. The community is actively discussing recent developments in AI, engineering practices, and startup strategies.
1. markgurman (Group Score: 462.6 | Individual: 36.4)
Cluster: 23 tweets | Engagement: 1200 (Avg: 1838) | Type: Tech
Apple picked Ternus because of his age and belief that he could reinvent Apple’s product lineup and compete against AI-savvy competitors. Longtime colleagues describe Ternus as someone willing to make clear calls, in contrast to Cook’s more deliberative, consensus-based approach.

QT @markgurman: NEW: Apple bets on new CEO John Ternus to bring back some Jobs-era decisiveness. His mandate is to put Apple back on top as a product innovator. A look at what’s to come and his key challenges. https://t.co/SmzXpXy7AV
See 22 related tweets
- @MarioNawfal: 🇺🇸 Apple just handed the keys to the hardware guy.
Tim Cook is stepping down as CEO on September 1,...
- @WesRoth: After nearly 15 years at the helm, Tim Cook announced he is stepping down as Apple’s Chief Executive...
- @cryptopunk7213: incredibly fucking bullish john ternus for apple ceo. could not have come at a better time:
he’s ...
- @markgurman: New Apple CEO John Ternus to employees: “We are about to change the world once again” with an “incre...
- @chandrarsrikant: RT @markgurman: Apple picked Ternus because of his age and belief that he could reinvent Apple’s pro...
2. TheRealAdamG (Group Score: 355.3 | Individual: 44.0)
Cluster: 14 tweets | Engagement: 1368 (Avg: 225) | Type: Tech
RT @OpenAI: Introducing ChatGPT Images 2.0
A state-of-the-art image model that can take on complex visual tasks and produce precise, immediately usable visuals, with sharper editing, richer layouts, and thinking-level intelligence.
Video made with ChatGPT Images https://t.co/3aWfXakrcR
See 13 related tweets
- @gdb: really incredible what you're now able to create with a little bit of compute.
excited for new appl...
- @BoWang87: A complete cell image by chatGPT Image 2.0 🔥 https://t.co/XlyDVdpcj3 QT @OpenAI: Introducing Chat...
- @JoshKale: ChatGPT Images 2.0 Is Mind Blowingly Great 🤯 The video below is OpenAI's blog post made entirely fro...
- @Yuchenj_UW: Just tried gpt-image-2.
It is really good. OpenAI is finally leading the image gen again. https://t...
- @anvisha: We too like chameleons at @trymoda 🦎 QT @OpenAI: Introducing ChatGPT Images 2.0
A state-of-the-a...
3. zephyr_z9 (Group Score: 352.8 | Individual: 31.1)
Cluster: 13 tweets | Engagement: 60 (Avg: 108) | Type: Tech
WAT?? So Elon will buy Cursor

QT @SpaceX: SpaceXAI and @cursor_ai are now working closely together to create the world’s best coding and knowledge work AI.
The combination of Cursor’s leading product and distribution to expert software engineers with SpaceX’s million H100 equivalent Colossus training supercomputer will allow us to build the world’s most useful models.
Cursor has also given SpaceX the right to acquire Cursor later this year for $10 billion for our work together.
See 12 related tweets
- @negligible_cap: *SPACEX SAYS HAS RIGHT TO ACQUIRE CURSOR FOR $60 BILLION
So seems like SpaceX effectively has a cal...
- @Vtrivedy10: 1. what?? QT @SpaceX: SpaceXAI and @cursor_ai are now working closely together to create the worl...
- @twistartups: "Cursor has also given SpaceX the right to acquire Cursor later this year for 10...
- @petergyang: Pretty wild QT @SpaceX: SpaceXAI and @cursor_ai are now working closely together to create the wo...
- @damianplayer: HOLY SHIT!
SpaceXAI might buy cursor by end of year for $60B
read the last line. https://t.co/GlB...
4. btibor91 (Group Score: 204.5 | Individual: 36.1)
Cluster: 7 tweets | Engagement: 1574 (Avg: 535) | Type: Tech
OpenAI is preparing Agents in ChatGPT (codename Hermes), including an agent builder (studio), templates, schedules, an option to use your agent in Slack, and the ability to add apps, skills, files, memory, instructions, and more
- "Keep work moving 24/7 with agents"
- "Start with a proven workflow - Pick a template and get your agent up and running in minutes"
- "Build agents that reply in Slack - Add agents to Slack to handle common questions, without the back-and-forth or manual digging"
- "Create agents tailored to how you work - Customize each agent with tools and skills, then schedule when it runs"
And a few other new changes, including:
"ImageGen likeness customization" (reference photo - this is the photo that ChatGPT refers to when you create an image of yourself) & "Images 2.0 Giveaway"
"Audio summary" (public-radio style recap/podcast, executive briefing, study guide, etc.)
See 6 related tweets
- @petergostev: Can't believe OpenAI have acquired @garrytan's gstack QT @testingcatalog: OpenAI is working on a ...
- @Shashikant86: Looks like even OpenAI loves Hermes Agents. This is different Hermes but Python stack is core of the...
- @btibor91: https://t.co/VCuBVEvKI7 QT @btibor91: OpenAI is preparing Agents in ChatGPT (codename Hermes) inc...
- @RoundtableSpace: OPENAI MAY BE PREPARING A MUCH BIGGER AGENT LAYER INSIDE CHATGPT.
Templates, schedules, Slack use, ...
- @AndrewCurran_: 'OpenAI is preparing Agents in ChatGPT (codename Hermes)' @Teknium QT @btibor91: OpenAI is prepar...
5. tunguz (Group Score: 192.1 | Individual: 38.2)
Cluster: 7 tweets | Engagement: 317 (Avg: 91) | Type: Tech
They are using their workers to train their AI and replace them. Utterly cringe and dystopian.

QT @StockMKTNewz: Mark Zuckerberg and Meta Platforms $META just sent a memo to employees saying
Meta Platforms is installing a new tracking software on the computers of all employees in the United States 🇺🇸 so it can train its AI
Meta said the tracking tool will run on a list of work-related apps and websites
The tool will capture stuff like mouse movements, keystrokes and screenshots of what the employees are seeing on their screens - Reuters
See 6 related tweets
- @jason: Studying of teams with AI is the trend of 2026
- Study your workforce with apis, key loggers and ...
- @wallstengine: $META IS NOW TRACKING EMPLOYEE WORKFLOWS FOR AI TRAINING
Reuters says Meta plans to install softwar...
- @BoWang87: This could be the most influential paper in the history of AI 😅 https://t.co/ATjmsX7vBx QT @Stock...
- @negligible_cap: $META is planning to install tracking software on US employee computers to capture their workflow da...
- @Reuters: Exclusive: Meta is installing new tracking software on US-based employees' computers to capture mous...
6. Reuters (Group Score: 162.0 | Individual: 22.8)
Cluster: 9 tweets | Engagement: 69 (Avg: 108) | Type: Tech
Jeff Bezos' AI lab nears $38 billion valuation in funding deal, FT reports https://t.co/8fdM2NT2Z6
See 8 related tweets
- @BusinessInsider: Project Prometheus, the secretive AI startup cofounded by Jeff Bezos, is raising around $10 billion ...
- @business: Jeff Bezos is close to finalizing a $10 billion funding round for his AI startup that’s developing m...
- @unusual_whales: Jeff Bezos is close to finalizing a $10 billion funding round for his AI startup that’s developing m...
- @Cointelegraph: 🚨 JUST IN: Jeff Bezos's AI lab, Project Prometheus, is closing a 38B valuat...
- @TeksEdge: Do you think Jeff Bezos' new AI startups and its $10 billion funding will get “Project Prometheus” o...
7. sarahmsachs (Group Score: 156.5 | Individual: 51.7)
Cluster: 5 tweets | Engagement: 848 (Avg: 140) | Type: Tech
When I saw our team's evals of Kimi 2.6, I thought "ok, things are gonna get interesting now".
This is the first open-weight model that plays like a top-class agentic model. Watching it go through ambiguous and meticulous chained tool work successfully puts it squarely in the wheelhouse of Opus 4.6. We're looking at an open weight model, but with much cheaper direct inference provider pricing. For a subclass of our eval set, it's outperforming GPT 5.2. We're about to undergo a gigantic industry shift.
Open weight is no longer only for those who fine-tune or those who want on-prem. It's an actual, reliable option, on its quality/price/latency profile, for difficult agentic work.
It's not perfect. It's token hungry, relatively slow, and can get stuck in “thinking loops”. But those are things we can engineer around. For the value it offers, and how it positions itself against the major labs, this is a dramatic day for open-weight models.
We sprinted as a team and worked closely with @FireworksAI_HQ to get this to our customers on day 0. No one should wait to try out a change like this. Try it yourself and tell me where it's working for you.

QT @akothari: Kimi K2.6 just landed in @NotionHQ. Open‑weight, but absolutely a heavyweight.
Take it for a drive! 🏎️ https://t.co/ex07LDQkPG
See 4 related tweets
- @FireworksAI_HQ: Nothing better than sprinting on a day-0 release alongside partners that run just as fast. Thank you...
- @bgurley: Open models keep coming… QT @sarahmsachs: When I saw our team's evals of Kimi 2.6, I thought "ok,...
- @TeksEdge: 🚀 Kimi K2.6 debuted #4 and is the #1 open-weights agent model
📊 Claw-Eval • Pass^3: 62.3% → #4 over...
- @NotionHQ: RT @sarahmsachs: When I saw our team's evals of Kimi 2.6, I thought "ok, things are gonna get intere...
8. _akhaliq (Group Score: 153.7 | Individual: 60.5)
Cluster: 4 tweets | Engagement: 590 (Avg: 53) | Type: Tech
RT @akseljoonas: Introducing ml-intern, the agent that just automated the post-training team @huggingface
It's an open-source implementation of the real research loop that our ML researchers do every day. You give it a prompt, it researches papers, goes through citations, implements ideas in GPU sandboxes, iterates and builds deeply research-backed models for any use case. All built on the Hugging Face ecosystem.
It can pull off crazy things:
We made it train the best model for scientific reasoning. It went through citations from the official benchmark paper. Found OpenScience and NemoTron-CrossThink, added 7 difficulty-filtered dataset variants from ARC/SciQ/MMLU, and ran 12 SFT runs on Qwen3-1.7B. This pushed the score 10% → 32% on GPQA in under 10h. Claude Code's best: 22.99%.
In healthcare settings it inspected available datasets, concluded they were too low quality, and wrote a script to generate 1100 synthetic data points from scratch for emergencies, hedging, multilingual etc. Then upsampled 50x for training. Beat Codex on HealthBench by 60%.
For competitive mathematics, it wrote a full GRPO script, launched training with A100 GPUs on https://t.co/udm7xGpNzR, watched rewards climb and then collapse, and ran ablations until it succeeded. All fully backed by papers, autonomously.
How it works:
ml-intern makes full use of the HF ecosystem:
- finds papers on arxiv and https://t.co/brvCC7fLPa, reads them fully, walks citation graphs, pulls datasets referenced in methodology sections and on https://t.co/hrJuRkRyzi
- browses the Hub, reads recent docs, inspects datasets and reformats them before training so it doesn't waste GPU hours on bad data
- launches training jobs on HF Jobs if no local GPUs are available, monitors runs, reads its own eval outputs, diagnoses failures, retrains
ml-intern deeply embodies how researchers work and think. It knows what data should look like and what good models feel like.
Releasing it today as a CLI and a web app you can use from your phone/desktop. CLI: https://t.co/l3K1PslZ1n Web + mobile: https://t.co/orko5srL4H
And the best part? We also provisioned $1k of GPU resources and Anthropic credits for the quickest among you to use.
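The loop described above (find papers, implement an idea, train, evaluate, iterate) can be sketched in a few lines. This is a hypothetical simplification for illustration; `search_papers`, `implement`, and `evaluate` are placeholder callables, not the released ml-intern API:

```python
def research_loop(prompt, search_papers, implement, evaluate, rounds=3):
    """Hypothetical sketch of an autonomous ML-research loop: look up
    relevant work, implement a candidate in a sandbox, evaluate it,
    and keep the best-scoring result across rounds."""
    best_score, best_candidate = float("-inf"), None
    for _ in range(rounds):
        papers = search_papers(prompt)         # e.g. arxiv / HF papers
        candidate = implement(prompt, papers)  # build + train in a sandbox
        score = evaluate(candidate)            # run evals, read outputs
        if score > best_score:                 # keep only improvements
            best_score, best_candidate = score, candidate
    return best_candidate, best_score
```

In the real system each callable would be an agent step involving GPU jobs and citation walking; here they are stand-ins so the control flow stays visible.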
See 3 related tweets
- @omarsar0: Karpathy's autoresearch repo started an impressive trend.
Agents can now train AI models to build ...
- @ClementDelangue: HF becoming the platform for agents (assisted by their humans) to use and build AI (rather than just...
- @_lewtun: RT @Thom_Wolf: Love this work from Aksel and the post-training team at Hugging Face!
Turns out the ...
9. teortaxesTex (Group Score: 152.2 | Individual: 39.0)
Cluster: 5 tweets | Engagement: 113 (Avg: 58) | Type: Tech
All of Meta's enormous investments and painful course correction, and they can't even keep a lead over a Chinese startup tinkering with a nearly year-old base model. Still little has changed (Qwen-Max, GLM-5.2, V4 will also solidly exceed Muse Spark. Maybe even Step 3.6 Flash)

QT @ArtificialAnlys: Moonshot’s Kimi K2.6 is the new leading open weights model. Kimi K2.6 lands at #4 on the Artificial Analysis Intelligence Index (54) behind only Anthropic, Google, and OpenAI (all 57)
Key takeaways:
➤ Increase in performance on agentic tasks: @Kimi_Moonshot's Kimi K2.6 achieves an Elo of 1520 on our GDPval-AA evaluation, which is a marked improvement over Kimi K2.5’s Elo of 1309. GDPval-AA is our leading metric for general agentic performance, measuring the performance on knowledge work tasks such as preparing presentations and analysis. Models are given code execution and web browsing tools in an agentic loop via our open source reference agentic harness called Stirrup. This continues Kimi K2.6’s strength in tool use, maintaining a 96% score on τ²-Bench Telecom, placing it among other frontier models in this category.
➤ Low hallucination rate: Kimi K2.6 scores 6 on the AA-Omniscience Index, our knowledge evaluation measuring both accuracy and hallucination rate. This score is primarily driven by a comparatively low hallucination rate of 39% (reduced from Kimi K2.5’s 65%), indicating a greater capability to abstain rather than fabricate knowledge when the model is uncertain. Kimi K2.6’s low hallucination rate places it similarly to other models such as Claude Opus 4.7 (36%) and MiniMax-M2.7 (34%).
➤ High token usage: Kimi K2.6 demonstrates high token usage, but is in line with other frontier models in the same intelligence tier. To run the full Artificial Analysis Intelligence Index, Kimi K2.6 used ~160M reasoning tokens. This is slightly lower than Claude Sonnet 4.6 (~190M reasoning tokens) but much higher than GPT 5.4 (~110M reasoning tokens).
➤ Open weights: Kimi K2.6 is a Mixture-of-Experts (MoE) model with 1T total parameters and 32B active, same as the previous two generations of models Kimi K2 Thinking and Kimi K2.5. Kimi K2.6 again pushes the open weights frontier in intelligence.
➤ Third Party Access: Kimi K2.6 is accessible through Moonshot’s First Party API as well as third party API providers Novita, Baseten, Fireworks, and Parasail
➤ Multimodality: Kimi K2.6 supports Image and Video input and text output natively. The model’s max context length remains 256k.
Further analysis in the threads below.
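The harness setup described above, where a model is given code execution and browsing tools in an agentic loop, follows a common pattern. A minimal sketch of that general loop, assuming a `model` callable that returns (action, argument) pairs; this illustrates the pattern, not Stirrup's actual code:

```python
def run_agent(task, model, tools, max_steps=20):
    """Generic agentic loop: the model sees the transcript so far and
    either calls a named tool or finishes with an answer."""
    transcript = [("task", task)]
    for _ in range(max_steps):
        action, arg = model(transcript)   # model decides the next step
        if action == "finish":
            return arg
        observation = tools[action](arg)  # e.g. code exec, web browse
        transcript.append((action, observation))
    return None                           # step budget exhausted

# Toy demonstration: a scripted "model" that runs code once, then finishes.
def scripted_model(transcript):
    if len(transcript) == 1:
        return ("run_code", "2 + 2")
    return ("finish", transcript[-1][1])

toy_tools = {"run_code": lambda src: str(eval(src))}
```

Benchmarks like GDPval-AA then score what the loop produces; the harness itself stays this simple.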
See 4 related tweets
- @kimi_moonshot: Kimi is the current open-source SOTA on Artificial Analysis QT @ArtificialAnlys: Moonshot’s Kimi ...
- @scaling01: How can you not love Kimi QT @ArtificialAnlys: Moonshot’s Kimi K2.6 is the new leading open weigh...
- @nummanali: Kimi K2.6 #4! Only behind the Big 3
And it supports video input
Cursor Composer 2 is a RL’d Kimi ...
- @theobearman: AA saying that a Chinese open source model is now more capable than Opus 4.6 Max (released 2.5 month...
10. rseroter (Group Score: 142.3 | Individual: 36.6)
Cluster: 5 tweets | Engagement: 198 (Avg: 41) | Type: Tech
RT @OfficialLoganK: Introducing our biggest upgrades to the Deep Research API yet... including Deep Research Max (our SOTA system), MCP support, Native charts & infographics, planning mode, full tool support (including Google tools), full multi-modal input support, & real-time progress streaming! https://t.co/bMbnCysqsC
See 4 related tweets
- @teortaxesTex: insane BrowseComp score, this will be of real economic value. Google has finally rendered Google Obso...
- @patloeber: Big update for the Gemini Deep Research Agent! QT @OfficialLoganK: Introducing our biggest upgrad...
- @testingcatalog: Google has launched an updated Deep Research and Deep Research Max in the Gemini API
> We are la...
- @Google: RT @sundarpichai: We are launching two powerful updates to Deep Research in the Gemini API, now with...
11. ai_explorer25 (Group Score: 140.5 | Individual: 30.9)
Cluster: 5 tweets | Engagement: 3 (Avg: 165) | Type: Tech
OpenAI Chronicle treats memory like saved snapshots you can revisit later.
AirJelly builds memory around you. That shift isn’t just messaging, it changes how the product behaves.
One lets you go back and check what happened. The other helps the system pick up on who you are over time.

QT @airjellyAI: 🚨The world just started talking about AI Screen Context.
We've been building something deeper — AirJelly can see your screen and remembers not just what you did, but who you are.
No intent missed. No task overdue.
🪼Introducing AirJelly today, the world's first context-aware proactive agent, that lives on your desktop, grows with your context, and proactively gets work done.
See 4 related tweets
- @Meer_AIIT: most AI agents get and collect memories from what you input.
so teams have been working on this co...
- @Parul_Gautam7: so we really went from “AI that answers questions” to AI that just… does the work with you
underst...
- @TheoBuildsAI: OpenAI Chronicle made one thing clear.
AI memory is finally becoming real.
But there’s a huge gap ...
- @Origin_AI_01: OpenAI’s Chronicle is a great DVR, but @airjellyAI is a Chief of Staff.
One records your screen; t...
12. TheoBuildsAI (Group Score: 110.9 | Individual: 28.4)
Cluster: 4 tweets | Engagement: 68 (Avg: 285) | Type: Tech
Getting customers might not be the hardest part anymore

QT @fin465: Introducing Origami.chat
The world’s first AI that finds you new customers.
1000+ companies use Origami for their outbound.
RT + reply with your website and we’ll send you 5 of your perfect customers right now👇 https://t.co/fX91tX3344
See 3 related tweets
- @Origin_AI_01: In 2024, if you are manually building lead lists or spending $20k a year on data, you aren’t just be...
- @heynavtoor: "Reply with your website and we'll send you 5 perfect customers."
That's either the most confident ...
- @ycombinator: RT @fin465: Introducing Origami.chat
The world’s first AI that finds you new customers.
1000+ com...
13. FirstSquawk (Group Score: 109.2 | Individual: 24.2)
Cluster: 7 tweets | Engagement: 72 (Avg: 88) | Type: Tech
APPLE'S INCOMING CEO TERNUS: WE ARE ON THE VERGE OF TRANSFORMING THE WORLD ONCE MORE, WITH AI SET TO CREATE NEARLY BOUNDLESS POTENTIAL.
See 6 related tweets
- @Forbes: New Apple CEO’s Big Challenge: Readying The $4 Trillion Behemoth For The AI Era https://t.co/jVtVvas...
- @StockSavvyShay: Incoming $AAPL CEO John Ternus told employees he is “especially excited” to step into the role now b...
- @wallstengine: Apple’s incoming CEO said AI creates “almost unlimited potential,” adding that “we are about to chan...
- @Reuters: 🔊 ‘I think the big challenge for Apple right now is definitely in the software realm with things lik...
- @WIRED: The soon-to-exit Apple CEO went all in on services. Now, the incoming CEO, John Ternus, will need to...
14. StockSavvyShay (Group Score: 105.6 | Individual: 31.9)
Cluster: 6 tweets | Engagement: 1134 (Avg: 442) | Type: Tech
$META is reportedly installing tracking software on U.S. employee computers to capture workflow data for AI training.
An internal memo says the tool will log mouse movements, keystrokes and screen snapshots. https://t.co/i45IXU0L18
See 5 related tweets
- @Cointelegraph: 🚨 NEW: Meta to install tracking software on U.S. employee computers to collect workflow data for AI ...
- @secureainow: RT @maggiemoda: Internal Meta memo exposed the company’s plans to install tracking software on emplo...
- @Techmeme: Meta is installing tracking software on US staffers' computers to capture mouse movements, clicks, a...
- @BusinessInsider: Meta sparks internal controversy with new AI training software that tracks employee keystrokes and m...
- @unusual_whales: BREAKING: Meta, $META, is installing tracking software on its employees' computers to train its AI....
15. BrianRoemmele (Group Score: 104.5 | Individual: 37.8)
Cluster: 3 tweets | Engagement: 293 (Avg: 419) | Type: Tech
Boom!
A self confessed “blue collar junk dealer” just may save AI from model collapse.
I just had a pallet of ~1600 pounds of forgotten Filmsort Microfiche punch cards delivered with a freaking forklift!
This may be some of the most important and never seen data that likely cost billions to produce.
I am training AI models on off-line, high-protein data, and this is clean, pure protein.
But fear not, I have 2300 pounds plus 1000s of more pounds donated to me from a liquidation warehouse.
Now get this: the owner of this warehouse was so taken by what I am doing they saved, inventoried and paid for their truck to drive 1000s of miles to deliver this pallet to me.
It took a self confessed “blue collar junk dealer” to save AI from model collapse and he said “I know nothing but even I get it”.
He doesn’t have the money to spend to help me, but he did and I love this guy.
Someday with their permission you will know his name, I will name something meaningful after him.
I love this guy and all his wisdom from decades of cleaning up after “brilliant” minds burn down companies and government departments.
He will not sell anything to anyone he even slightly thinks is in AI training, and I showed him how to know it. Some AI companies after reading what I am doing finally have hired smart but clueless folks to “call around”. Here is what my friend said “we are a small community that know each other, none of us will sell anything they can use to train AI in our inventories”. In fact he is leading a meetup of owners to talk about this and helping me more, perhaps 28 warehouses of data! Not a joke there is a lot more.
So it seems my decades of dedication, being called crazy and told I didn't "know anything about AI", just may have been a stupid thing the smart folks did.
Perhaps let me strike that: it was, smart do-more-AI-training folks. Do more.
Thusly I am so blessed by the “blue collar junk dealer” and all of his friends around the world.
Thank you folks for your support, your subscriptions here, your coffees, your https://t.co/tcKeuiQyql memberships, your coin sponsorship, your random gifts, your words and your prayers keep me going. I just don’t know where yet, these headlights in my car only have 30 feet ahead but I know the compass setting.
1000s of pounds of data behind me and 1000s of pounds ahead of me…
I just am going to need a lot more time.

QT @BrianRoemmele: Wow!
I now have secured over 1600 pounds of forgotten Filmsort Microfiche punch cards! We also have two warehouse locations with 1000s of pounds yet to be determined.
This 1600 pounds has more documents never digitized: parts images but, most importantly, the work from billions of dollars of research. This is part of my weekend projects; I'll fit it into my schedule. I am moving to 20 cards scanned for every 20 seconds on my new light table and newly rebuilt local DeepSeek OCR encoding.
Now I have to seriously find a way to distribute these as collector’s items.
Somehow I think I will be disowned if I carry around a few tons of punch cards from the 1960s.
See 2 related tweets
- @BrianRoemmele: Fun fact. No single person in the government knows where all of our data is and was. Sure they know ...
- @BrianRoemmele: RT @BrianRoemmele: Boom!
A self confessed “blue collar junk dealer” just may save AI from model col...
16. Shipper_now (Group Score: 100.3 | Individual: 38.1)
Cluster: 3 tweets | Engagement: 84 (Avg: 37) | Type: Tech
Introducing Shipper - AI employees that can do real COO work.
Over the last few months we’ve been obsessed with turning Claude agents into proper operators for our own team.
And it’s WORKING.
Shipper is now our most valuable teammate - it builds products, fixes bugs, manages ops, coordinates launches, tracks performance and a ton more.
Today we’re launching this publicly so everyone can create their own AI coworkers.
Each coworker (or Shipper) has its own computer, memory, tools and brain that runs on Claude Code Opus 4.7.
Shipper can work autonomously for hours, operates across your apps, infrastructure and systems, and gets smarter as you give it more context.
It’s not just a personal assistant - your whole team can rely on it from anywhere, and schedule it to run on autopilot.
And here’s the best part - we’ve already given it 100+ skills and connectors for real COO work: → build and ship products end-to-end → monitor and fix bugs automatically → manage infrastructure and deployments → analyze usage and decide what to do next → run onboarding and user flows → launch updates and improvements → manage internal tools and workflows → connect to the stack your business runs on
+++ lots more.
We’ve also connected everything directly into Shipper, so you don’t need to plug in any API keys or do anything else.
RT and comment “Shipper” as we're randomly giving out free credits.
See 2 related tweets
- @chddaniel: Introducing Shipper 2.0 - AI coworkers that can do real COO work.
Over the last few months, we’ve b...
- @Shipper_now: Software isn’t technical work anymore. It’s 100% creative.
Introducing Shipper. The first AI that c...
17. aiDotEngineer (Group Score: 99.8 | Individual: 36.3)
Cluster: 3 tweets | Engagement: 22 (Avg: 27) | Type: Tech
🆕Building Generative Image & Video models at Scale
A lot of interest in image gen recently! @sedielem is here to give a concise State of the Art overview of Generative Image and Video models, from Modeling and Architecture to Distillation and Control Signals!!
From his Research Scientist post at @GoogleDeepMind, his storied career from early AlphaGo to now leading research on Veo and @NanoBanana (and Gemini diffusion?), and one of the most beloved AI blogs in the world, we were honoured to host this special talk from Sander on home turf in London.
Enjoy this special presentation!

QT @karpathy: Common Q: Can you train language model w diffusion? Favorite A: read this post (the whole blog is excellent)
(Roughly speaking state of the art generative AI is either trained autoregressively or with diffusion. The underlying neural net usually a Transformer.)
See 2 related tweets
- @swyx: do not miss. one of the INSANE gets courtesy of @osanseviero and the @GoogleDeepMind london avengers...
- @sedielem: RT @aiDotEngineer: 🆕Building Generative Image & Video models at Scale
A lo...
18. Mayhem4Markets (Group Score: 99.3 | Individual: 35.0)
Cluster: 3 tweets | Engagement: 5 (Avg: 62) | Type: Tech
Text is a lossy compression of reality. Anyone who has tried to describe a dream and felt the words dissolve on the way out knows this in their bones. Language was built for communication, not comprehension.
The people who built the original transformer models understood that -- and moved on from there.
Odyssey-2 Max is being framed as the GPT-2 moment for world models. The distinction every AI researcher should actually be making is the one between video models and world models.
Sora makes clips. World models make simulations. One has a fixed ending. The other has no endpoint at all.
That's not a spec bump, that's a different species entirely.
The founding team came from Tesla FSD, Waymo, and Wayve. That pedigree isn't incidental. This is starting to look like pretrained physical intelligence. A system that spent years watching how the world actually moves before it ever tried to act in it.
Leading scores on physics benchmarks aren't about video quality, they're about understanding dynamics and human behavior at a level that was previously inaccessible.
The implications are where it gets interesting.
For robotics, the bottleneck was never hardware. It was training data. World models collapse that constraint entirely. Infinite, labeled, physics-accurate data on demand, edge cases included.
The moat for Tesla Optimus, Figure, and 1X rebases from "most real-world hours" to "best simulator." The robotics industry stops racing on data collection and starts racing on simulation fidelity. That rebase changes everything.
For the $200B games industry, world models prompt entire worlds into existence with physics baked in. The next Fortnite is a prompt, not a game. Every player gets a universe no one else will see, with NPCs that actually behave like agents.
Unreal and Unity are engines for a pre-world-model era. The level designers at Epic and Unity should be paying very close attention right now.
The team is calling this the GPT-2 moment for world models.
They're right.
The capability is real. Scaling works.
The next few versions will be something else entirely. 🦾

QT @odysseyml: It’s time to go beyond language models.
Introducing Odyssey-2 Max, our most powerful world model yet. It materially advances the SOTA in physical accuracy.
This is a big step toward models that simulate and interact with the world in real time.
A new intelligence entirely! https://t.co/IT6ffhmLKL
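The clip-vs-simulation distinction drawn above can be made concrete: a video model maps a prompt to a fixed sequence of frames, while a world model is a state-transition function an agent can act against indefinitely. A toy sketch (the model callables here are stand-ins, not Odyssey's API):

```python
def generate_clip(video_model, prompt, num_frames):
    """Video model: one prompt in, a fixed-length clip out.
    The ending is decided the moment generation starts."""
    return [video_model(prompt, t) for t in range(num_frames)]

def simulate(world_model, state, policy, steps):
    """World model: a closed loop with no fixed endpoint. Each action
    feeds back into the next predicted state, so the trajectory
    depends on what the agent does."""
    trajectory = [state]
    for _ in range(steps):
        action = policy(state)              # agent chooses an action
        state = world_model(state, action)  # model predicts next state
        trajectory.append(state)
    return trajectory
```

The robotics-data argument above falls out of the second function: `simulate` can be run as many times as you like, with any policy, to mint labeled trajectories on demand.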
See 2 related tweets
- @ianmiles: I’ve been sitting here with the Odyssey-2 Max announcement open for the better part of an hour now, ...
- @shiri_shh: Odyssey-2 max is basically ChatGPT but for the physical world
you type a prompt, it doesn't give yo...
19. garrytan (Group Score: 92.4 | Individual: 31.8)
Cluster: 3 tweets | Engagement: 86 (Avg: 272) | Type: Tech
CrabTrap is a big deal for the OpenClaw community

QT @pedroh96: OpenClaw is the fastest-growing open source project, but there are no stories of running it safely in production at scale. As we started deploying agents internally at @brexHQ, we couldn’t stop thinking about this question.
Agents work, but nobody wants to give them real credentials. Instead of waiting for a solution to emerge, we decided to try a novel approach: using LLMs to judge the network traffic of an AI agent.
Today we’re announcing CrabTrap, an open-source proxy that intercepts every outbound request and blocks risky activity using LLMs, before it ever hits an external API. The results are promising; we believe it’s a meaningful step forward in the security of agent harnesses in production environments.
Try it out today.
(As a side note, it was really fun to work personally on a real systems problem again. And btw, if you want to work at a place where the CEO is building proxies at night, we’re hiring!)
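The idea of judging an agent's network traffic before it leaves the box can be sketched as a tiny egress filter. Here the judge is a stub heuristic standing in for the LLM call; the type names and risk markers are illustrative, not CrabTrap's real implementation:

```python
from dataclasses import dataclass

@dataclass
class OutboundRequest:
    method: str
    url: str
    body: str

def judge_request(req: OutboundRequest) -> bool:
    """Stand-in for the LLM judgment: in the real proxy the request is
    summarized and sent to a model that answers allow/block. A keyword
    heuristic keeps this sketch runnable offline."""
    risky_markers = ("api_key=", "password=", "DROP TABLE")
    payload = f"{req.url} {req.body}"
    return not any(marker in payload for marker in risky_markers)

def proxy_outbound(req: OutboundRequest) -> str:
    """Intercept every outbound request; forward only if allowed."""
    return "FORWARDED" if judge_request(req) else "BLOCKED"
```

The design point is that the check sits in the network path, so the agent never needs raw credentials trusted to its own judgment.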
See 2 related tweets
- @snowmaker: Brex just open sourced the key piece of infrastructure that enabled them to run their whole company ...
- @ycombinator: RT @pedroh96: OpenClaw is the fastest-growing open source project, but there are no stories of runni...
20. AlexFinn (Group Score: 92.0 | Individual: 45.4)
Cluster: 3 tweets | Engagement: 2151 (Avg: 510) | Type: Tech
It happened.
An open weights model that benchmarks higher than Opus 4.6 just dropped
If you have 2 Mac Studios w/ 512gb, you can run Opus 4.6 level intelligence completely for free on your desk
I warned you this would happen months ago. Now Mac Studios and Mac Minis are sold out
The next Mac Studio has been delayed until Q3/Q4. The price will be significantly higher
I told you this was going to happen. Intelligence explosion. Hardware bottleneck. Increased efficiency
Luckily I picked up 2 Mac Studio 512gbs, 2 Mac Minis, and a DGX Spark
I will be loading this up in the next couple of days and will have completely private super intelligence running for me 24/7
I’m telling you right now by end of year we will have a local version of Mythos. It’s 100% guaranteed
You called me crazy but every single prediction I’ve made has turned out to be true
These models will only get more efficient and require less hardware. But that hardware is only going to get more expensive
Local/open source is so obviously the future and if you’re still denying this now you are delusional

QT @Kimi_Moonshot: Meet Kimi K2.6: Advancing Open-Source Coding
🔹Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), Charxiv w/ python(86.7), Math Vision w/ python (93.2)
What's new: 🔹Long-horizon coding - 4,000+ tool calls, over 12 hours of continuous execution, with generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization). 🔹Motion-rich frontend - Videos in hero sections, WebGL shaders, GSAP + Framer Motion, Three.js 3D. 🔹Agent Swarms, elevated - 300 parallel sub-agents × 4,000 steps per run (up from K2.5's 100 / 1,500). One prompt, 100+ files. 🔹Proactive Agents - K2.6 model powers OpenClaw, Hermes Agent, etc for 24/7 autonomous ops.
🔹Claw Groups (research preview) - bring your own agents, command your friends', bots & humans in the loop.
K2.6 is now live on https://t.co/YutVbwktG0 in chat mode and agent mode. For production-grade coding, pair K2.6 with Kimi Code: https://t.co/uvoSJKyGCY
🔗 API: https://t.co/EOZkbOwCN4 🔗 Tech blog: https://t.co/9wWvgIQSS3 🔗 Weights & code: https://t.co/Be0hjs2RTP
See 2 related tweets
- @akshay_pachaar: Kimi K2.6 raises the bar for open-source models.
Moonshot released it yesterday, and for the first ...
- @AlexFinn: RT @AlexFinn: It happened.
An open weights model just dropped that benchmarks higher than Opus 4.6 ...