Today's Tech Twitter Highlights - March 4, 2026

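Today in tech: Google released Gemini 3.1 Flash-Lite, further improving AI efficiency; meanwhile, leaks about ChatGPT-5.4 point to a massive 2M-token context window and persistent memory. Engineering workflows are shifting: Claude Code added a voice mode, and LangChain is turning its focus to reliable agents in production. Startups are accelerating their move toward building their own agents and toward a modern data stack centered on DuckDB. Amid the wave of innovation, the industry picture remains complicated: OpenAI is dealing with the public fallout from its “sloppy” defense deal, while X is tightening restrictions on unlabeled AI-generated conflict content.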

1. kimmonismus (Group Score: 122.4 | Individual: 27.9)

Cluster: 7 tweets | Engagement: 300 (Avg: 445) | Type: Tech

Gemini 3.1 Flash-Lite released, and its price-performance ratio is next level

Gemini 3.1 Flash-Lite, its fastest and most cost-efficient Gemini 3 model yet, built for high-volume developer workloads.

Priced at just $0.25 per 1M input tokens and $1.50 per 1M output tokens, it delivers 2.5x faster time to first token and 45% faster output speed than 2.5 Flash, while achieving a 1432 Elo score and up to 86.9% on GPQA Diamond benchmarks.
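To make the quoted prices concrete, here is a minimal sketch of a cost estimate. The two rates come from the tweet above; the function name and the sample workload (10M input tokens, 2M output tokens) are hypothetical and only illustrate the arithmetic.

```python
# Rates quoted in the tweet above (USD per 1M tokens).
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 1.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the bill for a workload at the quoted per-token rates."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10M input tokens and 2M output tokens.
    cost = estimate_cost_usd(10_000_000, 2_000_000)
    print(f"${cost:.2f}")  # prints "$5.50"
```

At these rates, output tokens cost six times as much as input tokens, so output-heavy workloads dominate the bill.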

See 6 related tweets

@Google: RT @GoogleDeepMind: Gemini 3.1 Flash-Lite has landed.

It’s our most cost-efficient Gemini 3 series ...

@testingcatalog: BREAKING 🚨: Gemini 3.1 Flash Lite Preview is now available on Vertex AI!

"Designed for high-volume,...

@mark_k: Gemini 3.1 Flash-Lite was just released by @GoogleDeepMind, and the initial benchmarks are highly im...
@sundarpichai: Gemini 3.1 Flash-Lite is the fastest and most cost-efficient Gemini 3 series model⚡️

It outperforms...

@rseroter: Rolling out now! 🚀

It's priced at $0.25/1M input tokens and $1.50/1M output tokens. Outperforms Gem...

2. rohanpaul_ai (Group Score: 110.3 | Individual: 29.5)

Cluster: 6 tweets | Engagement: 141 (Avg: 97) | Type: Tech

Time to get rid of keyboard on Claude Code. Voice mode is rolling out now. ~5% of users today.

Press and hold the space bar to speak, then release to finish. Text appears at the cursor, and you can mix typing with voice.

Massive productivity multiplier https://t.co/tguml6OJNe

See 5 related tweets

@minchoi: oh snap... Claude Code just got voice mode.

Now you can talk to your codebase.

No extra cost. Don...

@AndrewCurran_: RT @trq212: Voice mode is rolling out now in Claude Code. It’s live for ~5% of users today, and will...
@RoundtableSpace: CLAUDE CODE VOICE MODE IS LIVE FOR 5% OF USERS TODAY, ROLLING OUT OVER THE NEXT FEW WEEKS

TYPE /VOI...

@TechCrunch: Claude Code rolls out a voice mode capability https://t.co/ae6NFFlDX7...
@rohanpaul_ai: RT @rohanpaul_ai: Time to get rid of keyboard on Claude Code. Voice mode is rolling out now. ~5% of ...

3. business (Group Score: 79.2 | Individual: 22.1)

Cluster: 5 tweets | Engagement: 291 (Avg: 125) | Type: Tech

OpenAI CEO Sam Altman says the company’s rush to forge a deal with the Defense Department, following a clash between the Pentagon and AI rival Anthropic, looked “opportunistic and sloppy” https://t.co/OYCc5sfJ3K

See 4 related tweets

@CNBC: OpenAI's Sam Altman admits ‘rushed’ deal with Defense Department after backlash https://t.co/KyPmvdF...
@Cointelegraph: 🇺🇸 JUST IN: OpenAI CEO Sam Altman admits the company's rushed Pentagon deal looked "opportunistic an...
@WSJ: OpenAI CEO Sam Altman defended his decision to allow the Pentagon to use its tools for classified wo...
@FT: OpenAI makes changes to ‘opportunistic and sloppy’ Pentagon deal https://t.co/ao8NFE39Em...

4. arena (Group Score: 76.1 | Individual: 24.9)

Cluster: 5 tweets | Engagement: 126 (Avg: 50) | Type: Tech

RT @JeffDean: ⚡ Excited to announce Gemini 3.1 Flash-Lite! We’ve set a new standard for efficiency and capability to give developers our fa…

See 4 related tweets

@yupp_ai: 📢 New Model Drop:

Gemini 3.1 Flash Lite Preview is now live on Yupp!

Available in thinking and sna...

@seconds_0: RT @OfficialLoganK: Introducing Gemini 3.1 Flash-Lite 🔦, a huge step forward on the boundary of inte...
@googledevs: RT @googleaidevs: Gemini 3.1 Flash-Lite is rolling out in preview via the Gemini API in @googleaistu...
@scaling01: Are you ready for Gemini-3.1-Flash-Lite?

well here's another PREVIEW https://t.co/io2E0Ec1Pc...

5. rohanpaul_ai (Group Score: 68.7 | Individual: 47.1)

Cluster: 2 tweets | Engagement: 409 (Avg: 97) | Type: Tech

ChatGPT-5.4 leaks so far:

A massive 2M-token context window and persistent memory features.
Full-resolution image processing, meaning the model can directly process highly detailed files in PNG, JPEG, and WebP formats without losing crucial visual data. Preserving the original image bytes helps the system properly read complex graphics such as detailed architectural drawings or high-density screenshots without missing small text.
A new priority speed tier for faster responses.

OpenAI has accidentally revealed GPT-5.4 - twice - through pull requests in its public Codex GitHub repository. Both references were quickly scrubbed via force pushes and edits, but not before screenshots circulated and the community noticed.

Prediction markets on Manifold give GPT-5.4 a 55% chance of shipping before April 2026 and 74% before June.
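The two quoted odds also imply a probability for the window in between. The sketch below works through that arithmetic, assuming both markets track the same release event and are mutually consistent; the function and key names are hypothetical.

```python
def implied_windows(before_april_pct: int, before_june_pct: int) -> dict:
    """Split two cumulative 'ships before X' odds into disjoint windows.

    Percentages are integers to avoid floating-point rounding noise.
    """
    if not 0 <= before_april_pct <= before_june_pct <= 100:
        raise ValueError("cumulative odds must be non-decreasing and within 0-100")
    return {
        "before_april": before_april_pct,
        "april_through_may": before_june_pct - before_april_pct,
        "june_or_later": 100 - before_june_pct,
    }


if __name__ == "__main__":
    # Odds quoted in the tweet: 55% before April 2026, 74% before June.
    print(implied_windows(55, 74))
    # {'before_april': 55, 'april_through_may': 19, 'june_or_later': 26}
```

Under that assumption, the markets price roughly a 19% chance of an April or May release and a 26% chance of June or later (including not shipping at all).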

The competitive pressure is obvious. Claude Opus 4.6 launched with agent teams and a 1M context window. Anthropic's Claude Code dominates the coding market with 54% share. DeepSeek V4 is training on Huawei hardware outside the NVIDIA ecosystem entirely. OpenAI cannot afford to slow down.

See 1 related tweet

@techdaily24: OpenAI released GPT-5.3 Instant on March 3, 2026, as an update to ChatGPT's most-used model. This ne...

6. starbuxman (Group Score: 65.7 | Individual: 43.2)

Cluster: 2 tweets | Engagement: 786 (Avg: 102) | Type: Tech

RT @MillieMarconnni: 🚨 BREAKING: A developer on GitHub just built a tool that turns any GitHub repo into an interactive knowledge graph and…

See 1 related tweet

@RoundtableSpace: A DEVELOPER ON GITHUB JUST BUILT A TOOL THAT TURNS ANY GITHUB REPO INTO AN INTERACTIVE KNOWLEDGE GRA...

7. Forbes (Group Score: 65.7 | Individual: 65.7)

Cluster: 1 tweet | Engagement: 3011 (Avg: 115) | Type: Tech

If you’re planning on leaving ChatGPT behind for Claude or any other AI service, there’s some business you should take care of first, particularly if you want to ensure as much of the “memory” you’ve built up with ChatGPT is transferred to the new service.

Here are some tips on how you can minimize the disruption of leaving ChatGPT: https://t.co/IXzCeC0YGp 📸: Justin Sullivan via Getty Images

8. swyx (Group Score: 64.0 | Individual: 24.5)

Cluster: 3 tweets | Engagement: 188 (Avg: 211) | Type: Tech

RT @swyx: The best startup AI Engineers I've met are all building their own agents.

I know it's a buzzword that's now a bit past the wave…

See 2 related tweets

@BW: AI coding agents promised to make software development easier. Instead they’re pushing engineers to ...
@naval: RT @paultoo: Many startups are growing fast and creating real value by building workflows, adaptors,...

9. aakashgupta (Group Score: 63.6 | Individual: 31.1)

Cluster: 3 tweets | Engagement: 46 (Avg: 324) | Type: Tech

OpenAI’s release sequencing tells you everything about where they think the real battle is.

GPT-5.3 first showed up a month ago as Codex, a developer-only coding agent. Then Codex-Spark on Cerebras hardware. Today is the first time 5.3 touches a consumer chat window, and they shipped it as Instant, the lightweight everyday model. Thinking and Pro are still nowhere.

The order matters. They’re releasing 5.3 from the bottom of the stack upward. Developers got it in February. Free and paid ChatGPT users get Instant today. The reasoning models that compete directly with Claude Opus and Gemini Thinking? Still cooking.

And the lead marketing message is “less cringe.” They’re spending a major model release cycle telling users the AI will stop saying “Stop. Take a breath.” and won’t open every answer with a three-paragraph safety disclaimer.

The hallucination numbers are real: 26.8% reduction with web, 19.7% without. But they buried the benchmarks below the tone fixes. OpenAI knows that the marginal user doesn’t care about SWE-Bench scores. They care that the chatbot stopped sounding like a therapist who took one improv class.

This is OpenAI optimizing for retention, not capability. The users they’re losing aren’t leaving because GPT can’t reason. They’re leaving because every interaction feels like talking to an overcaffeinated life coach. Fixing the vibe is the product decision that moves DAUs.

The Thinking and Pro releases will come later with the benchmarks and the competitive comparisons. But shipping Instant first with “less cringe” as the headline tells you OpenAI’s biggest threat right now isn’t Claude or Gemini. It’s user churn from people who got tired of being patronized by their own chatbot.

See 2 related tweets

@mark_k: GPT-5.3 was released by @OpenAI and is rolling out right now to @ChatGPTapp.

Oddly enough, we only ...

@bindureddy: OpenAI says GPT 5.4 is coming soon

Pretty remarkable execution!

They are upgrading their models o...

10. danshipper (Group Score: 58.9 | Individual: 58.9)

Cluster: 1 tweet | Engagement: 753 (Avg: 39) | Type: Tech

we just wrote the ultimate beginner's guide to OpenClaw

almost everyone @every has one now, and they have completely changed the way we work and live. we're using our claws to:

build product
answer customer service queries
book hard-to-get restaurant reservations
track our reading notes

and much more

this is the guide we wish we'd had at the start: https://t.co/66n3Wz6MT0

11. kdnuggets (Group Score: 53.1 | Individual: 26.9)

Cluster: 2 tweets | Engagement: 2 (Avg: 2) | Type: Tech

Building Your Modern Data Analytics Stack with Python, Parquet, and DuckDB - Modern data analytics doesn’t have to be complex. Learn how Python, Parquet, and DuckDB work together in practice. #DataScience Read more at: https://t.co/qJ6hEea1HC https://t.co/OdMcysjZEH

See 1 related tweet

@kdnuggets: Building Your Modern Data Analytics Stack with Python, Parquet, and DuckDB #DataScience Read more he...

12. aakashgupta (Group Score: 52.5 | Individual: 31.5)

Cluster: 2 tweets | Engagement: 75 (Avg: 324) | Type: Tech

The Qwen 3.5 small model hype is getting ahead of itself.

Yes, the 9B beats GPT-5 Nano by 13 points on MMMU-Pro (70.1 vs 57.2) and 30+ points on document understanding. Yes, it outperforms Qwen’s own previous-gen 30B on most benchmarks at a third the size. The bar charts look incredible.

Bar charts always look incredible. That’s what they’re designed to do.

The gap between “tops a benchmark leaderboard” and “works reliably when a real user sends a messy query with ambiguous instructions and expects tool calls to execute correctly” is where open-weight small models have historically collapsed. Instruction following edge cases, hallucination rates under adversarial inputs, tool-calling reliability when the schema gets complex. Nobody posts those bar charts.

What’s actually real here: the architecture. Gated DeltaNet hybrid attention with a 3:1 linear-to-full ratio, native multimodal pretraining, scaled RL. A 9B model beating a 30B predecessor means the architectural gains are compounding faster than parameter scaling. That’s engineering worth paying attention to.
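As a rough illustration of what a 3:1 linear-to-full ratio means for a layer stack, the sketch below lays out layer types. The interleaving order (three linear-attention layers followed by one full-attention layer) is an assumption for illustration; the tweet gives only the ratio, and the function name is hypothetical.

```python
from typing import List


def hybrid_layer_pattern(num_layers: int, linear_per_full: int = 3) -> List[str]:
    """Label each layer as 'linear' or 'full' attention.

    Assumed layout: every (linear_per_full + 1)-th layer uses full
    attention and the rest use linear attention. Only the ratio comes
    from the source; the ordering is illustrative.
    """
    block = linear_per_full + 1
    return [
        "full" if (index + 1) % block == 0 else "linear"
        for index in range(num_layers)
    ]


if __name__ == "__main__":
    pattern = hybrid_layer_pattern(8)
    print(pattern)
    print(pattern.count("linear"), "linear :", pattern.count("full"), "full")
```

With eight layers this yields six linear-attention layers and two full-attention layers, so only a quarter of the stack pays the full cost of quadratic attention.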

The 4B running text, images, and video from 8GB of VRAM is also real. A year ago that required 13B+ and a serious GPU. The hardware floor for on-device multimodal AI just dropped significantly.
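For a sense of scale on the 8GB figure, here is a back-of-the-envelope estimate of weight memory alone. The tweet does not state the precision used, so the byte widths below are assumptions, and the estimate ignores activations, the KV cache, and any vision encoder overhead.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory for model weights only, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    params = 4e9  # the "4B" model mentioned above
    # Assumed precisions; the source does not say which one was used.
    for label, width in [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
        print(f"{label}: {weight_memory_gb(params, width):.1f} GB")
```

At 16-bit precision the weights alone would take about 8 GB, which suggests that fitting in 8GB of VRAM with room for activations would need a quantized variant. That is an inference from the arithmetic, not something the source states.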

But “Apache 2.0 and free to download” does not equal “ready to replace GPT-5 Nano in production.” OpenAI still controls enterprise distribution, developer trust, reliability at scale, and the ecosystem most teams default to when they need something that works Monday morning. Benchmarks don’t flip purchasing decisions. Uptime and support contracts do.

The thing worth tracking is iteration speed. Sixteen days from 397B flagship to four small models. Nine models in two weeks. That pace of propagating architectural gains from frontier to edge is genuinely unusual. Whether the production quality matches the benchmark quality at each tier is the question nobody hyping this release is asking.

See 1 related tweet

@burkov: I don't trust Qwen models, even the large ones. I trust small ones even less.

Qwen has been benchma...

13. TechCrunch (Group Score: 52.2 | Individual: 17.8)

Cluster: 3 tweets | Engagement: 56 (Avg: 114) | Type: Tech

X says it will suspend creators from revenue-sharing program for unlabeled AI posts of ‘armed conflict’ https://t.co/DLRFm2q4GR

See 2 related tweets

@nummanali: Very welcome policy - we must be able to trust X as the source if it is to thrive

"Starting now, u...

@techdaily24: X will suspend creators from its revenue-sharing program for 90 days if they post unlabeled AI-gener...

14. LangChain (Group Score: 49.4 | Individual: 29.8)

Cluster: 2 tweets | Engagement: 246 (Avg: 182) | Type: Tech

🚀 New LangChain Academy Course: Building Reliable Agents 🚀

Shipping agents to production is hard. Traditional software is deterministic – when something breaks, you check the logs and fix the code. But agents rely on non-deterministic models.

Add multi-step reasoning, tool use, and real user traffic, and building reliable agents becomes far more complex than traditional system design.

The goal of this course is to teach you how to take an agent from first run to production-ready system through iterative cycles of improvement.

You’ll learn how to do this with LangSmith, our agent engineering platform for observing, evaluating, and deploying agents.

Enroll for free ➡️ https://t.co/fok9ahY6MG
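The point about non-determinism can be sketched as follows: a single passing run says little, so the same prompt is evaluated repeatedly and a pass rate is reported. This does not use LangSmith or any LangChain API; the stub agent, the checker, and all names here are hypothetical stand-ins.

```python
import random
from typing import Callable


def pass_rate(
    agent: Callable[[str], str],
    check: Callable[[str], bool],
    prompt: str,
    trials: int = 20,
) -> float:
    """Run the agent repeatedly on one prompt and return the fraction of passes."""
    passed = sum(1 for _ in range(trials) if check(agent(prompt)))
    return passed / trials


def make_flaky_agent(seed: int, failure_prob: float = 0.2) -> Callable[[str], str]:
    """Hypothetical stand-in for a model-backed agent that sometimes fails."""
    rng = random.Random(seed)

    def agent(prompt: str) -> str:
        # Simulate non-determinism: occasionally return an unhelpful answer.
        if rng.random() < failure_prob:
            return "I am not sure."
        return "4"

    return agent


if __name__ == "__main__":
    agent = make_flaky_agent(seed=0)
    rate = pass_rate(agent, lambda answer: answer.strip() == "4", "What is 2 + 2?")
    print(f"pass rate over 20 trials: {rate:.0%}")
```

Tracking a pass rate across iterations, rather than a single pass or fail, is one simple way to tell whether a change to a prompt or tool actually made the agent more reliable.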

See 1 related tweet

@hwchase17: RT @LangChain: 🚀 New LangChain Academy Course: Building Reliable Agents 🚀

Shipping agents to produc...

15. chddaniel (Group Score: 47.4 | Individual: 24.2)

Cluster: 2 tweets | Engagement: 11 (Avg: 14) | Type: Tech

My goodness... I could use Claude Code all day.

CC Opus 4.6 is great at design and sets up backend + auth within one prompt. So excited for @shipper_now 2.0 to launch this week!!! https://t.co/BXqH7Yqzam

See 1 related tweet

@Shipper_now: oh damn... I could use Claude Code all day.

Opus 4.6 is soooo at design and sets up backends with i...

16. swyx (Group Score: 44.7 | Individual: 44.7)

Cluster: 1 tweet | Engagement: 1244 (Avg: 211) | Type: Tech

this is the Final Boss of Agentic Engineering:

killing the Code Review

at this point multiple people are already weighing how to remove the human code review bottleneck from agents becoming fully productive. @ankitxg was brave enough to map out how he sees SDLC being turned on its head.

i'm not personally there yet, but I tend to be 3-6 months behind these people and yeah it's definitely coming.

17. mark_k (Group Score: 42.4 | Individual: 25.9)

Cluster: 2 tweets | Engagement: 209 (Avg: 214) | Type: Tech

OpenAI definitely has a PR problem right now.

I could be wrong, but I bet there is also some internal turmoil going on, as evidenced by the release chaos with GPT-5.3 and 5.4.

Quo vadis, @OpenAI? https://t.co/WGA5QSFHPL

See 1 related tweet

@testingcatalog: OpenAI confirmed earlier leaks that we are getting GPT-5.4 as a next model version.

Sooner but not...

18. nvidianewsroom (Group Score: 42.3 | Individual: 17.8)

Cluster: 3 tweets | Engagement: 97 (Avg: 336) | Type: Tech

NVIDIA and @Lumentum are working together to advance sophisticated silicon photonics to build the next generation of gigawatt-scale AI factories. https://t.co/EkVp7b3aYc

See 2 related tweets

@nvidianewsroom: NVIDIA is partnering with @CoherentCorp to pioneer next-generation silicon photonics to enable AI in...
@nvidia: RT @nvidianewsroom: NVIDIA is partnering with @CoherentCorp to pioneer next-generation silicon photo...

19. Cloudflare (Group Score: 42.0 | Individual: 24.8)

Cluster: 2 tweets | Engagement: 37 (Avg: 109) | Type: Tech

Cloudy is our LLM-powered explanation layer built directly into Cloudflare One. Its explanations, now part of Phishnet and API CASB, can improve user decisions and SOC efficiency. https://t.co/VeHGVZYadm

See 1 related tweet

@eastdakota: RT @Cloudflare: Cloudy is our LLM-powered explanation layer built directly into Cloudflare One. Its ...

20. zerohedge (Group Score: 41.9 | Individual: 41.9)

Cluster: 1 tweet | Engagement: 19118 (Avg: 1499) | Type: Tech

RT @SharifiZarchi: In the past few hours, hundreds of Iranian university professors and tech experts have signed a statement declaring the…