AI & Tech Developments - Apr 02

- 08:16 — Updated CLAUDE.md repo with new features including spec-based development and self-improvement loop. @iamfakeguru
- 08:26 — A $10 billion AI startup faced a security breach due to developers handing production credentials to a chatbot. @aakashgupta
- 15:13 — Progress on the Hermes agent with a focus on auditing and verifying bugs. @gkisokay
- 15:47 — Release of Trinity-Large-Thinking on the Arcee API with open weights. @arcee_ai
- 17:18 — Proposal for integrating efficient KV cache compaction into pretraining. @part_harry_
- 22:17 — Demo of @PrismML’s Bonsai 8B model showing significant memory efficiency. @AnythingLLM
- 22:56 — New CS2 update includes animations for players pulling out knives. @Ozzny_CS2
📱 Source Tweets
The env configuration I currently use with Claude Code
— @discountifu
The most complex/important historical question is why White men invented everything. I offer a new explanation in Greatness and Ruin (2025). Toilets: 1596 Shower: 1810s Syllogistic logic: 300s BC Printing press: 1440 Electricity: 1750s Calculus: 1680s Optical Lenses: 1200s
— @dr_duchesne
Pretraining is data-inefficient. This is entirely a consequence of the fact that we throw away the KV cache after every forward-backward step! If we can integrate efficient KV cache compaction into pretraining, we will unlock human level data efficiency. Neural KV cache
— @part_harry_
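That tweet sketches the idea only in prose, so here is a minimal, hypothetical illustration of what compacting a KV cache could look like in PyTorch: recent positions are kept as-is and older positions are mean-pooled into fewer slots. The function, pooling rule, and shapes are assumptions for illustration, not @part_harry_'s method.

```python
import torch

def compact_kv_cache(keys, values, keep_recent=256, pool_size=4):
    # keys, values: [batch, heads, seq_len, head_dim]
    seq_len = keys.size(2)
    if seq_len <= keep_recent:
        return keys, values  # nothing old enough to compact yet

    old_k, new_k = keys[:, :, :-keep_recent], keys[:, :, -keep_recent:]
    old_v, new_v = values[:, :, :-keep_recent], values[:, :, -keep_recent:]

    # Trim so the old segment divides evenly into pooling groups.
    usable = (old_k.size(2) // pool_size) * pool_size
    if usable == 0:
        return keys, values
    old_k, old_v = old_k[:, :, :usable], old_v[:, :, :usable]

    b, h, _, d = old_k.shape
    # Mean-pool groups of `pool_size` old positions into single slots.
    pooled_k = old_k.reshape(b, h, -1, pool_size, d).mean(dim=3)
    pooled_v = old_v.reshape(b, h, -1, pool_size, d).mean(dim=3)

    return (torch.cat([pooled_k, new_k], dim=2),
            torch.cat([pooled_v, new_v], dim=2))
```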
n8n shipped native MCP support and nobody's connecting what this actually means... your AI agents can now CREATE and MODIFY n8n automations programmatically
— @EXM7777
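For background, MCP tool invocations are JSON-RPC 2.0 messages, so an agent driving n8n through MCP would ultimately be emitting requests shaped roughly like the sketch below. The tool name create_workflow and its arguments are invented for illustration; the real n8n MCP tool surface may differ.

```python
import json

# Rough shape of an MCP "tools/call" request an agent might emit.
# The tool name and arguments are hypothetical, not n8n's actual schema.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "create_workflow",          # hypothetical n8n MCP tool
        "arguments": {
            "name": "notify-on-new-issue",  # illustrative workflow
            "nodes": [
                {"type": "webhook", "parameters": {"path": "new-issue"}},
                {"type": "slack", "parameters": {"channel": "#alerts"}},
            ],
        },
    },
}

print(json.dumps(request, indent=2))  # delivered over the MCP transport (stdio or HTTP)
```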
GLM 5V Turbo just hit #5 on BridgeBench SpeedBench. 221.2 tokens per second. http://Z.ai's multimodal agent model is faster than Gemini 3.1 Pro, Claude Sonnet 4.6, Claude Opus 4.6, and GPT 5.4.
— @bridgebench
BREAKING: Someone just dropped the most advanced Steganography Platform EVER!! http://STE.GG is an open-source toolkit that hides secrets inside ANYTHING! images, audio, text, PDFs, network packets, ZIP archives, and even emojis
— @elder_plinius
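The thread doesn't explain how STE.GG encodes data, so as generic background here is the textbook least-significant-bit approach for images, sketched with Pillow. This illustrates steganography in general, not STE.GG's implementation.

```python
from PIL import Image

def hide_message(cover_path, stego_path, message):
    """Hide a UTF-8 message in the least significant bit of the red channel."""
    img = Image.open(cover_path).convert("RGB")
    bits = "".join(f"{byte:08b}" for byte in message.encode()) + "00000000"  # NUL terminator
    pixels = list(img.getdata())
    if len(bits) > len(pixels):
        raise ValueError("message too long for this image")

    out = []
    for i, (r, g, b) in enumerate(pixels):
        if i < len(bits):
            r = (r & ~1) | int(bits[i])  # overwrite the red channel's LSB
        out.append((r, g, b))

    stego = Image.new("RGB", img.size)
    stego.putdata(out)
    stego.save(stego_path)  # keep a lossless format such as PNG

hide_message("cover.png", "stego.png", "meet at dawn")
```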
Claude Code 2.1.90 has been released. 19 CLI changes. Highlights:
• Added /powerup interactive lessons with animated demos to speed hands-on Claude Code onboarding
• Auto mode respects explicit boundaries like 'don't push' or 'wait for X before Y', avoiding unintended actions
— @ClaudeCodeLog
MLX + TurboQuant + Apple's M5 Max: some more details for you on the @Prince_Canuma MLX port of TurboQuant running on Apple's M5 Max.
— @jtdavies
Running Qwen3.5-122B-A10B at 84 tok/s on a 4090+3090 using hybrid K q8_0 and V TurboQuant turbo3. Some good-looking results!!
— @basecampbernie
More progress with Kimi-K2.5 (1T-param model) on M5 Max with 128 GB Flash SSD, streaming. Now reaching 6 t/s after fusing Flash-SSD kernels.
— @anemll
I updated the CLAUDE.md repo based on what I found digging deeper into the source + patterns from Anthropic's team. Added: spec-based development, sub-agent execution, prompt cache, and a self-improvement loop that makes the agent stop repeating mistakes. Breakdown below.
— @iamfakeguru
Decided to rent 8x H100 to train qwen3-coder-next with an opus4.6 reasoning dataset (the same used for the qwen3.5 distillation). Let's see how it goes!
— @CardilloSamuel
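No training code is shown in the tweet, so here is a minimal, generic supervised fine-tuning sketch with Hugging Face transformers over a JSONL file of reasoning traces. The model id, data path, and hyperparameters are placeholders, not @CardilloSamuel's actual setup.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

MODEL_ID = "qwen3-coder-next"        # placeholder id; substitute the real checkpoint
DATA_PATH = "opus_reasoning.jsonl"   # placeholder: one {"text": ...} record per example

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

dataset = load_dataset("json", data_files=DATA_PATH, split="train")
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=4096),
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="sft-out",
        per_device_train_batch_size=1,   # 8x H100 would shard this via torchrun/DeepSpeed
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=1e-5,
        bf16=True,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```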
This was the result of the first self-improvement run on my Hermes agent. After migrating all my workflows to Hermes, the last step was to shift OpenClaw's focus to audit and verify bugs with my Hermes Supervisor agent. Since this migration required code refactoring, I'm not
— @gkisokay
Today we're releasing Trinity-Large-Thinking. Available now on the Arcee API, with open weights on Hugging Face under Apache 2.0. We built it for developers and enterprises that want models they can inspect, post-train, host, distill, and own.
— @arcee_ai
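Since the weights are said to be open on Hugging Face under Apache 2.0, the standard transformers loading pattern should apply; the repo id below is a guess for illustration and should be checked against the actual Arcee listing.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "arcee-ai/Trinity-Large-Thinking"  # guessed repo id; verify on the Arcee org page

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "In one sentence, what does Apache 2.0 allow?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```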
Excited to share what we have been working on recently! This is our first step towards truly infinite context windows. Summarization-based methods force information through a token bottleneck. Gradient-based approaches like Cartridges require significant inference-time compute.
— @part_harry_
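To make the "token bottleneck" point concrete, here is a toy sketch of summarization-based context compression: however long the history grows, it is squeezed into a fixed summary budget, so anything beyond that budget is necessarily lost. The summarize callable is hypothetical.

```python
def compress_context(chunks, summarize, budget_tokens=512):
    # `summarize(text, max_tokens)` is a hypothetical LLM call.
    # The whole history must fit in `budget_tokens`, regardless of how
    # many chunks arrive: that fixed budget is the token bottleneck.
    per_chunk = max(1, budget_tokens // len(chunks))
    return "\n".join(summarize(chunk, max_tokens=per_chunk) for chunk in chunks)
```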
Today, we demoed @PrismML's Bonsai 8B model in @AnythingLLM desktop to see if the 1-bit model arch is the revolution local AI needs. Turns out, this is not an April Fools' joke. A 1.1GB memory footprint for a whole 8B model with high accuracy.
— @AnythingLLM
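The 1.1 GB figure lines up with back-of-the-envelope arithmetic for a roughly 1-bit-per-weight format; the breakdown below is an assumption for illustration, not PrismML's published numbers.

```python
params = 8e9                 # advertised parameter count
bits_per_weight = 1.0        # idealized 1-bit quantization
weights_gb = params * bits_per_weight / 8 / 1e9
print(f"weights alone: {weights_gb:.2f} GB")  # ~1.0 GB
# The remaining ~0.1 GB in the demo would plausibly be embeddings,
# per-group scales, and runtime buffers kept at higher precision.
```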
NEW CS2 Update: You can see animations when other players pull out knives
— @Ozzny_CS2