AI & Tech Developments - Apr 02

- 08:16 — Updated CLAUDE.md repo with new features including spec-based development and self-improvement loop. @iamfakeguru
- 08:26 — A $10 billion AI startup faced a security breach due to developers handing production credentials to a chatbot. @aakashgupta
- 15:13 — Progress on the Hermes agent with a focus on auditing and verifying bugs. @gkisokay
- 15:47 — Release of Trinity-Large-Thinking on the Arcee API with open weights. @arcee_ai
- 17:18 — Proposal for integrating efficient KV cache compaction into pretraining. @part_harry_
- 22:17 — Demo of @PrismML’s Bonsai 8B model showing significant memory efficiency. @AnythingLLM
- 22:56 — New CS2 update includes animations for players pulling out knives. @Ozzny_CS2
📱 Source Tweets
The env configuration I currently use with Claude Code
— @discountifu
The most complex/important historical question is why White men invented everything. I offer a new explanation in Greatness and Ruin (2025). Toilets: 1596 Shower: 1810s Syllogistic logic: 300s BC Printing press: 1440 Electricity: 1750s Calculus: 1680s Optical Lenses: 1200s
— @dr_duchesne
Pretraining is data-inefficient. This is entirely a consequence of the fact that we throw away the KV cache after every forward-backward step! If we can integrate efficient KV cache compaction into pretraining, we will unlock human level data efficiency. Neural KV cache
— @part_harry_
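That tweet sketches the idea only in prose, so here is a minimal, hypothetical illustration of what compacting a KV cache could look like in PyTorch: recent positions are kept as-is and older positions are mean-pooled into fewer slots. The function, pooling rule, and shapes are assumptions for illustration, not @part_harry_'s method.

```python
import torch

def compact_kv_cache(keys, values, keep_recent=256, pool_size=4):
    # keys, values: [batch, heads, seq_len, head_dim]
    seq_len = keys.size(2)
    if seq_len <= keep_recent:
        return keys, values  # nothing old enough to compact yet

    old_k, new_k = keys[:, :, :-keep_recent], keys[:, :, -keep_recent:]
    old_v, new_v = values[:, :, :-keep_recent], values[:, :, -keep_recent:]

    # Trim so the old segment divides evenly into pooling groups.
    usable = (old_k.size(2) // pool_size) * pool_size
    if usable == 0:
        return keys, values
    old_k, old_v = old_k[:, :, :usable], old_v[:, :, :usable]

    b, h, _, d = old_k.shape
    # Mean-pool groups of `pool_size` old positions into single slots.
    pooled_k = old_k.reshape(b, h, -1, pool_size, d).mean(dim=3)
    pooled_v = old_v.reshape(b, h, -1, pool_size, d).mean(dim=3)

    return (torch.cat([pooled_k, new_k], dim=2),
            torch.cat([pooled_v, new_v], dim=2))
```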
n8n shipped native MCP support and nobody's connecting what this actually means... your AI agents can now CREATE and MODIFY n8n automations programmatically
— @EXM7777
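For background, MCP tool invocations are JSON-RPC 2.0 messages, so an agent driving n8n through MCP would ultimately be emitting requests shaped roughly like the sketch below. The tool name create_workflow and its arguments are invented for illustration; the real n8n MCP tool surface may differ.

```python
import json

# Rough shape of an MCP "tools/call" request an agent might emit.
# The tool name and arguments are hypothetical, not n8n's actual schema.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "create_workflow",          # hypothetical n8n MCP tool
        "arguments": {
            "name": "notify-on-new-issue",  # illustrative workflow
            "nodes": [
                {"type": "webhook", "parameters": {"path": "new-issue"}},
                {"type": "slack", "parameters": {"channel": "#alerts"}},
            ],
        },
    },
}

print(json.dumps(request, indent=2))  # delivered over the MCP transport (stdio or HTTP)
```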
GLM 5V Turbo just hit #5 on BridgeBench SpeedBench. 221.2 tokens per second. http://Z.ai's multimodal agent model is faster than Gemini 3.1 Pro, Claude Sonnet 4.6, Claude Opus 4.6, and GPT 5.4.
— @bridgebench
BREAKING: Someone just dropped the most advanced Steganography Platform EVER!! http://STE.GG is an open-source toolkit that hides secrets inside ANYTHING! images, audio, text, PDFs, network packets, ZIP archives, and even emojis
— @elder_plinius
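The thread doesn't explain how STE.GG encodes data, so as generic background here is the textbook least-significant-bit approach for images, sketched with Pillow. This illustrates steganography in general, not STE.GG's implementation.

```python
from PIL import Image

def hide_message(cover_path, stego_path, message):
    """Hide a UTF-8 message in the least significant bit of the red channel."""
    img = Image.open(cover_path).convert("RGB")
    bits = "".join(f"{byte:08b}" for byte in message.encode()) + "00000000"  # NUL terminator
    pixels = list(img.getdata())
    if len(bits) > len(pixels):
        raise ValueError("message too long for this image")

    out = []
    for i, (r, g, b) in enumerate(pixels):
        if i < len(bits):
            r = (r & ~1) | int(bits[i])  # overwrite the red channel's LSB
        out.append((r, g, b))

    stego = Image.new("RGB", img.size)
    stego.putdata(out)
    stego.save(stego_path)  # keep a lossless format such as PNG

hide_message("cover.png", "stego.png", "meet at dawn")
```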
Claude Code 2.1.90 has been released. 19 CLI changes. Highlights:
• Added /powerup interactive lessons with animated demos to speed hands-on Claude Code onboarding
• Auto mode respects explicit boundaries like 'don't push' or 'wait for X before Y', avoiding unintended actions
— @ClaudeCodeLog
MLX + TurboQuant + Apple's M5 Max: some more details for you on the @Prince_Canuma MLX port of TurboQuant running on Apple's M5 Max.
— @jtdavies
Running Qwen3.5-122B-A10B at 84 tok/s on a 4090+3090 using hybrid K q8_0 and V TurboQuant turbo3. Some good-looking results!!
— @basecampbernie
More progress with Kimi-K2.5 (1T-param model) on M5 Max with 128 GB Flash SSD, streaming. Now reaching 6 t/s after fusing Flash-SSD kernels.
— @anemll
I updated the CLAUDE.md repo based on what I found digging deeper into the source + patterns from Anthropic's team. Added: spec-based development, sub-agent execution, prompt cache, and a self-improvement loop that makes the agent stop repeating mistakes. Breakdown below.
— @iamfakeguru
Decided to rent 8x H100 to train qwen3-coder-next with an opus4.6 reasoning dataset (the same used for the qwen3.5 distillation). Let's see how it goes!
— @CardilloSamuel
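No training code is shown in the tweet, so here is a minimal, generic supervised fine-tuning sketch with Hugging Face transformers over a JSONL file of reasoning traces. The model id, data path, and hyperparameters are placeholders, not @CardilloSamuel's actual setup.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

MODEL_ID = "qwen3-coder-next"        # placeholder id; substitute the real checkpoint
DATA_PATH = "opus_reasoning.jsonl"   # placeholder: one {"text": ...} record per example

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

dataset = load_dataset("json", data_files=DATA_PATH, split="train")
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=4096),
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="sft-out",
        per_device_train_batch_size=1,   # 8x H100 would shard this via torchrun/DeepSpeed
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=1e-5,
        bf16=True,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```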
This was the result of the first self-improvement run on my Hermes agent. After migrating all my workflows to Hermes, the last step was to shift OpenClaw's focus to audit and verify bugs with my Hermes Supervisor agent. Since this migration required code refactoring, I'm not
— @gkisokay
Today we're releasing Trinity-Large-Thinking. Available now on the Arcee API, with open weights on Hugging Face under Apache 2.0. We built it for developers and enterprises that want models they can inspect, post-train, host, distill, and own.
— @arcee_ai
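Since the weights are said to be open on Hugging Face under Apache 2.0, the standard transformers loading pattern should apply; the repo id below is a guess for illustration and should be checked against the actual Arcee listing.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "arcee-ai/Trinity-Large-Thinking"  # guessed repo id; verify on the Arcee org page

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "In one sentence, what does Apache 2.0 allow?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```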
Excited to share what we have been working on recently! This is our first step towards truly infinite context windows. Summarization-based methods force information through a token bottleneck. Gradient-based approaches like Cartridges require significant inference-time compute.
— @part_harry_
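To make the "token bottleneck" point concrete, here is a toy sketch of summarization-based context compression: however long the history grows, it is squeezed into a fixed summary budget, so anything beyond that budget is necessarily lost. The summarize callable is hypothetical.

```python
def compress_context(chunks, summarize, budget_tokens=512):
    # `summarize(text, max_tokens)` is a hypothetical LLM call.
    # The whole history must fit in `budget_tokens`, regardless of how
    # many chunks arrive: that fixed budget is the token bottleneck.
    per_chunk = max(1, budget_tokens // len(chunks))
    return "\n".join(summarize(chunk, max_tokens=per_chunk) for chunk in chunks)
```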
Today, we demoed @PrismML's Bonsai 8B model in @AnythingLLM desktop to see if the 1-bit model arch is the revolution local AI needs. Turns out, this is not an April Fools' joke. A 1.1GB memory footprint for a whole 8B model with high accuracy.
— @AnythingLLM
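The 1.1 GB figure lines up with back-of-the-envelope arithmetic for a roughly 1-bit-per-weight format; the breakdown below is an assumption for illustration, not PrismML's published numbers.

```python
params = 8e9                 # advertised parameter count
bits_per_weight = 1.0        # idealized 1-bit quantization
weights_gb = params * bits_per_weight / 8 / 1e9
print(f"weights alone: {weights_gb:.2f} GB")  # ~1.0 GB
# The remaining ~0.1 GB in the demo would plausibly be embeddings,
# per-group scales, and runtime buffers kept at higher precision.
```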
NEW CS2 Update: You can see animations when other players pull out knives
— @Ozzny_CS2