Jun 28, 2026 · Shingo Nakamura · AI
Headroom sits between your AI agent and the model, compressing tool outputs, logs, RAG chunks, files and history before they cost you tokens — reversibly, and locally. What it is, how the router-plus-compressors design works, what its self-reported benchmarks actually show, and the honest costs.
claude-code · llm · tokens · context
Jun 27, 2026 · Shingo Nakamura · AI
Ponytail is a one-file 'lazy senior dev' skill that makes coding agents stop and pick the simplest solution that works — cutting code by ~54% (and tokens ~22%) on a real agentic benchmark. What it is, how the ladder works, what trusted reviewers found, and the honest caveats.
claude-code · llm · tokens · skills
Jun 27, 2026 · Shingo Nakamura · AI
Ralph is a brutally simple technique — loop the same prompt into a coding agent until the task is done. What it is, how the two main implementations (Anthropic's ralph-wiggum plugin and snarktank/ralph) actually differ, the anecdotal numbers behind the hype, and an honest critical review of where it breaks.
claude-code · llm · agents · autonomous
Jun 21, 2026 · Shingo Nakamura · AI, Python
LiteLLM is an open-source AI gateway that gives you a single OpenAI-format interface to 100+ LLM providers — as a Python SDK or a self-hosted proxy. What it is, how it works, two real use cases, how it compares, its performance, and its honest pros and cons.
litellm · gateway · llm · proxy
Jun 21, 2026 · Shingo Nakamura · AI
pi is a tiny, aggressively extensible terminal coding harness. What it is, how to install and use it, how it compares to opencode and Claude Code, and what the benchmarks actually say.
coding-agent · cli · llm · agents
Jun 20, 2026 · Shingo Nakamura · AI, Python
A practical look at LangChain — what it's for, how to install and use it, what you can build, whether you need to deploy it, how it stacks up against Google ADK and other agent frameworks, and its real downsides.
langchain · agents · llm · rag
Jun 2, 2026 · Shingo Nakamura · AI
A plain-English explainer of the AI harness — the code around a model that lets it use tools, take steps, and get work done. What it is, why it matters, how it works, and the two senses of the word (agent harness vs evaluation harness).
harness · agents · llm · tool-use
May 15, 2026 · Shingo Nakamura · AI
Andrej Karpathy's LLM Wiki pattern — let an LLM build and maintain a persistent, interlinked wiki from your sources, so knowledge compounds instead of being re-derived on every query. What it is, how it works, how it compares to RAG, and the token economics.
llm-wiki · rag · knowledge-management · agents
May 1, 2026 · Shingo Nakamura · AI
A Claude Code skill that makes the agent talk like a caveman — cutting output tokens by up to ~75% while keeping full technical accuracy. Why, how it works, how to install it, and real before/after numbers.
claude-code · llm · tokens · productivity