Ai-Agents

We Measured It: LSP Saves AI Agents 5-34x Tokens vs Grep 2026-05-03

We built a reproducible experiment measuring how many tokens AI coding agents consume when navigating code with grep vs LSP. On HashiCorp Consul (319K lines), LSP uses 34x fewer tokens. On a TypeScript rename across 24 files: 1,441x fewer bytes. The experiment covers 4 codebases, 3 languages, 13 tasks covering 7 agent workflows.

We Tested 55 MCP Servers. Here's What Breaks. 2026-04-27

MCP servers are the tools AI agents rely on. We tested 55 of them with mcp-assert, found 20 bugs across 9 servers, and submitted fix PRs. Grafana and Ant Group merged ours. Three days after launch, Ant Group’s visualization team asked us to integrate mcp-assert into their CI. The most common failure: servers throw unhandled exceptions instead of returning isError, leaving agents unable to recover.

agent-lsp: Reliable Code Intelligence for AI Agents via MCP and LSP 2026-04-15

I needed AI agents to reliably rename symbols, find references, and check diagnostics without silent failures. The existing MCP-LSP tools were stateless, feature-poor, and untested. So I built agent-lsp: a persistent runtime with 50 tools, 20 provider-agnostic skills, speculative execution, and an audit trail for every AI-driven edit.

The Agent-Skill Boundary: When Autonomous Behaviors Become Skills 2026-03-29

Agents accumulate autonomous behaviors over time - ‘always do X before Y’, ‘if you see Z then do W’. These instructions eat context budget, drift across invocations, and can’t be observed or tested. How to recognize when an autonomous behavior is a skill waiting to be extracted, and the layered model that makes the boundary clear.

Self-Validating Agents: Building Quality Checks into Claude Code Workflows 2026-03-24

Claude Code agents write code fast. Too fast to catch quality issues in real-time. Here’s how to build validation directly into agent workflows using hooks and team coordination - micro validation after every file write, macro validation before completion, and independent review from validator agents.