Open-Source

LLM Wire Format Benchmark: Which Format Can AI Actually Read and Write? 2026-06-06

23 comprehension runs across 10 models (Claude Opus/Sonnet/Haiku, GPT-5.5/5.4/5.4-mini, Gemini 2.5 Flash/Pro, Gemini 3.1 Pro, Gemini 3.5 Flash). Generation eval across 11 models and 3 providers (Anthropic, OpenAI, Google). GCF wins 22, ties 1, loses 0 on comprehension. GCF achieves 5/5 valid generation on every frontier model with zero prior training. TOON scores 0/5 on generation with Opus, GPT-5.4, GPT-5.4-mini, Gemini 3.1 Pro, and Gemini 3.1 Flash Lite. JSON breaks on input at 500 symbols.

We Ran TOON's Own Benchmark. GCF Won. 2026-06-03

We inserted GCF into TOON’s benchmark harness. Same datasets, same tokenizer, same methodology. GCF uses 34% fewer tokens on mixed-structure data, matches TOON on flat data, and achieves 100% LLM comprehension accuracy where JSON fails at 66.7%.

We Scanned 300 npm and PyPI Packages for Supply Chain Attacks Without Executing a Single Line of Code 2026-06-03

We indexed 300 popular packages with knowing’s code graph, computed isolation scores based on credential-access and process-spawning patterns, and achieved a 1.0% false positive rate across both the initial 200 and a held-out 100. No sandbox. No execution. No heuristics. Just graph structure.

14,000 Python Developers Installed My Go Binary via pip. Here's How. 2026-05-27

Your Go CLI tool is on GitHub Releases. 80% of developers will never find it there. Here’s how to put it on pip and npm with 50 lines of bash, getting a 12x download multiplier. Full technique with scripts, numbers, and the release pipeline that ties it together.

Your AI Agent's Code Search Hits 2% of the Time. We Benchmarked It. 2026-05-22

Rigorous benchmark of AI agent code retrieval: 107 tasks, 5 repos, 5 languages, 4 competitors. grep precision: 2%. GitNexus: 7.6%. knowing: 23% (11.5x better, p<0.0001). Plus: 193x faster indexing, 28x less RAM, 48x more token-efficient than Repomix. The first statistically validated comparison of code intelligence tools for AI agents.

The Code Intelligence Landscape: Context, Memory, and Proofs 2026-05-20

AI coding agents have a context problem. The tools solving it fall into four categories: context packers, code graphs, memory systems, and runtime observability. Each solves one piece. None versions the intelligence. None proves anything. None learns without poisoning itself over time. This article explores the landscape and argues that content-addressed code graphs with cryptographic proofs are the missing foundation.

What Git Did for Files, Applied to Code Relationships 2026-05-20

Git proved that content-addressing file contents gives you integrity, history, efficient equality, and distributed collaboration for free. The same architecture applied to code relationships gives you something new: versioned intelligence that you can diff, cache, prove, and trust over time.
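
As a rough illustration of the idea, the sketch below derives an identifier for a code relationship from its content, so that identical relationships always get the same ID and any change yields a different one. Everything here is an assumption for illustration: the `content_id` helper, the edge fields (`from`, `to`, `kind`), and the choice of SHA-256 over canonical JSON are hypothetical, and this is neither Git's object format nor the format any tool in these posts uses.

```python
import hashlib
import json


def content_id(record: dict) -> str:
    """Hash a canonical JSON encoding of the record (hypothetical helper)."""
    # sort_keys plus fixed separators make the encoding independent of key order.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical relationship records: "Handler calls Store.Get".
edge_a = {"from": "pkg.Handler", "to": "pkg.Store.Get", "kind": "calls"}
edge_b = {"kind": "calls", "to": "pkg.Store.Get", "from": "pkg.Handler"}  # same content, different key order
edge_c = {"from": "pkg.Handler", "to": "pkg.Store.Put", "kind": "calls"}  # different callee

id_a = content_id(edge_a)
id_b = content_id(edge_b)
id_c = content_id(edge_c)

print(id_a == id_b)  # equal content, equal ID
print(id_a == id_c)  # changed content, different ID
```

Comparing two IDs is then enough to tell whether two relationships are the same, which is the "efficient equality" property the summary mentions.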

We Measured It: LSP Saves AI Agents 5-34x Tokens vs Grep 2026-05-03

We built a reproducible experiment measuring how many tokens AI coding agents consume when navigating code with grep vs LSP. On HashiCorp Consul (319K lines), LSP uses 34x fewer tokens. On a TypeScript rename across 24 files: 1,441x fewer bytes. The experiment covers 4 codebases, 3 languages, and 13 tasks across 7 agent workflows.

We Tested 55 MCP Servers. Here's What Breaks. 2026-04-27

MCP servers are the tools AI agents rely on. We tested 55 of them with mcp-assert, found 20 bugs across 9 servers, and submitted fix PRs. Grafana and Ant Group merged ours. Three days after launch, Ant Group’s visualization team asked us to integrate mcp-assert into their CI. The most common failure: servers throw unhandled exceptions instead of returning isError, leaving agents unable to recover.
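
The failure mode above can be sketched in a few lines. This is a minimal illustration that does not use any MCP SDK: `call_tool_safely` and `divide` are hypothetical names, and the returned dictionary only mirrors the general shape of an MCP tool result (a `content` list plus an `isError` flag). The point is that the exception is caught and reported to the agent as a result instead of escaping the handler.

```python
def call_tool_safely(handler, arguments: dict) -> dict:
    """Run a tool handler and always return a result dict (hypothetical wrapper)."""
    try:
        value = handler(**arguments)
    except Exception as exc:
        # Report the failure as data so the calling agent can read it and recover.
        message = f"{type(exc).__name__}: {exc}"
        return {"content": [{"type": "text", "text": message}], "isError": True}
    return {"content": [{"type": "text", "text": str(value)}], "isError": False}


def divide(a: float, b: float) -> float:
    """Hypothetical tool that fails on b == 0."""
    return a / b


ok = call_tool_safely(divide, {"a": 6, "b": 3})
failed = call_tool_safely(divide, {"a": 1, "b": 0})

print(ok["isError"], ok["content"][0]["text"])
print(failed["isError"], failed["content"][0]["text"])
```

With a wrapper like this, a bad input produces a readable error result rather than a crashed request, so the agent can retry with different arguments.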

agent-lsp: Reliable Code Intelligence for AI Agents via MCP and LSP 2026-04-15

I needed AI agents to reliably rename symbols, find references, and check diagnostics without silent failures. The existing MCP-LSP tools were stateless, feature-poor, and untested. So I built agent-lsp: a persistent runtime with 50 tools, 20 provider-agnostic skills, speculative execution, and an audit trail for every AI-driven edit.

Branding a CLI Tool in 4 Days: Mascot, Screencasts, and Visual Identity with AI 2026-02-24

Most CLI tools ship with no visual identity beyond a help screen. Here’s how I used AI image generation to create Shelby, a consistent mascot with a locked-down spec, and built a complete brand system - poses, screencasts, color palette, terminal theme - for shelfctl in 4 days.

Artifact-Boundary Productization: Clean OSS/Commercial Separation 2026-01-28

The execution boundary determines everything: features that need the system alive belong in the platform (OSS). Features that analyze artifacts after shutdown become the product (commercial). A framework for clean OSS/commercial separation.

GPL & AGPL: Freedom Through Copyleft - Complete Guide to Viral Licensing 2026-01-10

Why copyleft licenses ‘infect’ derivative works, how GPL differs from permissive licenses, and when viral licensing protects community contributions from proprietary capture.

Apache License 2.0: When Patent Protection Matters - Complete Guide 2025-12-31

Why Apache 2.0 matters for patent-heavy projects, how it differs from MIT, and when explicit patent grants protect your users and contributors.

Why Choose the MIT License? A Comprehensive Guide to Open Source Licensing 2025-12-29

Why MIT became the most popular open-source license, when to choose it over GPL/Apache/BSD, and a decision framework for selecting the right license for your project.

Your README is a Landing Page, Not Your Documentation 2025-12-25

More features always lead to more sprawl. The longer it goes on, the harder it is to bring back under control. Here’s how to treat your README like a landing page - with hooks, not walls of text.