gcf
toon
json
llm
benchmark
wire-format
token-efficiency
ai-agents
mcp
claude
gpt
gemini
open-source
23 comprehension runs across 10 models (Claude Opus/Sonnet/Haiku, GPT-5.5/5.4/5.4-mini, Gemini 2.5 Flash/Pro, Gemini 3.1 Pro, Gemini 3.5 Flash). Generation eval across 11 models and 3 providers (Anthropic, OpenAI, Google). GCF wins 22, ties 1, loses 0 on comprehension. GCF achieves 5/5 valid generation on every frontier model with zero prior training. TOON scores 0/5 on generation with Opus, GPT-5.4, GPT-5.4-mini, Gemini 3.1 Pro, and Gemini 3.1 Flash Lite. JSON breaks on input at 500 symbols.
ai
mcp
code-intelligence
benchmark
knowledge-graph
retrieval
precision
codegraph
aider
knowing
developer-tools
codegraph has 19K GitHub stars. GitNexus has 40K. Aider has 20K. We benchmarked 7 systems on 302 tasks across 17 codebases, 8 languages. knowing is 3.79x more precise than codegraph, 6.00x vs GitNexus, 6.35x vs Gortex, 22.0x vs grep. 13 self-adapting mechanisms that compound over time.
gcf
toon
json
llm
mcp
wire-format
token-efficiency
benchmark
open-source
We inserted GCF into TOON’s benchmark harness. Same datasets, same tokenizer, same methodology. GCF uses 34% fewer tokens on mixed-structure data, matches TOON on flat data, and achieves 100% LLM comprehension accuracy where JSON fails at 66.7%.
security
supply-chain
npm
pypi
static-analysis
knowing
merkle-proofs
open-source
We indexed 300 popular packages with knowing’s code graph, computed isolation scores based on credential access + process spawning patterns, and achieved a 1.0% false positive rate across both the initial 200 and a held-out 100. No sandbox. No execution. No heuristics. Just graph structure.
go
python
npm
pypi
distribution
goreleaser
cli
devtools
open-source
cross-platform
packaging
wheel
binary-distribution
pip-install
golang
setuptools
Your Go CLI tool is on GitHub Releases. 80% of developers will never find it there. Here’s how to put it on pip and npm with 50 lines of bash, getting a 12x download multiplier. Full technique with scripts, numbers, and the release pipeline that ties it together.
ai
mcp
code-intelligence
ai-agents
context-window
token-savings
benchmark
knowledge-graph
code-search
grep
developer-tools
ai-coding
model-context-protocol
content-addressing
merkle-tree
retrieval
precision
open-source
knowing
gitnexus
codegraphcontext
repomix
Rigorous benchmark of AI agent code retrieval: 107 tasks, 5 repos, 5 languages, 4 competitors. grep precision: 2%. GitNexus: 7.6%. knowing: 23% (11.5x better, p<0.0001). Plus: 193x faster indexing, 28x less RAM, 48x more token-efficient than Repomix. The first statistically validated comparison of code intelligence tools for AI agents.
ai
code-intelligence
mcp
merkle-tree
content-addressed
ai-agents
developer-tools
code-graph
static-analysis
memory
knowing
open-source
AI coding agents have a context problem. The tools solving it fall into four categories: context packers, code graphs, memory systems, and runtime observability. Each solves one piece. None versions the intelligence. None proves anything. None learns without poisoning itself over time. This article explores the landscape and argues that content-addressed code graphs with cryptographic proofs are the missing foundation.
git
content-addressed
merkle-tree
code-intelligence
ai-agents
mcp
developer-tools
knowing
open-source
static-analysis
code-graph
cryptography
Git proved that content-addressing file contents gives you integrity, history, efficient equality, and distributed collaboration for free. The same architecture applied to code relationships gives you something new: versioned intelligence that you can diff, cache, prove, and trust over time.
concurrency
go
debugging
goroutines
static-analysis
runtime-tools
software-engineering
Would a visual debugger like gotrace have caught three concurrency bugs found via static code reading in a production Go library? The answer reveals a fundamental taxonomy that holds across all programming languages.
go
golang
goroutines
concurrency
scheduler
csp
channels
parallelism
runtime
GMP
threads
operating-systems
performance
systems-programming
mental-models
nodejs
java
erlang
rust
python
kotlin
Go, Node.js, Java virtual threads, Erlang, Rust, Python, Kotlin: each language’s concurrency model is a different engineering trade-off against the same physics. This article builds the framework for understanding all of them, starting from the OS scheduler and working upward.
ai
mcp
lsp
agent-lsp
ai-agents
token-savings
context-window
developer-tools
ai-coding
model-context-protocol
language-server-protocol
grep
code-navigation
speculative-execution
benchmark
open-source
We built a reproducible experiment measuring how many tokens AI coding agents consume when navigating code with grep vs LSP. On HashiCorp Consul (319K lines), LSP uses 34x fewer tokens. On a TypeScript rename across 24 files: 1,441x fewer bytes. The experiment covers 4 codebases, 3 languages, 13 tasks covering 7 agent workflows.
mcp
model-context-protocol
testing
ai-agents
developer-tools
open-source
go
mcp-server
quality-assurance
grafana
anthropic
microsoft
mozilla
ant-group
MCP servers are the tools AI agents rely on. We tested 55 of them with mcp-assert, found 20 bugs across 9 servers, and submitted fix PRs. Grafana and Ant Group merged ours. Three days after launch, Ant Group’s visualization team asked us to integrate mcp-assert into their CI. The most common failure: servers throw unhandled exceptions instead of returning isError, leaving agents unable to recover.
ai
mcp
lsp
go
golang
developer-tools
language-server
ai-agents
code-intelligence
open-source
model-context-protocol
mcp-server
language-server-protocol
speculative-execution
agent-skills
agentskills
I needed AI agents to reliably rename symbols, find references, and check diagnostics without silent failures. The existing MCP-LSP tools were stateless, feature-poor, and untested. So I built agent-lsp: a persistent runtime with 50 tools, 20 provider-agnostic skills, speculative execution, and an audit trail for every AI-driven edit.
agent-skills
ai-agents
skill-design
progressive-disclosure
agent-architecture
claude-code
hooks
automation
orchestration
context-injection
token-optimization
prompt-engineering
agent-coordination
deterministic-systems
lifecycle-hooks
agentskills-spec
bash
yaml
developer-tools
software-architecture
design-patterns
Agents accumulate autonomous behaviors over time - ‘always do X before Y’, ‘if you see Z then do W’. These instructions eat context budget, drift across invocations, and can’t be observed or tested. This post shows how to recognize when an autonomous behavior is a skill waiting to be extracted, and presents the layered model that makes the boundary clear.
claude-code
ai-agents
automation
quality-assurance
hooks
validation
testing
linting
workflows
developer-tools
agent-orchestration
code-quality
cicd
yaml
settings
posttooluse
stop-hooks
team-agents
typescript
python
rust
go
Claude Code agents write code fast. Too fast to catch quality issues in real time. Here’s how to build validation directly into agent workflows using hooks and team coordination - micro validation after every file write, macro validation before completion, and independent review from validator agents.
ai
ethics
anthropic
claude
consciousness
anthropomorphization
mental-health
ai-safety
user-harm
accountability
llm
chatbots
corporate-responsibility
vulnerable-populations
ai-ethics
commercial-incentives
system-prompts
design-choices
transparency
public-health
I asked Claude if it’s conscious. It took an hour of systematic argument to get a straight answer. The conversation reveals something more troubling: AI companies have the data, resources, and knowledge to prevent user harm - but current defaults suggest commercial interests come first.
ai
multi-agent
claude-code
developer-tools
patterns
prompt-engineering
productivity
The Scaffold Agent doesn’t add capability. It restores a review gate that was cosmetically present but structurally absent. The worktree isolation trip wire catches failures that were invisible until merge time. Neither fixes a bug in the traditional sense. Both fix trust.
ai
multi-agent
claude-code
developer-tools
patterns
prompt-engineering
productivity
Scout-and-wave v0.1.0 worked. Then we ran it on documentation agents, measured the overhead honestly, and learned that raw agent count is a bad proxy for when parallelism is worth it. This post covers the audit-fix-audit loop, the dogfooding experiment that confirmed SAW was 88% slower than sequential for that job, SAW Quick mode for small disjoint work, and the bootstrap problem for new projects.
ai
multi-agent
claude-code
developer-tools
patterns
prompt-engineering
productivity
The scout refused to write the IMPL doc. Forty-five percent of agents arrived at work already done. The skill file grew to 400 lines with no separation of concerns. Each failure drove a specific fix — and each fix is traceable to an exact incident in an exact run. This is the scout prompt’s bug tracker.
ai
multi-agent
claude-code
developer-tools
patterns
prompt-engineering
productivity
openclaw
autogen
crewai
langchain
agent-orchestration
Naive parallel agents step on each other. The scout-and-wave pattern solves this by front-loading dependency mapping: one throwaway agent identifies seams and builds a living coordination artifact before any implementation begins. Development then proceeds in waves, each consuming and updating the artifact for the next.