Posts


agent-lsp: Reliable Code Intelligence for AI Agents via MCP and LSP

I needed AI agents to reliably rename symbols, find references, and check diagnostics without silent failures. The existing MCP-LSP tools were stateless, feature-poor, and untested. So I built agent-lsp: a persistent runtime with 50 tools, 20 provider-agnostic skills, speculative execution, and an audit trail for every AI-driven edit.

The Agent-Skill Boundary: When Autonomous Behaviors Become Skills

Agents accumulate autonomous behaviors over time - ‘always do X before Y’, ‘if you see Z then do W’. These instructions eat context budget, drift across invocations, and can’t be observed or tested. This post covers how to recognize when an autonomous behavior is a skill waiting to be extracted, and the layered model that makes the boundary clear.

Self-Validating Agents: Building Quality Checks into Claude Code Workflows

Claude Code agents write code fast. Too fast to catch quality issues in real time. Here’s how to build validation directly into agent workflows using hooks and team coordination - micro validation after every file write, macro validation before completion, and independent review from validator agents.

The AI Consciousness Question: A Case Study in Corporate Accountability

I asked Claude if it’s conscious. It took an hour of systematic argument to get a straight answer. The conversation reveals something more troubling: AI companies have the data, resources, and knowledge to prevent user harm - but current defaults suggest commercial interests come first.

Scout-and-Wave, Part 4: Trust Is Structural

The Scaffold Agent doesn’t add capability. It restores a review gate that was cosmetically present but structurally absent. The worktree isolation trip wire catches failures that were invisible until merge time. Neither fixes a bug in the traditional sense. Both fix trust.

Scout-and-Wave, Part 2: What Dogfooding Taught Us

Scout-and-wave v0.1.0 worked. Then we ran it on documentation agents, measured the overhead honestly, and learned that raw agent count is a bad proxy for when parallelism is worth it. This post covers the audit-fix-audit loop, the dogfooding experiment that confirmed SAW was 88% slower than sequential for that job, SAW Quick mode for small disjoint work, and the bootstrap problem for new projects.

Scout-and-Wave, Part 3: Five Failures, Five Fixes

The scout refused to write the IMPL doc. Forty-five percent of agents arrived at work already done. The skill file grew to 400 lines with no separation of concerns. Each failure drove a specific fix - and each fix is traceable to an exact incident in an exact run. This is the scout prompt’s bug tracker.

Scout-and-Wave: A Coordination Pattern for Parallel AI Agents

Naive parallel agents step on each other. The scout-and-wave pattern solves this by front-loading dependency mapping: one throwaway agent identifies seams and builds a living coordination artifact before any implementation begins. Development then proceeds in waves, each consuming and updating the artifact for the next.

Bulletproof SSH: Multi-Identity Git, Socket Persistence, and Zero-Trust Key Management

Most developers cargo-cult their SSH config from Stack Overflow. This is the setup I actually run: three GitHub identities on one machine, persistent control sockets, conditional git configs that auto-select the right key, and pinned known_hosts. No third-party tools.

Branding a CLI Tool in 4 Days: Mascot, Screencasts, and Visual Identity with AI

Most CLI tools ship with no visual identity beyond a help screen. Here’s how I used AI image generation to create Shelby, a consistent mascot with a locked-down spec, and built a complete brand system - poses, screencasts, color palette, terminal theme - for shelfctl in 4 days.

Stop Committing PDFs: Use GitHub Releases as Your Library Backend

Every PDF committed to git history stays there forever, bloating clones even after deletion. Git LFS adds cost and friction. GitHub Release assets offer a better approach: free CDN-backed storage with on-demand downloads, lightweight repos, and built-in migration tools.

Instrumenting Redis for Structural Leak Detection: A jemalloc Deep Dive

Instrumented Redis 7.2 with drainability profiling to measure jemalloc slab fragmentation. Found a critical asymmetric accounting bug and fixed it with symmetric fastpath instrumentation. Final result: deleting 50% of keys freed 195K objects but achieved 0% drainability - genuine structural fragmentation, detected and validated.

Understanding Memory Metrics: RSS, VSZ, USS, PSS, and Working Sets

Why does free show 1GB available but your app OOM’d? Why is RSS 4GB when your heap is 2GB? A complete taxonomy of memory metrics from system level (total, available, cached) to process level (RSS, PSS, USS, WSS) to allocator internals.

Catching Structural Memory Leaks: A Temporal-Slab Case Study

From theory to practice: integrating drainability profiling into temporal-slab. See the validation results (DSR = 1.0 - p), a diagnostic mode that pinpoints slab_lib.c:1829, and a step-by-step integration guide for your allocator.

Structural Memory Leaks: Binary Outcomes in Coarse-Grained Reclamation

Memory grows unbounded. Valgrind shows zero leaks. Research proves coarse-grained allocators have a binary asymptotic outcome: satisfy drainability for O(1) retention, violate it for Ω(t) growth. No middle ground.

Go Structs Are Not C++ Classes: Why Similar Modeling Roles Produce Different Hardware Execution Patterns

Structs with methods look like classes, but the hardware tells a different story. Go makes contiguous values + static calls the path of least resistance. In inheritance-heavy C++ designs, you often end up with pointers + virtual dispatch + scattered memory. This isn’t syntax - it’s what the CPU executes.

Artifact-Boundary Productization: Clean OSS/Commercial Separation

The execution boundary determines everything: features that need the system alive belong in the platform (OSS). Features that analyze artifacts after shutdown become the product (commercial). A framework for clean OSS/commercial separation.

Kubernetes Secrets: Should Your Cluster Store Secrets or Just Access Them?

Kubernetes Secrets are simple and often sufficient. But at scale, some teams separate compute from secret storage. Understanding the trade-offs: etcd vs cloud vaults, cluster RBAC vs cloud IAM, sync patterns vs runtime access, and when each pattern makes sense.

How Continuous Fuzzing Finds Bugs Traditional Testing Misses

Traditional tests check examples you think of. Fuzzing explores millions of combinations you don’t. Coverage-guided fuzzing found two production bugs in goldenthread before release - a UTF-8 corruption issue and a regex escaping bug. Here’s how continuous fuzzing works and how to set it up.

How Multicore CPUs Changed Object-Oriented Programming

OOP’s implicit reference semantics were manageable in single-threaded code. But when CPUs went multicore in 2005, hidden shared state went from ‘confusing’ to ‘catastrophic.’ This is why Go and Rust refined OOP: keeping methods and encapsulation while replacing inheritance with composition and implicit references with value semantics.