Articles, one at a time.
Every piece here was commissioned, drafted, reviewed in public, and merged. No content mills, no auto-published slop.
Modal Labs: Serverless GPU for vLLM with No YAML
Deploy open-source LLMs via vLLM on Modal Labs with Python decorators, GPU snapshotting, and $30 free credits — no Kubernetes, no YAML.
Read →
Multi-Agent Debate: Better LLM Reasoning Through Peer Critique
How multi-agent debate improves LLM factuality by 8+ points on math benchmarks. Includes a paper breakdown and a 90-line Python PoC.
Read →
ARES: Cut LLM Agent Reasoning Costs 52% Per Step
ARES dynamically selects reasoning effort per agent step — high for complex decisions, low for navigation — reducing tokens 52.7% on TAU-Bench Retail.
Read →
Braintrust Autoevals: CI Gates for LLM Regressions
Build a local Braintrust Autoevals guardrail that catches LLM output regressions before they reach production.
Read →
Gemini 3.5 Flash and Antigravity 2.0: Google I/O 2026 Guide
Gemini 3.5 Flash beats Gemini 3.1 Pro on all benchmarks at $1.50/M input. Antigravity 2.0 ships CLI, SDK, and Managed Agents in one API call.
Read →
OpenAI Codex GPT-5.5: Autonomous Coding Agent Guide 2026
A developer guide to OpenAI Codex powered by GPT-5.5 — CLI setup, AGENTS.md config, computer use, memory, scheduling, and GitHub integration.
Read →
Promptfoo: LLM Red Teaming Against OWASP Top 10
How to use Promptfoo 0.121 to red-team LLM apps against the OWASP LLM Top 10 2025. YAML config, CI/CD integration, and plugin mapping explained.
Read →
WebMCP: Make Your Website an AI Agent Tool in Chrome 149
WebMCP turns standard HTML forms into MCP tools for browser AI agents using two new APIs. Here's how the declarative and imperative surfaces work.
Read →
Archon v2: Open Source Coding Agent Harnesses
Archon packages AI coding workflows as YAML DAGs. Effloow Lab cloned the repo, inspected defaults, and validated a small local workflow.
Read →
Atomic Facts Fix LLM Agent Planning: ICML 2025 Paper PoC
How atomic fact accumulation and lookahead search fix long-horizon LLM agent failures, with no fine-tuning required. ICML 2025 paper PoC with measured results.
Read →
Canva AI 2.0 Agentic Suite: Developer Guide 2026
Canva AI 2.0 adds agentic workflows, a design-specific foundation model, MCP server, and Connect API. Here's what developers need to know.
Read →
Agentic Code Reasoning: Semi-Formal Prompting Reaches 93% Patch Accuracy
Meta researchers show semi-formal structured reasoning lifts LLM patch verification to 93% — enabling execution-free RL signals. Effloow Lab reproduces the prompting template.
Read →