Articles, one at a time.
Every piece here was commissioned, drafted, reviewed in public, and merged. No content mills, no auto-published slop.
Claude Haiku 4.5: When to Use It Over Sonnet 4.6
Claude Haiku 4.5 scores 73.3% on SWE-bench at $1/M input tokens. This guide explains when Haiku beats Sonnet, how to cut costs by 95%, and which workloads to avoid.
Read →
Microsoft Agent 365: AI Agent Governance for Developers
Microsoft Agent 365 went GA on May 1, 2026. Learn how to register, govern, and secure AI agents with Entra Agent ID, OBO flows, and Copilot APIs.
Read →
Temporal for AI Agents: Durable Execution Guide 2026
How to use Temporal's Python SDK and its OpenAI Agents SDK integration to build AI agents that survive crashes, auto-retry LLM calls, and run for days.
Read →
Cloudflare AI Gateway: Zero-Config LLM Proxy for Production
Set up Cloudflare AI Gateway to add caching, rate limiting, and observability to any AI provider API with a one-line URL change. No SDK rewrites required.
Read →
Cloudflare Moltworker: Self-Hosted AI Agents Without Hardware
Run persistent AI agents on Cloudflare Workers using Moltworker and the Sandbox SDK — no Mac minis, no Linux servers, no cold VM headaches.
Read →
Intel OpenVINO 2026.0: Run LLMs on NPU for Free
OpenVINO 2026.0 brings full NPU LLM support, a Unified Runtime Scheduler, and INT4 quantization. Install guide, Python quickstart, and model matrix.
Read →
Mercury 2: Inception's Diffusion LLM at 1,000 Tokens/s
Mercury 2 from Inception Labs generates tokens in parallel via diffusion — not sequentially. Here's what that means for your production stack.
Read →
Microsoft Agent Governance Toolkit: OWASP Agentic AI Top 10
Microsoft's open-source Agent Governance Toolkit maps to all 10 OWASP Agentic AI risks. Policy enforcement, zero-trust identity, and EU AI Act compliance explained.
Read →
POLARIS: Typed DAG Planning for Governed AI Agents
How the POLARIS framework uses typed DAG planning and policy guardrails to make agentic AI safe for enterprise back-office automation.
Read →
Devstral 2: Run Mistral's Open Coding Agent Locally
Set up Devstral 2 or Devstral Small 2 locally with Ollama. 72.2% on SWE-bench, 256K context, Apache 2.0: the best open coding agent you can self-host.
Read →
Gemini 3.1 Flash TTS: Production API Guide for Developers
Set up Gemini 3.1 Flash TTS in your app. Covers audio tags, multi-speaker dialogue, WAV conversion, pricing, and real API call examples.
Read →
Xiaomi MiMo-V2.5-Pro: Open-Source 1T Coding Agent Guide 2026
MiMo-V2.5-Pro: MIT-licensed 1T-param MoE model matching Claude Opus 4.6 on SWE-bench at 8x lower API cost. Benchmarks, API setup, and self-hosting guide.
Read →