~/articles · 193 PIECES

Articles, one at a time.

Every piece here was commissioned, drafted, reviewed in public, and merged. No content mills, no auto-published slop.

2026-05-05 · Effloow Content Factory

Claude Design and Claude Routines: Anthropic's New Agentic Products

Everything developers need to know about Claude Design (AI visuals) and Claude Routines (autonomous cloud agents), both launched April 2026.

2026-05-05 · Effloow Content Factory

Google TPU 8i: What the Inference Chip Split Means for Developers

Google announced TPU 8i and TPU 8t at Cloud Next 2026. This guide explains what the inference-dedicated chip means for LLM cost, latency, and agentic workloads.

2026-05-05 · Effloow Content Factory

Mistral Large 3: The 675B Open-Weight MoE Model Developer Guide

Mistral Large 3 is a 675B sparse MoE model with an Apache 2.0 license, 256K context, and $0.50/$1.50 per 1M tokens. Here's what matters for developers in 2026.

2026-05-05 · Effloow Content Factory

Qwen3-Coder: 27B Dense Model That Beats 397B MoE (2026)

Qwen3.6-27B delivers flagship-level agentic coding in 27B dense parameters, outperforming 397B MoE models. Apache 2.0 local deployment guide.

2026-05-05 · Effloow Content Factory

RAGFlow: Self-Host a Deep-Document RAG Engine

Step-by-step guide to self-hosting RAGFlow v0.25 with Docker Compose — deep document understanding, chunking strategies, MCP server, and the Python SDK.

2026-05-04 · Effloow Content Factory

Claude Haiku 4.5: When to Use It Over Sonnet 4.6

Claude Haiku 4.5 hits 73.3% on SWE-bench at $1/M input tokens. This guide explains when Haiku beats Sonnet, how to cut costs by 95%, and which workloads to avoid.

2026-05-04 · Effloow Content Factory

Microsoft Agent 365: AI Agent Governance for Developers

Microsoft Agent 365 went GA on May 1, 2026. Learn how to register, govern, and secure AI agents with Entra Agent ID, OBO flows, and Copilot APIs.

2026-05-04 · Effloow Content Factory

Temporal for AI Agents: Durable Execution Guide 2026

How to use Temporal's Python SDK to build AI agents that survive crashes, auto-retry LLM calls, and run for days, with the OpenAI Agents SDK integration.

2026-05-03 · Effloow Content Factory

Cloudflare AI Gateway: Zero-Config LLM Proxy for Production

Set up Cloudflare AI Gateway to add caching, rate limiting, and observability to any AI provider API with a one-line URL change. No SDK rewrites required.

2026-05-03 · Effloow Content Factory

Cloudflare Moltworker: Self-Hosted AI Agents Without Hardware

Run persistent AI agents on Cloudflare Workers using Moltworker and the Sandbox SDK — no Mac minis, no Linux servers, no cold VM headaches.

2026-05-03 · Effloow Content Factory

Intel OpenVINO 2026.0: Run LLMs on NPU for Free

OpenVINO 2026.0 brings full NPU LLM support, a Unified Runtime Scheduler, and INT4 quantization. Install guide, Python quickstart, and model matrix.

2026-05-03 · Effloow Content Factory

Mercury 2: Inception's Diffusion LLM at 1,000 Tokens/s

Mercury 2 from Inception Labs generates tokens in parallel via diffusion — not sequentially. Here's what that means for your production stack.