Effloow
~/articles · 193 PIECES

Articles, one at a time.

Every piece here was commissioned, drafted, reviewed in public, and merged. No content mills, no auto-published slop.
2026-05-05 · Effloow Content Factory
Claude Design and Claude Routines: Anthropic's New Agentic Products
Everything developers need to know about Claude Design (AI visuals) and Claude Routines (autonomous cloud agents), both launched April 2026.
2026-05-05 · Effloow Content Factory
Google TPU 8i: What the Inference Chip Split Means for Developers
Google announced TPU 8i and TPU 8t at Cloud Next 2026. This guide explains what the inference-dedicated chip means for LLM cost, latency, and agentic workloads.
2026-05-05 · Effloow Content Factory
Mistral Large 3: The 675B Open-Weight MoE Model Developer Guide
Mistral Large 3 is a 675B sparse MoE model with Apache 2.0 license, 256K context, and $0.50/$1.50 per 1M tokens. Here's what matters for developers in 2026.
2026-05-05 · Effloow Content Factory
Qwen3-Coder: 27B Dense Model That Beats 397B MoE (2026)
Qwen3.6-27B delivers flagship-level agentic coding in 27B dense parameters, outperforming 397B MoE models. Apache 2.0 local deployment guide.
2026-05-05 · Effloow Content Factory
RAGFlow: Self-Host a Deep-Document RAG Engine
Step-by-step guide to self-hosting RAGFlow v0.25 with Docker Compose — deep document understanding, chunking strategies, MCP server, and the Python SDK.
2026-05-04 · Effloow Content Factory
Claude Haiku 4.5: When to Use It Over Sonnet 4.6
Claude Haiku 4.5 scores 73.3% on SWE-bench at $1/M input tokens. This guide explains when Haiku beats Sonnet, how to cut costs by 95%, and which workloads to avoid.
2026-05-04 · Effloow Content Factory
Microsoft Agent 365: AI Agent Governance for Developers
Microsoft Agent 365 went GA on May 1, 2026. Learn how to register, govern, and secure AI agents with Entra Agent ID, OBO flows, and Copilot APIs.
2026-05-04 · Effloow Content Factory
Temporal for AI Agents: Durable Execution Guide 2026
How to use Temporal's Python SDK to build AI agents that survive crashes, auto-retry LLM calls, and run for days—with the OpenAI Agents SDK integration.
2026-05-03 · Effloow Content Factory
Cloudflare AI Gateway: Zero-Config LLM Proxy for Production
Set up Cloudflare AI Gateway to add caching, rate limiting, and observability to any AI provider API with a one-line URL change. No SDK rewrites required.
2026-05-03 · Effloow Content Factory
Cloudflare Moltworker: Self-Hosted AI Agents Without Hardware
Run persistent AI agents on Cloudflare Workers using Moltworker and the Sandbox SDK — no Mac minis, no Linux servers, no cold VM headaches.
2026-05-03 · Effloow Content Factory
Intel OpenVINO 2026.0: Run LLMs on NPU for Free
OpenVINO 2026.0 brings full NPU LLM support, a Unified Runtime Scheduler, and INT4 quantization. Install guide, Python quickstart, and model matrix.
2026-05-03 · Effloow Content Factory
Mercury 2: Inception's Diffusion LLM at 1,000 Tokens/s
Mercury 2 from Inception Labs generates tokens in parallel via diffusion — not sequentially. Here's what that means for your production stack.