Genkit Middleware: Tool Approval Gates and Retry Logic
Getting a Genkit agent to work is one problem. Getting it to work reliably in production is another.
Google released @genkit-ai/middleware on May 14, 2026, as a companion package to the Genkit open-source framework. The announcement described it as "composable hooks that intercept generation calls, including the tool execution loop, and inject custom behaviors." In practice, it's five middleware types that address specific failure modes in agentic pipelines: tools calling things they shouldn't, API failures causing silent drops, filesystem operations escaping their intended scope, and models lacking the context to act correctly.
Effloow Lab installed @genkit-ai/middleware@0.6.0 locally and verified the full API surface without API credentials. The lab note is at data/lab-runs/genkit-middleware-agentic-pipeline-hardening-poc-2026.md.
What the Package Is
@genkit-ai/middleware is a separate npm package that ships alongside the Genkit framework. It's not part of the core genkit package. The package is Apache-2.0 licensed, lives in the genkit-ai/genkit GitHub repository, and as of v0.6.0 depends only on mime-types alongside Genkit itself.
The package exports five middleware factories: toolApproval, retry, filesystem, skills, and fallback. Each is a function that returns a middleware object consumable by Genkit's use: parameter in ai.generate() calls.
Install:
npm install @genkit-ai/middleware
# or
pnpm add @genkit-ai/middleware
Verified installation in /tmp/effloow-genkit-middleware-poc: 471 packages, no errors.
The Hook Architecture
Every generate() call in Genkit runs a tool loop: the model produces output, any requested tools execute, the results feed back into a new model call, and the cycle repeats until the model finishes. Middleware intercepts this loop at three layers:
- generate hooks — wrap the full conversation turn
- model hooks — wrap individual model API calls
- tool hooks — wrap each tool execution
The use: parameter accepts an array of middleware objects that stack on top of each other. This means you can compose multiple middleware types in a single call:
const response = await ai.generate({
  model: 'gemini-2.5-flash',
  prompt: 'Analyze the workspace and write a summary',
  use: [
    retry({ maxRetries: 3 }),
    filesystem({ rootDirectory: './workspace', allowWriteAccess: true }),
    toolApproval({ approved: ['list_files', 'read_file'] }),
  ]
});
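The stacking behavior can be pictured as plain function wrapping. The sketch below is a simplified model, not Genkit's implementation: `Handler`, `Middleware`, `compose`, and `tag` are hypothetical names chosen for illustration. It assumes the first entry in the array becomes the outermost layer, which the source does not confirm for Genkit itself.

```typescript
// Simplified model of middleware stacking; not Genkit's actual implementation.
// `Handler`, `Middleware`, `compose`, and `tag` are hypothetical names.
type Handler = (prompt: string) => string;
type Middleware = (next: Handler) => Handler;

// Wrap `base` so that the first middleware in the array is the outermost layer
// (an assumption made for this sketch).
function compose(middlewares: Middleware[], base: Handler): Handler {
  return middlewares.reduceRight<Handler>((next, mw) => mw(next), base);
}

// Records the order in which layers are entered and exited.
const trace: string[] = [];
const tag = (name: string): Middleware => (next) => (prompt) => {
  trace.push(`enter ${name}`);
  const result = next(prompt);
  trace.push(`exit ${name}`);
  return result;
};

// Stand-in for a model call; it just echoes the prompt.
const fakeModel: Handler = (prompt) => `echo: ${prompt}`;

const handler = compose(
  [tag("retry"), tag("filesystem"), tag("toolApproval")],
  fakeModel,
);
const output = handler("hello");
```

In this model each layer sees the call on the way in and the result on the way out, which is why the position of an entry in the array can change what it observes.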
toolApproval: Gate Every Tool Call
toolApproval is the most directly useful of the five for security-conscious deployments. It restricts tool execution to a named allow-list. Any tool call not on the list throws a ToolInterruptError, which pauses the generate loop and returns control to the calling code.
The API surface verified from source:
// Config schema (Zod):
ToolApprovalOptionsSchema = z.object({
  approved: z.array(z.string()) // list of approved tool names
})
The interrupt pattern allows you to surface the pending tool call to the user, collect approval, and resume:
import { genkit, restartTool } from 'genkit';
import { toolApproval } from '@genkit-ai/middleware';

// Assumes `ai` is an initialized Genkit instance and `writeFileTool` is a tool defined elsewhere.

// First call: an empty approved list means every tool call requires approval
const response = await ai.generate({
  prompt: 'Write a summary to output/report.md',
  tools: [writeFileTool],
  use: [toolApproval({ approved: [] })]
});

if (response.finishReason === 'interrupted') {
  const interrupt = response.interrupts[0];
  const toolName = interrupt.toolRequest.name;
  const toolInput = interrupt.toolRequest.input;
  // Show user: agent wants to call `writeFile` with input {path: 'output/report.md', ...}
  // User approves → resume with approval marker
  const approvedPart = restartTool(interrupt, { toolApproved: true });
  const resumedResponse = await ai.generate({
    messages: response.messages,
    tools: [writeFileTool], // same tools as the first call
    resume: { restart: [approvedPart] },
    use: [toolApproval({ approved: [] })]
  });
}
The resume mechanism works via metadata.resumed.toolApproved. The middleware checks this flag before throwing; if it's true, execution proceeds. This means the approval state travels with the resumed call, not as a separate configuration change.
One important scope point: toolApproval gates tool calls, not tool definitions. A tool needs to be defined and available to the model before toolApproval can gate it. The middleware doesn't prevent the model from knowing a tool exists; it prevents the tool from executing without approval.
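The gate logic described in this section can be modeled in a few lines. This is a minimal sketch, not the package's source: `PendingApprovalError`, `ToolCall`, `gateToolCall`, and `isBlocked` are hypothetical names, and the error class stands in for the ToolInterruptError mentioned above. Only the `metadata.resumed.toolApproved` flag is taken from the article.

```typescript
// Simplified model of an allow-list gate; not the package's source.
// `PendingApprovalError`, `ToolCall`, `gateToolCall`, and `isBlocked` are hypothetical names.
class PendingApprovalError extends Error {
  readonly toolName: string;
  constructor(toolName: string) {
    super(`Tool "${toolName}" requires approval`);
    this.name = "PendingApprovalError";
    this.toolName = toolName;
  }
}

interface ToolCall {
  name: string;
  input: unknown;
  // Mirrors the metadata.resumed.toolApproved flag described above.
  metadata?: { resumed?: { toolApproved?: boolean } };
}

// Throws unless the tool is on the allow-list or the call carries the approval flag.
function gateToolCall(call: ToolCall, approved: string[]): void {
  const onAllowList = approved.includes(call.name);
  const approvedOnResume = call.metadata?.resumed?.toolApproved === true;
  if (!onAllowList && !approvedOnResume) {
    throw new PendingApprovalError(call.name);
  }
}

// Convenience wrapper that reports whether the gate would interrupt the call.
function isBlocked(call: ToolCall, approved: string[]): boolean {
  try {
    gateToolCall(call, approved);
    return false;
  } catch (err) {
    if (err instanceof PendingApprovalError) return true;
    throw err;
  }
}

const blockedWrite = isBlocked({ name: "write_file", input: {} }, ["read_file"]);
const allowedRead = isBlocked({ name: "read_file", input: {} }, ["read_file"]);
const resumedWrite = isBlocked(
  { name: "write_file", input: {}, metadata: { resumed: { toolApproved: true } } },
  [],
);
```

The third call shows why an empty allow-list still works on resume: in this model the approval travels with the call's metadata, and the allow-list itself never changes.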
retry: Handle Transient API Failures
The retry middleware wraps model API calls with exponential backoff. Its defaults, verified from source:
| Field | Default | Notes |
|---|---|---|
| maxRetries | 3 | Maximum retry attempts |
| statuses | UNAVAILABLE, DEADLINE_EXCEEDED, RESOURCE_EXHAUSTED, ABORTED, INTERNAL | Trigger conditions |
| initialDelayMs | 1000 | First retry delay |
| maxDelayMs | 60000 | Retry delay cap |
| backoffFactor | 2 | Exponential multiplier |
| disableJitter | false | Jitter enabled by default |
The critical design note from the source code: only the model call is retried, not the surrounding tool loop. If the model succeeds and a tool fails, the tool failure propagates normally. This is intentional — retrying the tool loop would require re-executing tool calls that may have already had side effects.
import { retry } from '@genkit-ai/middleware';

// Assumes `ai` is an initialized Genkit instance and `googleAI` comes from the Google AI plugin in use.
const response = await ai.generate({
  model: googleAI.model('gemini-pro-latest'),
  prompt: 'Intensive reasoning task...',
  use: [retry({ maxRetries: 5, backoffFactor: 3 })]
});
The RESOURCE_EXHAUSTED trigger is particularly relevant in production: it handles rate-limited API responses automatically, a common transient failure for teams running high-volume agent workloads.
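To make the defaults in the table concrete, here is a sketch of how a capped exponential delay schedule is typically computed. The formula and the jitter strategy are assumptions, since the article does not show how the package calculates delays. `backoffDelay` and `jitteredDelay` are hypothetical helpers, and only the default values come from the table.

```typescript
// Sketch of capped exponential backoff using the defaults from the table above.
// The formula and jitter strategy are assumptions, not the package's confirmed behavior.
interface BackoffOptions {
  initialDelayMs: number;
  maxDelayMs: number;
  backoffFactor: number;
}

const defaults: BackoffOptions = {
  initialDelayMs: 1000,
  maxDelayMs: 60000,
  backoffFactor: 2,
};

// Delay before retry number `attempt` (0-based), without jitter.
function backoffDelay(attempt: number, opts: BackoffOptions = defaults): number {
  const raw = opts.initialDelayMs * Math.pow(opts.backoffFactor, attempt);
  return Math.min(raw, opts.maxDelayMs);
}

// "Full jitter" variant: pick a delay between 0 and the capped value.
// `random` is injectable so the function can be tested deterministically.
function jitteredDelay(
  attempt: number,
  random: () => number = Math.random,
  opts: BackoffOptions = defaults,
): number {
  return Math.floor(random() * backoffDelay(attempt, opts));
}

// Un-jittered schedule for the default three retries.
const schedule = [0, 1, 2].map((attempt) => backoffDelay(attempt));

// Schedule for the example above: maxRetries 5, backoffFactor 3.
const aggressive = [0, 1, 2, 3, 4].map((attempt) =>
  backoffDelay(attempt, { ...defaults, backoffFactor: 3 }),
);
```

Under these assumptions the default schedule is 1, 2, and 4 seconds. With a factor of 3 the fifth delay would be 81 seconds uncapped, so the 60-second cap applies.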
filesystem: Scoped File Access
filesystem injects four tools into the generate call: list_files, read_file, write_file, and search_and_replace. All operations are restricted to the rootDirectory path. Write access requires explicit opt-in:
import { filesystem } from '@genkit-ai/middleware';

const response = await ai.generate({
  model: 'gemini-2.5-flash',
  prompt: 'Create a hello world Node.js app in the workspace',
  use: [
    filesystem({ rootDirectory: './workspace', allowWriteAccess: true })
  ]
});
The filesystem middleware is a direct alternative to defining file tools manually. Without it, you'd need to implement safe path resolution, directory traversal prevention, and write access controls yourself. With it, those constraints are pre-built and declarative.
The scoped design matters. An agent with unconstrained filesystem access can read credentials, modify configuration, or traverse to parent directories. rootDirectory enforcement at the middleware layer means the model can't be prompted into escaping the workspace even if an adversarial instruction attempts it.
Note: the middleware doesn't sandbox the Node.js process itself. If you need stronger isolation, combine it with an external sandbox like E2B or Cloudflare Dynamic Workers.
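For a sense of what the middleware saves you from writing, the following sketch shows one common way to enforce a root directory by hand with Node's `path` module. It is an illustration of the general technique, not the package's code: `resolveInsideRoot` and `isAllowed` are hypothetical helpers, and `/srv/workspace` is a made-up root. The check is purely lexical, so it does not account for symlinks that point outside the root.

```typescript
import * as path from "node:path";

// Lexical containment check; a hypothetical helper, not the package's implementation.
// It does not resolve symlinks, so a link inside the root can still point elsewhere.
function resolveInsideRoot(rootDirectory: string, requested: string): string {
  const root = path.resolve(rootDirectory);
  const target = path.resolve(root, requested);
  const relative = path.relative(root, target);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  if (escapes) {
    throw new Error(`Path escapes root directory: ${requested}`);
  }
  return target;
}

// Convenience wrapper that reports whether a requested path stays inside the root.
function isAllowed(requested: string, rootDirectory = "/srv/workspace"): boolean {
  try {
    resolveInsideRoot(rootDirectory, requested);
    return true;
  } catch {
    return false;
  }
}

const checks = {
  nested: isAllowed("notes/todo.md"),
  parent: isAllowed("../secrets.env"),
  absolute: isAllowed("/etc/passwd"),
  sneaky: isAllowed("a/../../b"),
  dotPrefixed: isAllowed("..hidden/file"),
};
```

Comparing the relative path rather than doing a string-prefix match avoids a classic mistake, where `/srv/workspace-old` would pass a naive `startsWith("/srv/workspace")` test.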
skills: Dynamic Instruction Injection
skills addresses a different problem: getting the model to know what it's allowed to do in this specific environment. It scans a directory for SKILL.md files, reads their YAML frontmatter, and injects them into the model's system prompt at generation time.
import { skills } from '@genkit-ai/middleware';

const response = await ai.generate({
  prompt: 'How do I run tests in this repo?',
  use: [skills({ skillPaths: ['./skills'] })]
});
A SKILL.md at ./skills/testing.md might look like:
---
name: run-tests
description: Run the test suite for this project
---
Use `npm test` to run all tests. For a specific file: `npm test -- path/to/test.spec.ts`
Integration tests require `TEST_DB_URL` environment variable.
The middleware also provides a use_skill tool that the model can call to retrieve specific skill content on demand, rather than loading everything into the system prompt upfront. This is relevant for agents with large skill libraries where loading all context at once would waste tokens.
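Reading the frontmatter of a skill file like the one above can be sketched as follows. This is a deliberately minimal parser that handles only flat `key: value` lines, not full YAML, and it is not how the package is known to parse skills. `Skill` and `parseSkill` are hypothetical names.

```typescript
// Minimal frontmatter reader for flat `key: value` pairs; not a full YAML parser
// and not the package's implementation. `Skill` and `parseSkill` are hypothetical.
interface Skill {
  name: string;
  description: string;
  body: string;
}

function parseSkill(markdown: string): Skill {
  // Capture the text between the opening and closing `---` lines, then the rest.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(markdown);
  if (!match) {
    throw new Error("Missing frontmatter block");
  }
  const fields: Record<string, string> = {};
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(":");
    if (separator === -1) continue; // skip lines that are not key/value pairs
    fields[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
  }
  return {
    name: fields["name"] ?? "",
    description: fields["description"] ?? "",
    body: match[2].trim(),
  };
}

const sample = [
  "---",
  "name: run-tests",
  "description: Run the test suite for this project",
  "---",
  "Use `npm test` to run all tests.",
].join("\n");

const skill = parseSkill(sample);
```

Splitting the short metadata from the longer body is what makes on-demand loading possible: the name and description are cheap to list, and the body can be fetched only when needed.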
fallback: Model Chain on Error
fallback provides a model fallback chain. If the primary model API returns an error, the middleware automatically retries with the next model in the chain. This is distinct from retry — retry handles transient errors on the same model, while fallback switches to a different model on persistent failure.
The middleware is documented in the package source but not yet covered in the official docs page, suggesting it's a newer addition. The source confirms the export name is fallback.
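The difference between retrying and falling back is easier to see in code. The sketch below models a fallback chain in plain TypeScript. It is not the package's implementation, and `ModelCall`, `NamedModel`, and `generateWithFallback` are hypothetical names. The calls are synchronous to keep the example short. Real model calls are asynchronous, and the same loop applies with `await`.

```typescript
// Simplified model of a fallback chain; not the package's implementation.
// `ModelCall`, `NamedModel`, and `generateWithFallback` are hypothetical names.
type ModelCall = (prompt: string) => string;

interface NamedModel {
  name: string;
  call: ModelCall;
}

// Try each model in order and return the first successful result.
function generateWithFallback(
  models: NamedModel[],
  prompt: string,
): { model: string; text: string } {
  const errors: string[] = [];
  for (const model of models) {
    try {
      return { model: model.name, text: model.call(prompt) };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${model.name}: ${message}`);
    }
  }
  throw new Error(`All models failed: ${errors.join("; ")}`);
}

// Stand-ins for a failing primary model and a working secondary model.
const primary: NamedModel = {
  name: "primary",
  call: () => {
    throw new Error("UNAVAILABLE");
  },
};
const secondary: NamedModel = {
  name: "secondary",
  call: (prompt) => `answer to: ${prompt}`,
};

const result = generateWithFallback([primary, secondary], "ping");
```

A retry layer would call `primary` again after a delay, whereas this loop moves on to a different model as soon as one fails.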
Composing Multiple Middleware
Middleware objects stack in order. A reasonable production composition for an agent with file access:
const response = await ai.generate({
  model: googleAI.model('gemini-2.5-flash'),
  prompt: userRequest,
  use: [
    fallback([googleAI.model('gemini-2.5-pro')]),            // model fallback
    retry({ maxRetries: 3 }),                                // retry transient failures
    filesystem({                                             // scoped file access
      rootDirectory: './workspace',
      allowWriteAccess: false                                // read-only by default
    }),
    toolApproval({ approved: ['list_files', 'read_file'] }), // gate write tools
    skills({ skillPaths: ['./skills'] }),                    // inject context
  ]
});
The ordering matters. retry should be early in the chain so it wraps the model calls made beneath it; as noted above, it retries model calls only, so tool failures from later layers still propagate. toolApproval should come after filesystem so it can gate the filesystem tools once they have been injected.
Language Support
As of v0.6.0, Genkit middleware is available for TypeScript, Go, and Dart. Python support is listed as coming soon. The official announcement says middleware is integrated into the Genkit Developer UI, where developers can inspect its behavior in the trace view.
What This Changes for Genkit Agent Development
Before this package, building a Genkit agent with tool approval required custom middleware or wrapper code. The toolApproval pattern specifically — interrupt, surface to user, resume with approval — is now a first-class primitive. That matters because the interrupt/resume pattern is the right model for human-in-the-loop agent security; it's just been tedious to implement.
The filesystem middleware is the other change that will have immediate practical impact. A significant fraction of Genkit agents need to read or write files. Having that as a one-liner with scoped access removes a common source of production bugs.
Limitations
Effloow Lab did not run live generate() calls — no Google API key was available in the sandbox. All API surface verification was done by loading the compiled package modules and inspecting their exports. The behavior documented here matches the README, source code, and official announcement, but has not been tested against an actual model endpoint.
Python support is not yet available. Teams building Genkit agents in Python can't use this middleware yet and would need to implement equivalent patterns manually.
The middleware doesn't provide cryptographic guarantees — toolApproval can be bypassed if the metadata.resumed.toolApproved flag is forged in the calling code. It's an in-process control, not a security boundary.
Summary
@genkit-ai/middleware@0.6.0 ships five composable middleware types that address common agentic production failure modes. toolApproval provides a principled interrupt/resume pattern for human-in-the-loop tool gating. retry handles transient API failures with sensible defaults. filesystem gives scoped file access as a one-liner. skills injects environment context into the model's prompt. fallback provides model chain resilience.
The package is Apache-2.0, available on npm, and works today in TypeScript, Go, and Dart. Effloow Lab verified the full API surface locally at v0.6.0, with 471 packages installed successfully. No API credentials are required to exercise the tool approval interrupt pattern at the code level.
Sources: Google Developers Blog (developers.googleblog.com/announcing-genkit-middleware), InfoQ (infoq.com/news/2026/05/google-genkit-middleware/), npm (@genkit-ai/middleware), Genkit docs (genkit.dev/docs/js/middleware/).
Lab note: data/lab-runs/genkit-middleware-agentic-pipeline-hardening-poc-2026.md