Agentic Engineering Beyond Vibe Coding Methodology 2026
Purpose
This PoC tested the smallest useful version of an "agentic engineering" control loop: a proposed code change is only accepted after deterministic verification, and the verification trace is saved as an auditable artifact.
The sandbox did not call an AI model. It simulated a good baseline and a bad agent patch so the article can discuss the engineering boundary without pretending that Effloow ran a production agent workflow.
Commands
rm -rf /tmp/effloow-agentic-engineering-poc
mkdir -p /tmp/effloow-agentic-engineering-poc
cd /tmp/effloow-agentic-engineering-poc
Created:
- package.json
- calculator.mjs
- test-runner.mjs
- gate.mjs
Baseline verification:
npm run gate
Output:
> gate
> node gate.mjs
GATE_PASS deterministic verification accepted release
Injected a deliberately bad patch:
perl -0pi -e 's/Math\.round\(cents - \(cents \* percent \/ 100\)\)/cents - percent/' calculator.mjs
npm run gate || true
Output:
> gate
> node gate.mjs
GATE_FAIL deterministic verification blocked release
Trace captured by gate.mjs:
[
  {
    "label": "unit-test-gate",
    "cmd": "node test-runner.mjs",
    "status": 1,
    "stdout": "PASS zero percent",
    "stderr": "FAIL ten percent: expected 900, got 990\nFAIL round half up: expected 874, got 986.5"
  }
]
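The numbers in the trace can be reproduced with a small arithmetic sketch. Only the two expressions come from the perl commands in this note. The function names and the inputs (1000 cents at 10 percent, 999 cents at 12.5 percent) are assumptions chosen because they are consistent with the recorded output; the actual calculator.mjs and test-runner.mjs are not shown here.

```javascript
// Hypothetical reconstruction: function names and test inputs are assumed.
// Only the two return expressions are taken from the perl substitutions.

// Baseline: percentage discount, rounded to whole cents.
function applyDiscount(cents, percent) {
  return Math.round(cents - (cents * percent) / 100);
}

// Bad patch: subtracts the percent value itself instead of a percentage.
function applyDiscountBroken(cents, percent) {
  return cents - percent;
}

// Assumed inputs that match the "expected X, got Y" pairs in the trace.
const cases = [
  { name: "zero percent", cents: 1000, percent: 0 },
  { name: "ten percent", cents: 1000, percent: 10 },
  { name: "round half up", cents: 999, percent: 12.5 },
];

for (const { name, cents, percent } of cases) {
  const expected = applyDiscount(cents, percent);
  const got = applyDiscountBroken(cents, percent);
  console.log(`${expected === got ? "PASS" : "FAIL"} ${name}: expected ${expected}, got ${got}`);
}
```

Under these assumed inputs the zero percent case still passes with the bad patch, because subtracting 0 leaves the amount unchanged. That would explain why the trace shows one PASS on stdout next to two FAIL lines on stderr.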
Restored the correct implementation:
perl -0pi -e 's/cents - percent/Math.round(cents - (cents * percent \/ 100))/' calculator.mjs
npm run gate
Output:
> gate
> node gate.mjs
GATE_PASS deterministic verification accepted release
What Worked
- A minimal deterministic gate separated a plausible-looking patch from an acceptable patch.
- The gate produced a machine-readable trace.json artifact containing command, status, stdout, and stderr.
- The failing patch was blocked before any release step.
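A gate with this behavior could look roughly like the sketch below. It is not the PoC's gate.mjs, which is not reproduced in this note. The function names `runCheck` and `runGate` and the `tracePath` parameter are hypothetical; the trace fields mirror the ones shown in the captured trace.

```javascript
// Hypothetical sketch of a deterministic gate; not the PoC's actual gate.mjs.
import { spawnSync } from "node:child_process";
import { writeFileSync } from "node:fs";

// Run one check through the shell and record the fields seen in the trace.
function runCheck(label, cmd) {
  const result = spawnSync(cmd, { shell: true, encoding: "utf8" });
  return {
    label,
    cmd,
    status: result.status, // null if the process was killed by a signal
    stdout: (result.stdout ?? "").trim(),
    stderr: (result.stderr ?? "").trim(),
  };
}

// Run every check, write the trace, and pass only if all checks exit with 0.
function runGate(checks, tracePath = "trace.json") {
  const trace = checks.map(({ label, cmd }) => runCheck(label, cmd));
  // Overwrites the file on each run, matching the behavior noted under What Failed.
  writeFileSync(tracePath, JSON.stringify(trace, null, 2));
  const passed = trace.every((entry) => entry.status === 0);
  return { passed, trace };
}
```

A caller would pass something like `[{ label: "unit-test-gate", cmd: "node test-runner.mjs" }]`, print GATE_PASS or GATE_FAIL based on `passed`, and set a non-zero `process.exitCode` on failure so that later release steps do not run.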
What Failed
- The trace was overwritten on each gate run instead of being appended with immutable run IDs.
- The sandbox used one unit-test gate only. It did not include linting, static analysis, security checks, snapshot tests, or human review.
- The bad patch was manually injected rather than generated by a real coding agent.
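The first gap above, the overwritten trace, could be closed in a follow-up by appending each run as one JSON line with its own ID. The sketch below was not part of the PoC. The names `appendRun` and `readRuns`, the `trace.jsonl` path, and the `runId` and `recordedAt` fields are all hypothetical. An append-only file is also not immutable by itself; tamper resistance would need storage-level controls.

```javascript
// Hypothetical follow-up sketch; the PoC itself overwrote trace.json.
import { randomUUID } from "node:crypto";
import { appendFileSync, readFileSync } from "node:fs";

// Append one gate run as a single JSON line tagged with a unique run ID.
function appendRun(trace, logPath = "trace.jsonl") {
  const record = {
    runId: randomUUID(),
    recordedAt: new Date().toISOString(),
    trace,
  };
  appendFileSync(logPath, JSON.stringify(record) + "\n");
  return record.runId;
}

// Read every recorded run back, oldest first.
function readRuns(logPath = "trace.jsonl") {
  return readFileSync(logPath, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}
```

With this shape, the failing run and the later passing run would both remain in the log, so a reviewer could see that a patch was blocked and then corrected.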
Limitations
This PoC proves a workflow shape, not a product benchmark. It does not measure agent coding quality, speed, cost, reliability, or defect rate. It also does not prove that a specific commercial coding agent follows this process. The article should only claim that Effloow Lab ran a local sandbox gate demonstrating why professional agentic workflows need deterministic verification and trace capture.
Read the article
This note supports the public article and records what was actually checked.