Agentic Engineering

Guidelines for using AI coding agents (Claude Code, Codex, Cursor, etc.) within our engineering workflow. The goal is to increase developer productivity while preserving reliability, security, and maintainability.

AI agents can accelerate development, but they must never replace engineering judgment. Every change produced with agent assistance must meet the same quality bar as manually written code.

Non-Negotiables

1. Product Thinking Before Prototyping

Agents make it trivially easy to build something. That does not mean it should be built. Before starting an agent session for a new feature, have a clear answer to: what problem does this solve, is it worth solving, and is it aligned with the product roadmap? Feature scope decisions should involve engineering leadership.

2. Tests First (TDD)

All AI-assisted code must follow test-driven development:

Red — Write a test for the behavior you want. Run it. It fails because the code doesn’t exist yet.
Green — Write the minimum code to make the test pass. Nothing more.
Refactor — Clean up the code while keeping the test green.

AI should never generate untested production logic.
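
As a minimal sketch of one red-green cycle, assume a hypothetical `slugify` helper (an illustration, not part of any project code). The two tests are written first and seen to fail; the function body is then the least code that makes them pass.

```python
import re
import unittest


def slugify(title: str) -> str:
    """Hypothetical helper: lowercase a title and join its words with hyphens."""
    # Green step: the minimum implementation that satisfies the tests below.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


class SlugifyTest(unittest.TestCase):
    # Red step: these tests are written, and seen to fail, before slugify exists.
    def test_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_drops_punctuation(self):
        self.assertEqual(slugify("Tests First (TDD)!"), "tests-first-tdd")
```

Saved as a module, the tests can be run with `python -m unittest`. Any refactor of `slugify` should keep both tests green.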

3. Engineers Own the Code

AI output must be treated as untrusted code. Engineers are fully responsible for correctness, security, performance, and maintainability. All generated code must be reviewed, understood, and verified before merging.

4. Manual Verification is Mandatory

Automated tests alone are not sufficient. Before opening a PR, engineers must manually exercise the feature, verify expected behavior, test edge cases, and confirm no regressions were introduced.

5. Secrets Must Never Be Shared

Agents must never be given secrets or sensitive credentials: API keys, database credentials, access tokens, private keys, customer data, or proprietary datasets. Sensitive data must remain in secure systems (Key Vault, environment variables).
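
One way to keep secrets out of source files and prompts is to resolve them from the environment at runtime. The sketch below assumes a hypothetical variable name, `PAYMENTS_API_KEY`, and uses only the Python standard library. The helper names `require_env` and `mask` are illustrative.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, failing fast if it is unset."""
    value = os.environ.get(name)
    if not value:
        # The message names the variable but never echoes a secret value.
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


def mask(secret: str, visible: int = 4) -> str:
    """Mask a secret for logs, keeping only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Application code would call `require_env("PAYMENTS_API_KEY")` where the key is needed. If a value has to appear in logs or in text pasted into an agent session, passing it through `mask` first keeps the full secret out of the transcript.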

6. Small, Reviewable PRs

Agent-assisted PRs must be small and focused: one logical change, clear tests, limited surface area. Large AI-generated changes are not acceptable; split them into separate, independently reviewable PRs.

7. Security-Sensitive Changes Require Second Review

Changes affecting authentication, authorization, user data isolation, data storage boundaries, secrets handling, infrastructure configuration, or database schema require explicit second reviewer approval.

Standard Agent Workflow

Stage 1: Before the Session

Ensure the local test suite runs and passes
Confirm the project builds successfully
Review relevant code and architecture
Define a clear goal for the task

Agents perform best when given precise, scoped tasks.

Stage 2: During the Session

Give clear and focused prompts
Generate code incrementally
Review generated output carefully
Run tests after every change

Avoid:

Asking agents to generate large multi-feature implementations
Letting the agent absorb a hacky workaround instead of refactoring the original design — if the iteration path forces compromises, pause and fix the design
Blindly accepting generated changes
Delegating architecture decisions to the agent
Building features speculatively because “it’s easy to prompt” — easy to build is not the same as worth building

Stage 3: Before Opening a PR

Run the full test suite
Run linting and formatting tools
Manually verify functionality
Confirm documentation is updated (if applicable)
Review the entire diff to confirm no unnecessary changes, no secrets, and alignment with project patterns
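
Part of the diff review can be assisted by a quick scan of added lines for secret-like strings. The following is a rough sketch, not a replacement for a dedicated secret scanner or for reading the diff. The patterns and the `find_suspect_lines` helper are illustrative assumptions.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive rule set).
SECRET_PATTERNS = {
    "private key header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "AWS access key id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "hard-coded credential": re.compile(
        r"(?i)\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def find_suspect_lines(diff_text: str) -> list[tuple[int, str]]:
    """Return (diff line number, pattern name) for added lines that look like secrets."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only added lines matter; "+++" is the file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings
```

The function takes the text of a unified diff, for example the output of `git diff`, and returns the diff line numbers worth a second look. An empty result does not prove the diff is clean.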

Stage 4: During Code Review

Reviewers must hold AI-assisted PRs to the same standard as manually written code. Verify correctness of logic, readability, test coverage, and adherence to project conventions. For sensitive changes (auth, storage, user data isolation), apply extra scrutiny.

Stage 5: After Merge

Monitor logs and telemetry
Watch for unexpected regressions
Validate system stability

If issues arise, investigate and correct immediately.

Security and High-Risk Areas

Exercise extra caution when using agents in these areas:

Authentication and authorization — avoid privilege escalation vulnerabilities in identity handling, permission checks, session validation, and access tokens
User data boundaries — user isolation must be strictly enforced; queries must always scope through ownership or explicit grants; verify AI-generated queries preserve data isolation
File storage and user data — review for proper access control, user ownership validation, and safe file handling
Database migrations — review for data loss risks, backward compatibility, migration performance, and rollback safety; validate in staging first
Prompt injection risks — treat external inputs as untrusted; avoid passing unvalidated content to agents; verify generated code does not execute unsafe instructions
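
To illustrate the user data boundary rule, here is a minimal sketch of a query scoped through ownership. It assumes a hypothetical `documents` table with an `owner_id` column and uses Python's built-in `sqlite3` module. It covers ownership only; access through explicit grants would need an additional check.

```python
import sqlite3


def get_document(conn: sqlite3.Connection, user_id: int, doc_id: int):
    """Fetch a document only if it belongs to user_id; return None otherwise."""
    # The owner filter is part of the query itself, so a guessed doc_id
    # from another user returns nothing. Parameters are bound, not interpolated.
    row = conn.execute(
        "SELECT id, title FROM documents WHERE id = ? AND owner_id = ?",
        (doc_id, user_id),
    ).fetchone()
    return row


# In-memory database with a hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, owner_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO documents (id, owner_id, title) VALUES (?, ?, ?)",
    [(1, 10, "Alice's notes"), (2, 20, "Bob's notes")],
)
```

When reviewing AI-generated queries, the thing to look for is the same shape: the caller's identity appears in the `WHERE` clause, and no code path fetches by `id` alone.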

Tooling and Context

PR template checklist — tests added/updated, tests pass locally, manual verification completed, linting passes, no secrets included, diff reviewed
CI enforcement — test execution, lint checks, formatting rules, build validation; PRs failing CI should not be merged
Project context files — agents should rely on CLAUDE.md, architecture docs, API schemas, and project conventions; maintaining accurate context improves agent output quality
API client generation — when API schemas change, regenerate clients (npm run orval) to ensure backend/frontend contract consistency

Rollout

Phase                       Focus                                         Goal
1. Exploration              Small features, test generation, refactoring  Build familiarity with tools
2. Structured Usage         Broader development under these guidelines    Measure productivity, quality, review effort
3. Operational Integration  Regular part of development workflows         Periodic review of stability, efficiency, cost

Success Metrics

Productivity — development cycle time, ticket-to-merge time, PR throughput
Quality — bug rate in production, rollback frequency, hotfix rate
Review Efficiency — PR review time, number of revisions required, reviewer load
Cost — agent usage cost per sprint, cost per completed feature
Developer Experience — developer satisfaction, perceived productivity improvements, ease of maintaining AI-generated code
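
As one example of how a productivity metric might be computed, the sketch below derives a median ticket-to-merge time from ISO 8601 timestamps. The sample records are hypothetical placeholders; real values would come from the issue tracker and version control history.

```python
from datetime import datetime
from statistics import median


def ticket_to_merge_hours(opened_at: str, merged_at: str) -> float:
    """Hours between ticket creation and PR merge, given ISO 8601 timestamps."""
    delta = datetime.fromisoformat(merged_at) - datetime.fromisoformat(opened_at)
    return delta.total_seconds() / 3600


# Hypothetical sample records: (ticket opened, PR merged).
records = [
    ("2024-03-01T09:00:00", "2024-03-01T17:00:00"),  # 8 hours
    ("2024-03-04T10:00:00", "2024-03-05T10:00:00"),  # 24 hours
    ("2024-03-06T08:30:00", "2024-03-06T20:30:00"),  # 12 hours
]
hours = [ticket_to_merge_hours(opened, merged) for opened, merged in records]
median_hours = median(hours)
```

A median is used here rather than a mean so that a single long-running ticket does not dominate the figure. Either choice is defensible as long as it is applied consistently across rollout phases.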

AI Operations