AI And Document Operations
Owner: Engineering
Last reviewed: 2026-Q2
This runbook covers operational debugging for AI, OCR, document analysis, embeddings, vector search, and file-to-action-point workflows.
Key Workflows
- Text to action points: `/api/openai/optibot`.
- File/query to action points: `/api/query/boards/{board_id}/process-query`.
- OCR: `/api/vision/*`.
- Document Intelligence: `/api/doc-intel/*`.
- Embeddings: `/api/embeddings/*`.
- Vector document workflows: `/api/azuredocs/*`.
- Blob-backed document operations: `/api/azure/*` and `/api/documents/*`.
Common Failure Modes
| Symptom | Likely Area | First Checks |
|---|---|---|
| Upload rejected | File validation | Size, MIME, extension, configured limits |
| Document stuck processing | Background work/provider | Logs, provider status, blob existence, DB status fields |
| OCR/document extraction timeout | Azure Vision/Document Intelligence | Provider latency, file size, retry limits, operation polling |
| Empty or poor AI output | Prompt/context/model | Extracted text quality, chunking, model deployment, JSON parsing |
| Embedding failure | Azure OpenAI embeddings | Endpoint/model env vars, batch size, provider throttling |
| Vector search misses expected docs | pgvector/chunking | Document chunks, embedding dimensions, metadata filters |
| Cost spike | Retry loop or bulk processing | Request volume, batch endpoints, provider retries |
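
For the "Upload rejected" row, it can help to see every failing check at once instead of only the first one. The sketch below is illustrative only: `MAX_BYTES`, `ALLOWED_TYPES`, and `upload_rejection_reasons` are hypothetical names, and the limits are placeholders, not the service's configured values.

```python
from pathlib import PurePosixPath

# Hypothetical limits (assumption); the real values come from service configuration.
MAX_BYTES = 10 * 1024 * 1024
ALLOWED_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
}


def upload_rejection_reasons(filename: str, mime: str, size_bytes: int) -> list[str]:
    """Return every reason an upload would be rejected; an empty list means accepted."""
    reasons = []
    ext = PurePosixPath(filename).suffix.lower()

    # Size check: empty files and files above the limit are both rejected.
    if size_bytes <= 0:
        reasons.append("empty file")
    elif size_bytes > MAX_BYTES:
        reasons.append(f"size {size_bytes} exceeds limit {MAX_BYTES}")

    # Extension check first, then confirm the declared MIME type matches it.
    if ext not in ALLOWED_TYPES:
        reasons.append(f"extension {ext or '(none)'} not allowed")
    elif ALLOWED_TYPES[ext] != mime:
        reasons.append(f"MIME {mime} does not match extension {ext}")

    return reasons
```

Running the rejected file's metadata through a helper like this shows quickly whether the problem is size, extension, or a MIME mismatch.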
Debugging Steps
- Confirm the authenticated user has access to the board/document/action point.
- Check the file metadata, document status, content hash, version, and storage path.
- Check provider-specific logs and response status.
- Confirm required env vars from `environment-variables.md`.
- Reproduce with a small known-good file.
- Verify no real provider calls are happening in automated tests.
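
One way to check the last step is to inject the provider client and replace it with a mock in tests. This is a minimal sketch using the standard library's `unittest.mock`; `extract_action_points` and the client's `complete` method are hypothetical stand-ins, not the service's actual interface.

```python
from unittest.mock import Mock


def extract_action_points(text: str, client) -> list[str]:
    """Hypothetical workflow function: asks an injected provider client for action points."""
    response = client.complete(prompt=f"List action points:\n{text}")
    lines = (line.strip() for line in response.splitlines())
    # Drop blank lines and the leading bullet marker from each remaining line.
    return [line.removeprefix("-").strip() for line in lines if line]


def test_extract_action_points_uses_mock() -> None:
    # The mock stands in for the real provider, so no network call is possible here.
    fake_client = Mock()
    fake_client.complete.return_value = "- Email vendor\n- Book review meeting\n"

    result = extract_action_points("meeting notes", fake_client)

    assert result == ["Email vendor", "Book review meeting"]
    fake_client.complete.assert_called_once()
```

If a test passes only when real credentials are present, that usually means a provider client is being constructed somewhere instead of injected.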
Reprocessing Guidance
- Prefer idempotent reprocessing keyed by document ID/content hash where available.
- Avoid manually editing document state unless the expected state transition is clear.
- If a document was partially processed, inspect chunks, embeddings, and blob contents before retrying.
- Record manual reprocessing in the incident/support thread.
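
The first point above can be sketched as follows, assuming a key built from the document ID and a SHA-256 hash of the content. `ReprocessLedger` is a hypothetical in-memory stand-in for whatever store enforces uniqueness in the real system (for example a table with a unique constraint), and `reprocess_once` is an illustrative name.

```python
import hashlib
from typing import Callable


def reprocess_key(document_id: str, content: bytes) -> str:
    """Build a stable key from the document ID and a SHA-256 content hash."""
    return f"{document_id}:{hashlib.sha256(content).hexdigest()}"


class ReprocessLedger:
    """In-memory stand-in (assumption) for a store with a unique constraint on the key."""

    def __init__(self) -> None:
        self._claimed: set[str] = set()

    def claim(self, key: str) -> bool:
        """Return True only for the first caller to claim this key."""
        if key in self._claimed:
            return False
        self._claimed.add(key)
        return True

    def release(self, key: str) -> None:
        """Forget a key so that a failed run can be retried later."""
        self._claimed.discard(key)


def reprocess_once(
    document_id: str,
    content: bytes,
    ledger: ReprocessLedger,
    run: Callable[[str, bytes], None],
) -> bool:
    """Run `run` at most once per (document ID, content hash); return True if it ran."""
    key = reprocess_key(document_id, content)
    if not ledger.claim(key):
        return False  # identical document and content were already handled
    try:
        run(document_id, content)
    except Exception:
        ledger.release(key)  # a failed run must not block a later retry
        raise
    return True
```

With a key like this, retrying the same document and content is a no-op, while a new version with different content is processed again.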
Provider Limits And Cost
- Batch endpoints should have bounded concurrency and maximum item counts.
- When a provider throttles requests, the service should degrade gracefully and produce actionable logs.
- Expensive AI, OCR, and embedding workflows should be covered by tests with mocked providers.
- Watch for retry loops after provider outages or bad credentials.
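
A minimal sketch of the first and last points, using `asyncio`: a maximum batch size, a semaphore that bounds concurrency, and a capped number of retries with exponential backoff. The constants, the `ProviderThrottled` exception, and the function names are hypothetical. Only throttling is retried here, so other errors such as bad credentials surface immediately instead of looping.

```python
import asyncio

# Hypothetical limits (assumption); tune them to the provider's documented quotas.
MAX_ITEMS = 100
MAX_CONCURRENCY = 4
MAX_ATTEMPTS = 3


class ProviderThrottled(Exception):
    """Hypothetical error raised when the provider signals throttling."""


async def call_with_retry(call, item, base_delay: float = 0.5):
    """Retry throttled calls with exponential backoff, giving up after MAX_ATTEMPTS."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            return await call(item)
        except ProviderThrottled:
            if attempt == MAX_ATTEMPTS:
                raise  # bounded: never loop forever during an outage
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


async def process_batch(items, call, base_delay: float = 0.5):
    """Process items with at most MAX_CONCURRENCY provider calls in flight."""
    if len(items) > MAX_ITEMS:
        raise ValueError(f"batch of {len(items)} exceeds MAX_ITEMS={MAX_ITEMS}")
    semaphore = asyncio.Semaphore(MAX_CONCURRENCY)

    async def worker(item):
        async with semaphore:
            return await call_with_retry(call, item, base_delay)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(worker(item) for item in items))
```

The worst-case request count is then bounded by `MAX_ITEMS * MAX_ATTEMPTS` per batch, which gives a concrete ceiling to compare against when investigating a cost spike.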
Security Checks
- Treat document contents as untrusted input.
- Do not pass secrets into prompts or model context.
- URL-based analysis must protect against SSRF before broad external use.
- Generated action points must still pass authorization checks before being saved to a board.
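
For the SSRF point, a pre-fetch check might look like the sketch below, built on the standard library's `ipaddress`, `socket`, and `urllib.parse`. It is a partial mitigation under stated assumptions, not a complete defense: `url_is_safe_to_fetch` and `is_public_ip` are hypothetical names, and a real fetcher would also need to apply the same check to redirects and to the address it actually connects to, since DNS answers can change between the check and the request.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_ip(address: str) -> bool:
    """Return False for loopback, private, link-local, and other non-routable addresses."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False  # unparseable (for example a scoped IPv6 address): treat as unsafe
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def url_is_safe_to_fetch(url: str, resolve=socket.getaddrinfo) -> bool:
    """Allow only http(s) URLs whose host resolves exclusively to public addresses."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme not in {"http", "https"} or not parts.hostname:
        return False
    try:
        infos = resolve(parts.hostname, port or 443)
    except socket.gaierror:
        return False
    # Each getaddrinfo entry ends with a sockaddr tuple whose first field is the IP.
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(is_public_ip(addr) for addr in addresses)
```

The `resolve` parameter exists so tests can supply a fake resolver and avoid real DNS lookups. Rejecting a host when any resolved address is internal is the conservative choice.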