Architecture

Owner: Engineering | Last reviewed: 2026-Q2

This section describes the current system architecture as implemented in this repo.

High-Level

  • Frontend: React + TypeScript (Vite), Tailwind, Radix UI, Clerk for auth, OpenAPI-generated clients (Orval).
  • Backend: FastAPI (Python), async SQLAlchemy, PostgreSQL, Alembic, Clerk JWT verification, Azure integrations (OpenAI, Vision, Document Intelligence, Storage), OpenTelemetry.
  • Infra: Docker images built in Azure DevOps, pushed to ACR, and deployed to Azure Web App for Containers; CDN/Front Door sits in front of the frontend.

[Browser]
   │  Clerk session
   ▼
[React (Vite)] ───calls──▶ [/api/* on FastAPI]
   │                            │
   │                            ├── PostgreSQL (asyncpg)
   │                            ├── Azure Storage (Blob)
   │                            ├── Azure AI: OpenAI, Vision, Doc Intelligence
   │                            └── Telemetry → Azure Monitor (OTel)

Authentication & Authorization

  • Clerk: Frontend obtains tokens via Clerk; backend verifies JWT with Clerk JWKS (see backend/py/core/authentication.py).
  • Dual model (ADR-001):
    • ClerkUser for verified Clerk identity and entitlement claims.
    • Local User, created via JIT provisioning, for DB relationships, ownership, permissions, and sharing.
  • Use get_current_user when you only need verified Clerk identity; use get_current_db_user for DB filters and ownership checks.
  • Board and action-point access is user-scoped through ownership, BoardPermission, ActionPointShare, and board placement.
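
The dual model above can be illustrated with a minimal, in-memory sketch of JIT provisioning. It is not the code in backend/py/core/authentication.py: the field names, the InMemoryUserStore class, and the can_view_board helper are hypothetical, and a real implementation would use an async SQLAlchemy session and the BoardPermission / ActionPointShare tables instead of plain sets.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class ClerkUser:
    """Verified Clerk identity (field names are assumptions)."""
    clerk_id: str
    email: str
    entitlements: frozenset = frozenset()


@dataclass
class User:
    """Local DB user row (field names are assumptions)."""
    id: int
    clerk_id: str
    email: str


class InMemoryUserStore:
    """Hypothetical stand-in for the users table."""

    def __init__(self):
        self._by_clerk_id = {}
        self._ids = count(1)

    def get_or_create(self, clerk_user):
        """JIT provisioning: create the local row on first sight, reuse it afterwards."""
        user = self._by_clerk_id.get(clerk_user.clerk_id)
        if user is None:
            user = User(
                id=next(self._ids),
                clerk_id=clerk_user.clerk_id,
                email=clerk_user.email,
            )
            self._by_clerk_id[clerk_user.clerk_id] = user
        return user


def can_view_board(user, board_owner_id, permitted_user_ids):
    """User-scoped check: the owner or an explicitly permitted user may view."""
    return user.id == board_owner_id or user.id in permitted_user_ids
```

In this sketch, the role of get_current_user corresponds to producing the ClerkUser, while get_current_db_user corresponds to calling get_or_create and using the returned local User for ownership checks.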

API Layer

  • Entrypoint: backend/py/main.py defines CORS, security headers, and includes routers.
  • Routers live in backend/py/api/routes/* grouped by capability (kanban, board permissions, action-point sharing, comments, users, notifications, embeddings, document sharing, vector documents, workflows, webhooks, health) plus Azure-specific routes (AI, AI Vision, Blob Storage, Document Intelligence).
  • Request/response schemas live in backend/py/schemas/.
  • The OpenAPI spec is exported and used by Orval on the frontend to generate clients in frontend/src/api/generated. Keep it in sync with the API.

Data Layer

  • DB: PostgreSQL with asyncpg; async engine and sessions in backend/py/db/database.py.
  • ORM: SQLAlchemy 2.x models in backend/py/models/base.py.
  • Migrations: Alembic (backend/py/alembic/*).
  • Common entities: Users, Organizations, Roles, Boards, Columns, ActionPoints, ActionPointBoardAssociation (lexorank position), BoardPermission, ActionPointShare, Comments, Attachments, Documents, DocumentChunk, DocumentPermission, AuditLog, BotInteraction, BotGeneratedActionPoint, Notifications.
  • Vector/embeddings: embeddings stored in arrays/pgvector; services in services/* handle adaptive chunking, embeddings, and vector store (e.g., adaptive_chunker_service.py, openai_embeddings_service.py, vector_store_service.py).
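
To make the role of stored embeddings concrete, here is a small sketch of ranking chunks by cosine similarity to a query embedding. It is an in-memory illustration only and does not reflect vector_store_service.py; the function names and the (chunk_id, embedding) tuple shape are assumptions. With pgvector, the equivalent ranking would normally be done in SQL rather than in Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, chunks, k=3):
    """Return the k (chunk_id, score) pairs most similar to the query.

    chunks is a list of (chunk_id, embedding) tuples (hypothetical shape).
    """
    scored = [(chunk_id, cosine_similarity(query, emb)) for chunk_id, emb in chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```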

Document & AI Processing

  • Uploads: Size/type constrained in core/config.py; storage via Azure Blob.
  • Analysis: Azure AI Vision and Document Intelligence for OCR and structure; Azure OpenAI for embeddings and query workflows.
  • Pipeline: services in services/* provide adaptive chunking (adaptive_chunker_service.py), async document processing (async_document_processor.py), and vector storage (vector_store_service.py).
  • Deduplication/versioning: Document.content_hash, version and deletion flags prevent duplicates and manage lifecycle.
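
The deduplication idea can be sketched as follows. Only Document.content_hash, the version field, and the existence of deletion flags come from the description above; the use of SHA-256, the is_deleted and filename names, the DocumentRegistry class, and the "same filename bumps the version" policy are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Document:
    filename: str
    content_hash: str
    version: int = 1
    is_deleted: bool = False  # hypothetical name for the deletion flag


def compute_content_hash(content: bytes) -> str:
    """Hex digest of the raw bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(content).hexdigest()


class DocumentRegistry:
    """Hypothetical in-memory stand-in for the documents table."""

    def __init__(self):
        self._docs = []

    def ingest(self, filename: str, content: bytes):
        """Return (document, created). Identical live content is never stored twice."""
        digest = compute_content_hash(content)
        live = [d for d in self._docs if not d.is_deleted]
        for doc in live:
            if doc.content_hash == digest:
                return doc, False
        # New content under an existing filename becomes the next version.
        same_name = [d for d in live if d.filename == filename]
        version = max((d.version for d in same_name), default=0) + 1
        doc = Document(filename=filename, content_hash=digest, version=version)
        self._docs.append(doc)
        return doc, True
```

Hashing before any OCR or embedding work means a duplicate upload can be rejected cheaply, before the expensive Azure calls are made.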

Health, Security, and CORS

  • Health endpoints under /health/* (liveness, readiness, metrics).
  • Security-headers middleware adds protections against clickjacking and MIME sniffing, plus HSTS in non-dev environments.
  • CORS origins vary by ENVIRONMENT (prod, staging, dev) in main.py.
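
The environment-dependent behavior described above could look roughly like this. The sketch is framework-free and is not the code in main.py: the origin URLs are placeholders, the exact header values are assumptions, and both function names are hypothetical.

```python
# Placeholder origins; the real lists live in main.py.
ALLOWED_ORIGINS = {
    "prod": ["https://app.example.com"],
    "staging": ["https://staging.example.com"],
    "dev": ["http://localhost:5173"],
}


def cors_origins(environment: str) -> list:
    """Return the allowed CORS origins for the given ENVIRONMENT value."""
    try:
        return list(ALLOWED_ORIGINS[environment])
    except KeyError:
        raise ValueError(f"unknown ENVIRONMENT: {environment!r}") from None


def security_headers(environment: str) -> dict:
    """Headers a middleware might attach to every response (values are assumptions)."""
    headers = {
        "X-Frame-Options": "DENY",            # clickjacking protection
        "X-Content-Type-Options": "nosniff",  # disable MIME sniffing
    }
    if environment != "dev":
        # HSTS only outside dev, where local servers typically run over plain HTTP.
        headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
    return headers
```

Failing loudly on an unknown environment, rather than falling back to a permissive default, avoids accidentally shipping a wide-open CORS policy.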

Frontend Architecture

  • Vite + React 19, TypeScript.
  • State/data: React Query for server state; local component state for UI interactions.
  • API client: Orval generates clients from frontend/openapi.json into src/api/generated. Thin wrappers live in src/api/* and src/lib/api.ts.
  • Auth: Clerk React SDK (src/config/clerk.ts), user sync overlay/gate to ensure backend JIT provisioning.
  • UI: Tailwind v4, Radix primitives, shadcn-inspired components under src/components/ui.
  • Feature example: Kanban board uses @dnd-kit for drag/drop; board/action-point endpoints live under /api/kanban.

Deployment Topology

  • Azure DevOps pipelines:
    • API: azuredevops/pipelines/api.yaml → Docker build → ACR → Azure Web App for Containers (per dir, e.g., py).
    • Frontend: azuredevops/pipelines/frontend.yaml → Docker build → ACR → Azure Web App for Containers → Front Door cache purge.
  • Environments: dev and showcase are active; prod config is present but commented out. Variable groups are defined per branch.

Non-Goals / Future

  • Background queue/workers are minimal; consider introducing a durable queue for long-running AI/ingestion.
  • Expand automated data retention and encryption management (KEK/DEK fields exist; enforce rotation policy).
  • Consider API versioning strategy as external consumers grow.