Architecture Drift Score: Is Your Documentation Telling the Truth?
There's a dirty secret in software architecture: most documentation is wrong.
Not wrong like "contains errors." Wrong like "describes a system that no longer exists." The diagram shows a microservice that was merged into another six months ago. The C4 model lists a Redis cache that was replaced by Memcached during a weekend incident. The component diagram references a PaymentGateway that was renamed to BillingService in a refactor nobody told the architect about.
This isn't a discipline problem. It's a structural one. Code changes continuously. Documentation changes when someone remembers. The gap between reality and documentation — what we call architecture drift — grows silently until the diagram on the wall bears no resemblance to the system in production.
We built the Drift Score to make that gap visible, measurable, and actionable.
A Single Number: 0 to 100
The Architecture Drift Score answers one question: what percentage of your documented architecture actually exists in your codebase?
Open any project in Archyl, click the heartbeat icon in the header, and hit "Compute Drift Score." In a few seconds, you'll see a number between 0 and 100:
- 90-100% — Excellent. Your documentation closely matches the codebase.
- 70-89% — Good. Mostly accurate, some gaps to address.
- 50-69% — Fair. Significant drift detected. Time to update.
- Below 50% — Your documentation is fiction.
That's it. No lengthy report to read. No subjective assessment. A number you can track, trend, and enforce.
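If you surface the score in your own tooling, the bands are easy to reproduce. A minimal TypeScript sketch; the function is illustrative, not part of any Archyl SDK:

// Map a drift score (0-100) to the rating bands described above.
// rateDrift is an illustrative helper, not an Archyl API.
function rateDrift(score: number): string {
  if (score >= 90) return "Excellent";
  if (score >= 70) return "Good";
  if (score >= 50) return "Fair";
  return "Fiction";
}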
What It Actually Checks
The drift analysis is lightweight by design — a single API call to your git provider, no AI, no file content fetched. It validates your architecture across five dimensions:
Systems — Does your repository name match the documented system? We use the same PascalCase naming convention as the AI discovery pipeline, with fuzzy matching so EkoAuthz matches a repo named authz.
Containers — Do the top-level directories in your repo correspond to documented containers? frontend/ matches FrontendWebApp. backend/ matches BackendApiServer. Infrastructure containers (databases, queues, monitoring) that don't have source directories are correctly excluded — they're valid documentation of external services, not drift.
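To make the matching concrete, here is a minimal sketch of how a normalize-and-compare fuzzy matcher could work. This is our illustration of the idea, not Archyl's actual implementation:

// Illustrative sketch of fuzzy name matching between documented
// elements and repository or directory names. Archyl's real matcher
// may be more sophisticated.
function normalize(name: string): string {
  // "EkoAuthz" -> "ekoauthz", "frontend/" -> "frontend"
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

function fuzzyMatch(documented: string, actual: string): boolean {
  const a = normalize(documented);
  const b = normalize(actual);
  // "ekoauthz" contains "authz", so EkoAuthz matches a repo named authz.
  return a.includes(b) || b.includes(a);
}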
Components — Are the components under each container still valid? If the parent container's directory exists, its components are presumed valid. If the container directory is gone, all its components are flagged.
Code Elements — This is the most precise check. Every code element in your C4 model has a filePath. We verify that each file still exists in the repository. Renamed file? Deleted class? Moved module? The drift score catches it instantly.
Relationships — A relationship is valid if both its source and target elements passed validation. If either endpoint drifted, the relationship is flagged.
The result is a per-element breakdown showing exactly what matched, what's missing, and what's new — not an opaque score, but an actionable report.
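Conceptually, the score is just the fraction of documented elements that still validate. A sketch of that arithmetic, with types and field names of our own invention:

// Hypothetical shape of a documented element after validation.
interface ElementResult {
  id: string;
  matched: boolean; // did the file path / directory / repo check pass?
}

// A relationship is valid only if both endpoints validated.
function relationshipValid(
  source: ElementResult,
  target: ElementResult,
): boolean {
  return source.matched && target.matched;
}

// The drift score: percentage of documented elements that still exist.
function driftScore(elements: ElementResult[]): number {
  if (elements.length === 0) return 100;
  const matched = elements.filter((e) => e.matched).length;
  return Math.round((matched / elements.length) * 100);
}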
Why Lightweight Matters
We deliberately chose not to run the full AI discovery pipeline for drift detection. Here's why:
Speed. AI analysis takes minutes for large repositories. Drift scoring takes seconds. You can run it on every push without slowing down your pipeline.
Determinism. AI can produce different results on the same codebase depending on model temperature, prompt variations, and token limits. File path existence is binary — either the file is there or it isn't. Your score is reproducible.
Cost. No AI tokens consumed. No API rate limits hit. Run it a hundred times a day if you want.
Simplicity. The algorithm is auditable. Check file paths, match directory names, verify relationships. No black box.
Track Trends, Not Just Snapshots
A single score is useful. A trend is powerful.
Every drift computation is stored with its full breakdown. The Overview tab shows a bar chart of your score over time. Click any bar to load that historical report and see exactly what changed.
This turns drift scoring from a one-time audit into a continuous health metric. You can see:
- Did last week's refactor improve or degrade documentation accuracy?
- Is drift getting worse over time? (Spoiler: without automation, it always does.)
- Which sprint introduced the most undocumented changes?
Enforce It in CI
A metric you don't enforce is a metric you'll ignore. That's why we built a GitHub Action.
on:
  push:
    branches: [main]

jobs:
  drift:
    runs-on: ubuntu-latest
    steps:
      - uses: archyl-com/actions/drift-score@v1
        with:
          api-key: ${{ secrets.ARCHYL_API_KEY }}
          organization-id: ${{ secrets.ARCHYL_ORG_ID }}
          project-id: 'your-project-uuid'
          threshold: '70'
Set threshold: '70' and the action fails if your architecture documentation drops below 70% accuracy. The job summary shows a formatted table with the full breakdown — visible directly in your PR checks.
You can also post the score as a PR comment:
- uses: archyl-com/actions/drift-score@v1
  id: drift
  with:
    api-key: ${{ secrets.ARCHYL_API_KEY }}
    organization-id: ${{ secrets.ARCHYL_ORG_ID }}
    project-id: 'your-project-uuid'

- uses: actions/github-script@v7
  if: github.event_name == 'pull_request'
  with:
    script: |
      github.rest.issues.createComment({
        issue_number: context.issue.number,
        owner: context.repo.owner,
        repo: context.repo.repo,
        body: '## Architecture Drift: ' +
          '${{ steps.drift.outputs.score }}%\n' +
          'Matched: ${{ steps.drift.outputs.matched-count }}' +
          ' / ${{ steps.drift.outputs.total-elements }}'
      })
Every developer sees the drift impact of their changes before merge. Architecture documentation becomes a first-class citizen in your CI pipeline — alongside tests, linting, and security scans.
MCP: AI Agents That Know Their Accuracy
If you're using Claude Code, Cursor, or any MCP-compatible AI agent with Archyl's MCP server, drift scoring is available as a tool:
compute_drift_score({ projectId: "..." })
get_drift_score({ projectId: "..." })
get_drift_history({ projectId: "..." })
get_drift_details({ scoreId: "..." })
This means an AI agent can check documentation accuracy before it starts working. The get_agent_context tool already provides the full C4 model, ADRs, and conformance rules. Now it can also check how trustworthy that documentation is.
An agent that sees a 45% drift score knows to be cautious about the architecture context it received. An agent that sees 95% can confidently rely on the documented structure. This is the foundation for self-aware AI agents that adjust their behavior based on documentation quality.
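Here is a hypothetical sketch of that gating logic. The tool names match the list above; the client wrapper, thresholds, and return shape are our assumptions:

// Hypothetical agent-side gating on documentation trustworthiness.
// CallTool is a stand-in for your MCP client's tool invocation.
type CallTool = (name: string, args: { projectId: string }) => Promise<any>;

async function getArchitectureContext(callTool: CallTool, projectId: string) {
  const drift = await callTool("get_drift_score", { projectId });
  if (drift.score < 50) {
    // Documentation is fiction: fall back to reading the code directly.
    return { trustDocs: false, context: null };
  }
  const context = await callTool("get_agent_context", { projectId });
  // Below 70, treat the docs as hints rather than ground truth.
  return { trustDocs: drift.score >= 70, context };
}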
Webhook Alerts: Know When Drift Happens
Two new webhook events let you stay informed without checking dashboards:
- drift.score_computed — Fires every time a drift score finishes computing. Push it to a Slack channel for visibility.
- drift.score_degraded — Fires when the score drops by 10 or more points from the previous computation. This is your early warning system — architecture is drifting fast.
Configure these in Archyl's webhook settings. They work with Slack, Microsoft Teams, Discord, and any generic HTTP endpoint.
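As a starting point for a generic HTTP endpoint, here is a minimal receiver sketch in TypeScript (Node). The payload fields it reads are assumptions for illustration, not Archyl's documented schema:

// Minimal webhook receiver sketch. The fields used here (event,
// score, previousScore) are assumed, not Archyl's documented payload.
import { createServer } from "node:http";

createServer((req, res) => {
  let body = "";
  req.on("data", (chunk) => (body += chunk));
  req.on("end", () => {
    const payload = JSON.parse(body);
    if (payload.event === "drift.score_degraded") {
      // Early warning: forward this to Slack, PagerDuty, etc.
      console.warn(`Drift degraded: ${payload.previousScore}% -> ${payload.score}%`);
    }
    res.statusCode = 204;
    res.end();
  });
}).listen(8080);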
The REST API
For teams that want full programmatic control:
# Trigger computation
curl -X POST https://api.archyl.com/api/v1/drift/compute \
  -H "X-API-Key: $API_KEY" \
  -H "X-Organization-ID: $ORG_ID" \
  -H "Content-Type: application/json" \
  -d '{"projectId": "your-project-uuid"}'

# Get latest score (quote the URL so the shell doesn't mangle ? and &)
curl "https://api.archyl.com/api/v1/drift/latest?projectId=..." \
  -H "X-API-Key: $API_KEY" \
  -H "X-Organization-ID: $ORG_ID"

# Get score history
curl "https://api.archyl.com/api/v1/drift/history?projectId=...&limit=20" \
  -H "X-API-Key: $API_KEY" \
  -H "X-Organization-ID: $ORG_ID"
Computation is asynchronous — the POST returns immediately with a score ID, and you poll until status becomes completed. The GitHub Action handles this automatically.
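If you're scripting this yourself, a polling sketch in TypeScript, assuming the compute response carries a scoreId and the latest-score endpoint reports a status field (our reading of the description above):

// Trigger a drift computation, then poll until it completes.
// Endpoint paths come from the examples above; the scoreId and
// status field names are assumptions based on the description.
const BASE = "https://api.archyl.com/api/v1/drift";
const headers = {
  "X-API-Key": process.env.ARCHYL_API_KEY!,
  "X-Organization-ID": process.env.ARCHYL_ORG_ID!,
  "Content-Type": "application/json",
};

async function computeDrift(projectId: string) {
  const res = await fetch(`${BASE}/compute`, {
    method: "POST",
    headers,
    body: JSON.stringify({ projectId }),
  });
  const { scoreId } = await res.json(); // assumed response field
  console.log(`Computation started: ${scoreId}`);

  // Poll the latest score until the computation finishes.
  for (let attempt = 0; attempt < 30; attempt++) {
    const latest = await (
      await fetch(`${BASE}/latest?projectId=${projectId}`, { headers })
    ).json();
    if (latest.status === "completed") return latest;
    await new Promise((resolve) => setTimeout(resolve, 2000));
  }
  throw new Error("Timed out waiting for drift computation");
}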
What This Means for the Agentic Era
We're entering an era where AI agents write a significant portion of production code. These agents are fast, capable, and context-blind. They don't know that the AuthService was split into IdentityService and AccessControl last month. They don't know that the frontend/ directory was renamed to web/.
The drift score creates a feedback loop:
- AI agent reads the C4 model via MCP before writing code.
- Code changes happen (by humans or agents).
- Drift score detects the gap between documentation and reality.
- CI gate prevents the gap from growing beyond a threshold.
- Webhook alerts the team when drift accelerates.
- Team updates the documentation (or runs discovery to auto-update it).
- AI agent reads the updated model. Loop closes.
Without step 3, the loop is open. Documentation drifts. Agents rely on stale context. Bad decisions compound.
The drift score closes the loop.
Getting Started
- Open any project in Archyl
- Click the heartbeat icon in the header toolbar
- Click "Compute Drift Score"
- Set up the GitHub Action for continuous monitoring
- Configure a Slack webhook for drift.score_degraded alerts
Your architecture documentation either reflects reality or it doesn't. Now you have a number that tells you which one it is.