Tracking context window usage in real time

The most common support complaint we saw during early testing wasn't about bugs or performance. It was a variation of the same sentence: 'The AI forgot what I told it.' Sometimes this means the AI gave a wrong answer. More often it means the AI stopped referencing a file it was using thirty minutes ago — and started hallucinating an API that doesn't exist in your codebase.

The cause is almost always the same: a file silently dropped out of the AI's context window. The context window is finite. As the conversation grows, older content gets evicted. The AI doesn't tell you when this happens. It just starts getting things wrong.

We built a real-time context monitor to make this visible before it causes problems.

Token estimation

The first challenge: we don't have access to the AI's actual tokenizer in real time. Running a full tokenization pass on every piece of text would be both slow and model-specific. Instead, we use a calibrated estimate: approximately 3.5 characters per token. This is accurate to within ~10% for English-heavy code and closer to ~15% for dense JSON or minified code — good enough for a live gauge.

We accumulate token estimates across the conversation: the initial system prompt, each user message, each AI response, and each tool call input and output. The running total gives us a real-time estimate of context window utilization. When hooks give us actual token counts, as they do for Claude Code, we use those instead and calibrate the estimator against them.
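As a rough illustration of the accumulate-and-calibrate idea, here is a minimal sketch. The class name, the moving-average weight of 0.2, and the method names are assumptions made for this example, not the actual implementation; only the 3.5 characters-per-token starting point comes from the text above.

```typescript
// Starting ratio from the post: roughly 3.5 characters per token.
const DEFAULT_CHARS_PER_TOKEN = 3.5;

// Hypothetical estimator: keeps a running total and a calibrated ratio.
class TokenEstimator {
  private charsPerToken = DEFAULT_CHARS_PER_TOKEN;
  private totalTokens = 0;

  // Estimate tokens for one piece of text using the current ratio.
  estimate(text: string): number {
    return Math.ceil(text.length / this.charsPerToken);
  }

  // Add a message, response, or tool payload to the running total.
  // When an exact count is available (for example from a hook), use it
  // and nudge the ratio toward what was actually observed.
  add(text: string, actualTokens?: number): number {
    if (actualTokens !== undefined && actualTokens > 0) {
      const observed = text.length / actualTokens;
      // Exponential moving average; the 0.2 weight is an arbitrary choice.
      this.charsPerToken = 0.8 * this.charsPerToken + 0.2 * observed;
      this.totalTokens += actualTokens;
    } else {
      this.totalTokens += this.estimate(text);
    }
    return this.totalTokens;
  }

  // Fraction of the context window used, clamped to at most 1.
  utilization(windowTokens: number): number {
    return Math.min(1, this.totalTokens / windowTokens);
  }

  get ratio(): number {
    return this.charsPerToken;
  }
}
```

Rounding each estimate up keeps the gauge slightly pessimistic, which is the safer direction for a warning signal.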

File relevance scoring

Beyond the aggregate token count, we track which specific files the AI has referenced. Every time a file path appears in a tool call input (read_file, edit_file, search_files) or in an AI response, we increment that file's reference count and update its last-seen timestamp. Relevance decays over time using an exponential decay function with a 20-minute half-life.

typescript

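// src/main/engines/context-tracker.ts
interface FileEntry {
  referenceCount: number; // how many times the AI has referenced the file
  lastSeenAt: number;     // epoch milliseconds of the latest reference
}

const HALF_LIFE_MS = 20 * 60 * 1000; // 20 minutes

function computeRelevance(entry: FileEntry, now: number): number {
  const ageMs = Math.max(0, now - entry.lastSeenAt);
  // Halves every HALF_LIFE_MS: exp(-ln 2 * age / halfLife)
  const decayFactor = Math.exp(-Math.LN2 * (ageMs / HALF_LIFE_MS));
  const referenceBoost = Math.log1p(entry.referenceCount) * 0.3;
  // The boost scales with the decay factor so that it fades with age. Added
  // on its own, log1p(1) * 0.3 ≈ 0.21 would hold every file above DROP (0.15).
  return Math.min(1.0, decayFactor * (1 + referenceBoost));
}

// Map a score in [0, 1] to the badge shown in the context monitor.
function getRelevanceLevel(score: number): "HIGH" | "MED" | "LOW" | "DROP" {
  if (score >= 0.7) return "HIGH";
  if (score >= 0.4) return "MED";
  if (score >= 0.15) return "LOW";
  return "DROP";
}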

The decay model reflects reality: a file the AI read 30 minutes ago is less likely to still be in its active context than one it read 2 minutes ago. Files with high reference counts get a log-scaled boost — the AI returning to a file repeatedly is a strong signal it's important.
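The scoring above depends on the bookkeeping that keeps referenceCount and lastSeenAt current. A minimal sketch of that bookkeeping might look like the following. The `path` field, the Map-based table, and both function names are assumptions for illustration, and scanning a response only for already-tracked paths is a simplification.

```typescript
// referenceCount and lastSeenAt come from computeRelevance above;
// `path` and the Map-based table are assumptions for this sketch.
interface FileEntry {
  path: string;
  referenceCount: number;
  lastSeenAt: number; // epoch milliseconds
}

type FileTable = Map<string, FileEntry>;

// Record one sighting of a file: bump the count and refresh the timestamp.
function recordReference(table: FileTable, path: string, now: number): FileEntry {
  const existing = table.get(path);
  if (existing) {
    existing.referenceCount += 1;
    existing.lastSeenAt = now;
    return existing;
  }
  const created: FileEntry = { path, referenceCount: 1, lastSeenAt: now };
  table.set(path, created);
  return created;
}

// Scan an AI response for paths that are already tracked and record each
// one that appears. Returns the paths that were mentioned.
function recordMentions(table: FileTable, text: string, now: number): string[] {
  const mentioned: string[] = [];
  table.forEach((_entry, path) => {
    if (text.indexOf(path) !== -1) {
      recordReference(table, path, now);
      mentioned.push(path);
    }
  });
  return mentioned;
}
```

A tool call such as read_file would call recordReference with the path from its input, while AI responses would go through recordMentions.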

Drop detection and alerts

When a file's relevance score falls below the DROP threshold (0.15), we fire a context:drop event. The context monitor panel shows a visual alert: the file row turns amber, and a 'Dropped' badge appears. We also push a notification to the session dashboard so you see it even if you're looking at a different panel.

Drop alerts fire prospectively, not retroactively. By the time you see the alert, the file hasn't necessarily been evicted yet — you have a window to re-inject it before the AI's next response.
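One detail worth sketching is that an alert should fire when a file crosses the threshold, not on every update while it stays below it. The detector below is an illustrative assumption: the class, the listener shape, and the re-arming behaviour are hypothetical, while the 0.15 threshold and the context:drop event name come from the text above.

```typescript
interface DropEvent {
  type: "context:drop";
  path: string;
  score: number;
}

type DropListener = (event: DropEvent) => void;

const DROP_THRESHOLD = 0.15; // from the post

// Hypothetical detector that reports each crossing exactly once.
class DropDetector {
  private dropped = new Set<string>();
  private onDrop: DropListener;

  constructor(onDrop: DropListener) {
    this.onDrop = onDrop;
  }

  // Call with the latest score for a file. Returns true when an event fired.
  // A file that recovers above the threshold is re-armed for a later alert.
  update(path: string, score: number): boolean {
    if (score < DROP_THRESHOLD) {
      if (this.dropped.has(path)) return false; // already alerted
      this.dropped.add(path);
      this.onDrop({ type: "context:drop", path, score });
      return true;
    }
    this.dropped.delete(path);
    return false;
  }
}
```

Re-arming matters for re-injection: once a file is read again and its score recovers, a later decay should produce a fresh alert.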

The context gauge

The visual representation is a circular SVG gauge: a thin arc from 0% to 100% context utilization, with a color gradient that shifts from cyan (0–60%) to amber (60–85%) to red (85–100%). The percentage in the center updates in real time as each trace arrives. At 85%, we show a warning banner: 'Context window 85% full. Consider starting a new session.'
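A common way to draw such an arc is a single SVG circle whose stroke-dasharray is set to the filled length followed by the remaining gap. The sketch below computes those values along with the color band and warning flag. It is an assumption-laden illustration: the function name, the default radius of 40, and the use of three discrete color bands in place of a smooth gradient are choices made for this example.

```typescript
type GaugeColor = "cyan" | "amber" | "red";

interface GaugeState {
  percent: number;      // whole number, clamped to 0..100
  color: GaugeColor;
  dashArray: string;    // value for the SVG stroke-dasharray attribute
  showWarning: boolean; // true at 85% and above
}

// Hypothetical helper: derive everything the gauge needs from two numbers.
function gaugeState(usedTokens: number, windowTokens: number, radius = 40): GaugeState {
  const fraction = Math.min(1, Math.max(0, usedTokens / windowTokens));
  const percent = Math.round(fraction * 100);
  // Band edges follow the ranges in the text; treating 60 as amber and
  // 85 as red is an assumption about the boundaries.
  const color: GaugeColor = percent >= 85 ? "red" : percent >= 60 ? "amber" : "cyan";
  const circumference = 2 * Math.PI * radius;
  const filled = fraction * circumference;
  return {
    percent,
    color,
    dashArray: `${filled.toFixed(2)} ${(circumference - filled).toFixed(2)}`,
    showWarning: percent >= 85,
  };
}
```

The dashArray string can be assigned directly to a circle element's stroke-dasharray attribute, so each incoming trace only has to update one attribute and one text node.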

Below the gauge is a per-file breakdown: each file tracked in this session appears as a row showing the filename, estimated token count, and a relevance level badge (HIGH / MED / LOW / DROP). Files are sorted by relevance descending, so the most at-risk files end up at the bottom of the list.
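The row ordering can be sketched in a few lines. The FileRow shape and the filename tie-break are assumptions for this example; only the columns and the descending sort come from the description above.

```typescript
type RelevanceLevel = "HIGH" | "MED" | "LOW" | "DROP";

// Hypothetical row shape for the per-file breakdown.
interface FileRow {
  filename: string;
  estimatedTokens: number;
  score: number; // relevance score in 0..1
  level: RelevanceLevel;
}

// Return a new array ordered by relevance, highest first.
// Ties fall back to filename so the order does not jump between renders.
function sortRows(rows: FileRow[]): FileRow[] {
  return rows.slice().sort(
    (a, b) => b.score - a.score || a.filename.localeCompare(b.filename)
  );
}
```

Copying the array before sorting leaves the tracker's own list untouched, which avoids surprising other consumers of the same data.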

Re-injection

When a file shows a DROP badge, there's a one-click 'Re-inject' button that sends the file contents back to the AI with an instruction to re-read it. For Claude Code, this generates an add_to_context tool call. For other tools, we write a message to stdin: Please re-read {filepath} — it may have dropped from your context. Not elegant, but effective.
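For the stdin path, the mechanics are small enough to sketch. Everything named here is hypothetical: the LineSink interface, the function names, and the trailing newline (which assumes the target tool reads line-delimited input). The template passed in would be the message quoted above, with {filepath} as the placeholder.

```typescript
// Minimal shape of something that accepts text; the stdin stream of a
// spawned child process in Node satisfies it.
interface LineSink {
  write(chunk: string): unknown;
}

// Replace every {filepath} placeholder in the template.
function fillTemplate(template: string, filepath: string): string {
  return template.split("{filepath}").join(filepath);
}

// Send the re-read request and return the message that was written.
function reinject(target: LineSink, template: string, filepath: string): string {
  const message = fillTemplate(template, filepath);
  // Assumption: the tool treats a newline as the end of a user message.
  target.write(message + "\n");
  return message;
}
```

Keeping the template as a parameter makes it easy to adjust the wording per tool without touching the delivery code.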

Persistence and session restore

Context state is persisted to SQLite on every update. When you switch sessions, the context monitor instantly loads the saved state for the target session — you see exactly what files were relevant and how full the context window was when you left. For historical sessions, this lets you reconstruct why the AI started making mistakes near the end.
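The post does not describe the SQLite schema, so the sketch below shows only the general shape of a save-and-restore round trip: one serialized snapshot per session, keyed by session id. The ContextSnapshot fields, the SnapshotStore interface, and the in-memory stand-in for the database are all assumptions made for illustration.

```typescript
// Hypothetical snapshot of what the context monitor needs to redraw itself.
interface ContextSnapshot {
  sessionId: string;
  estimatedTokens: number;
  windowTokens: number;
  files: { path: string; referenceCount: number; lastSeenAt: number }[];
  savedAt: number; // epoch milliseconds
}

// Stand-in for the SQLite table: one row per session, keyed by session id.
interface SnapshotStore {
  put(sessionId: string, json: string): void;
  get(sessionId: string): string | undefined;
}

// In-memory implementation used here so the example runs without a database.
class MemoryStore implements SnapshotStore {
  private rows = new Map<string, string>();
  put(sessionId: string, json: string): void {
    this.rows.set(sessionId, json);
  }
  get(sessionId: string): string | undefined {
    return this.rows.get(sessionId);
  }
}

// Overwrite the stored snapshot for this session.
function saveSnapshot(store: SnapshotStore, snapshot: ContextSnapshot): void {
  store.put(snapshot.sessionId, JSON.stringify(snapshot));
}

// Load the snapshot for a session, or undefined if none was saved.
function loadSnapshot(store: SnapshotStore, sessionId: string): ContextSnapshot | undefined {
  const json = store.get(sessionId);
  return json === undefined ? undefined : (JSON.parse(json) as ContextSnapshot);
}
```

Storing raw reference counts and timestamps instead of computed scores means relevance can be recomputed on restore with savedAt as the reference time, which reproduces the state as it was when the session was left.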

The result: context window exhaustion goes from an invisible failure mode to a visible, manageable signal. You stop getting surprised by hallucinations from dropped context, because you can see them coming.
