vST for Protein Language Models
Drift Detection in High‑Dimensional Protein Embedding Spaces
This document defines how drift is detected in Protein Language Models (PLMs) using the Validation‑Space‑Time (vST) framework and the 1024‑dimensional (1024D) substrate. Drift refers to any deviation from expected substrate behavior, including structural instability, regime misalignment, scaling discontinuities, or projection failure.
Drift detection is essential for evaluating model updates, fine‑tuning procedures, training interventions, and cross‑version consistency in PLMs.
1. Purpose of Drift Detection
Drift detection enables reproducible evaluation of:
- instability in residue‑level embedding structure
- changes in regime behavior (R₁ᴴ, R₂ᴴ, R₃ᴴ)
- cross‑version compatibility
- scaling‑law continuity across PLM sizes
- projection stability into 3D–9D cores
- primitive‑level integrity (DP, TDP, SP, CP)
- sequence‑level coherence surfaces
Drift is not inherently negative; it is a signal of structural change.
The substrate determines whether that change is stable, transitional, or harmful.
2. Types of Drift
Drift is classified into four substrate‑aligned categories:
2.1 Structural Drift (D₁)
Deviation in motif‑level geometry or local residue coherence.
Indicators
- unstable 3D projections
- loss of compact residue motifs
- abrupt variance spikes
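As an illustration of the "abrupt variance spikes" indicator, the sketch below computes the variance of each residue embedding across its dimensions and flags residues whose variance is an outlier relative to the rest of the sequence. The function name, the use of a median‑based (MAD) robust z‑score, and the cutoff of 3.5 are assumptions made for this example; vST does not prescribe them here.

```python
import numpy as np


def variance_spikes(embeddings: np.ndarray, threshold: float = 3.5) -> np.ndarray:
    """Return indices of residues whose embedding variance is an outlier.

    embeddings: array of shape (n_residues, n_dims).
    threshold:  cutoff on the robust z-score (hypothetical default).
    """
    # Variance of each residue's embedding vector across dimensions.
    per_residue_var = embeddings.var(axis=1)
    median = np.median(per_residue_var)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = np.median(np.abs(per_residue_var - median))
    if mad == 0.0:
        # All variances are (almost) identical, so nothing can be flagged.
        return np.array([], dtype=int)
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    robust_z = 0.6745 * (per_residue_var - median) / mad
    return np.flatnonzero(np.abs(robust_z) > threshold)
```

A median‑based score is used instead of a mean and standard deviation so that a single extreme residue does not inflate the baseline it is compared against.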
2.2 Dimensional Drift (D₂)
Discontinuities in dimensional scaling or projection behavior.
Indicators
- non‑invertible 9D projections
- fragmentation in 64D–1024D embedding regions
- scaling‑law violations
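The document does not specify how the 9D projection is constructed. As a rough sketch, assume it is a linear projection onto the top 9 principal directions of the embeddings. A projection from a higher‑dimensional space into 9D cannot be inverted exactly in general, so the example measures how much of the centered embedding is lost when mapping down and back up. The function name and the choice of PCA are assumptions for illustration only.

```python
import numpy as np


def projection_reconstruction_error(embeddings: np.ndarray, k: int = 9) -> float:
    """Relative error of reconstructing embeddings from a top-k PCA projection.

    Returns a value in [0, 1]: 0 means the k-dimensional projection loses
    nothing, values near 1 mean almost all structure is lost.
    """
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]                       # shape (k, n_dims)
    projected = centered @ basis.T       # shape (n_residues, k)
    reconstructed = projected @ basis    # mapped back to the original space
    total = np.linalg.norm(centered)
    if total == 0.0:
        return 0.0
    return float(np.linalg.norm(centered - reconstructed) / total)
```

Under these assumptions, a sharp rise in this error between layers or model versions could serve as one proxy for the "non‑invertible 9D projections" indicator.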
2.3 Regime Drift (D₃)
Unexpected changes in regime identity or transitions across residues.
Indicators
- premature transitions into R₃ᴴ
- oscillatory instability in R₂ᴴ
- collapse of stable R₁ᴴ regions
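Assuming some upstream classifier has already assigned a regime label to every residue, the indicators above can be approximated by simple run statistics over that label sequence. In this sketch the plain strings "R1", "R2" and "R3" stand in for R₁ᴴ, R₂ᴴ and R₃ᴴ; the function name and the returned fields are hypothetical.

```python
from itertools import groupby
from typing import Any, Dict, Sequence


def regime_run_summary(labels: Sequence[str]) -> Dict[str, Any]:
    """Summarize a per-residue sequence of regime labels.

    Returns the number of regime changes, the change rate per adjacent
    residue pair, and the longest uninterrupted run of each label.
    """
    # Count positions where the label differs from its predecessor.
    transitions = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    pairs = max(len(labels) - 1, 1)  # avoid division by zero on short input
    longest: Dict[str, int] = {}
    for label, run in groupby(labels):
        run_length = sum(1 for _ in run)
        longest[label] = max(longest.get(label, 0), run_length)
    return {
        "transitions": transitions,
        "transition_rate": transitions / pairs,
        "longest_run": longest,
    }


summary = regime_run_summary(["R1", "R1", "R2", "R1", "R2", "R3", "R3", "R3"])
```

A high transition rate would be one possible reading of oscillatory instability, and a shrinking longest "R1" run across versions one possible reading of collapsing stable regions; both interpretations are assumptions rather than vST definitions.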
2.4 Projection Drift (D₄)
Misalignment between high‑dimensional embeddings and triadic cores.
Indicators
- inconsistent 3D–9D mapping
- loss of primitive‑aligned projection
- divergence across layers or residues
3. Drift Detection Signals
Drift is detected using substrate‑aligned signals:
- variance distribution across dimensions
- coherence‑surface continuity along the sequence
- primitive‑level stability (DP, TDP, SP, CP)
- resonance‑time alignment
- projection‑stability metrics
- cross‑version alignment surfaces
- vST validation outputs (V₁–V₄)
These signals collectively determine drift category and severity.
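One way to condense the "variance distribution across dimensions" signal into a single number is the participation ratio, (Σλ)² / Σλ², where λ are the eigenvalues of the embedding covariance. It estimates how many dimensions effectively carry variance. Using this statistic as a drift signal is an assumption of this sketch, not something the vST framework is stated to require.

```python
import numpy as np


def participation_ratio(embeddings: np.ndarray) -> float:
    """Effective number of dimensions carrying variance.

    embeddings: array of shape (n_residues, n_dims).
    Returns (sum of eigenvalues)^2 / (sum of squared eigenvalues), which is
    1 when all variance lies on one axis and n_dims when it is spread evenly.
    """
    centered = embeddings - embeddings.mean(axis=0)
    # Squared singular values are proportional to the covariance eigenvalues;
    # the proportionality constant cancels in the ratio.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    eigenvalues = singular_values ** 2
    denominator = np.sum(eigenvalues ** 2)
    if denominator == 0.0:
        return 0.0
    return float(np.sum(eigenvalues) ** 2 / denominator)
```

Comparing this value for the same sequences before and after a model update gives a compact view of whether variance has concentrated into fewer dimensions or spread across more.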
4. Drift Across the Dimensional Ladder
Drift may appear at different scales:
4.1 64D–128D (Residue‑Embedding Drift)
- loss of local biochemical coherence
- unstable residue embeddings
- semantic drift in sequence representation
4.2 256D–512D (Hidden‑State Drift)
- branching instability
- regime‑transition irregularities
- inconsistent attention patterns
4.3 1024D+ (High‑Dimensional Drift)
- fragmentation of coherence surfaces
- scaling discontinuities
- projection failure
High‑dimensional drift is the most severe and often indicates training instability.
5. Cross‑Version Drift Detection
Cross‑version drift is detected by comparing:
- residue‑level regime maps
- coherence‑surface geometry
- projection stability
- variance distribution
- primitive‑level structure
- resonance‑time behavior
Drift may arise from:
- fine‑tuning
- MSA‑conditioned training
- architecture changes
- training‑data shifts
- checkpoint selection
vST provides a consistent substrate for evaluating these changes.
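As a sketch of a cross‑version comparison, the example below scores the similarity of two embedding matrices computed for the same residues by two model versions, using linear Centered Kernel Alignment (CKA). Linear CKA is a technique swapped in for illustration: it is not named by vST, but it is convenient here because it is unaffected by rotations of the embedding space and works when the two versions have different embedding widths.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA similarity between two embedding matrices.

    x: shape (n_residues, d1), embeddings from one model version.
    y: shape (n_residues, d2), embeddings of the same residues from another.
    Returns a value in [0, 1]; 1 means identical up to rotation and scaling.
    """
    if x.shape[0] != y.shape[0]:
        raise ValueError("both versions must embed the same residues")
    # Center each feature so the score ignores constant offsets.
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return float(cross / (norm_x * norm_y))
```

A score close to 1 between two checkpoints suggests that the relational structure among residues is preserved; what counts as a meaningful drop would need to be calibrated per model family.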
6. Drift Severity Levels
Drift severity is classified into three levels:
Low Severity
- minor variance shifts
- stable projections
- no regime collapse
Moderate Severity
- partial fragmentation
- unstable R₂ᴴ transitions
- inconsistent cross‑layer alignment
High Severity
- collapse of coherence surfaces
- persistent R₃ᴴ behavior
- non‑invertible projections
- loss of primitive‑level structure
High‑severity drift indicates a failure of substrate invariants.
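The severity levels can be expressed as a small decision rule over boolean indicators. The sketch below assumes that a single indicator from a level is enough to assign that level and that the highest matching level wins; the document does not state how indicators combine, so this rule, the class names and the field names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    LOW = "low"
    MODERATE = "moderate"
    HIGH = "high"


@dataclass(frozen=True)
class SeverityIndicators:
    # Indicators taken from the high-severity list.
    coherence_surface_collapse: bool = False
    persistent_r3: bool = False
    non_invertible_projection: bool = False
    primitive_structure_lost: bool = False
    # Indicators taken from the moderate-severity list.
    partial_fragmentation: bool = False
    unstable_r2_transitions: bool = False
    inconsistent_cross_layer_alignment: bool = False


def classify_severity(ind: SeverityIndicators) -> Severity:
    """Return the highest level for which any indicator is set (assumed rule)."""
    if (ind.coherence_surface_collapse or ind.persistent_r3
            or ind.non_invertible_projection or ind.primitive_structure_lost):
        return Severity.HIGH
    if (ind.partial_fragmentation or ind.unstable_r2_transitions
            or ind.inconsistent_cross_layer_alignment):
        return Severity.MODERATE
    # Only minor variance shifts remain, which the low level tolerates.
    return Severity.LOW
```

For example, `classify_severity(SeverityIndicators(unstable_r2_transitions=True))` yields `Severity.MODERATE`, while adding `persistent_r3=True` raises the result to `Severity.HIGH`.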
7. Drift Detection Workflow
A substrate‑aligned drift detection workflow proceeds through the following steps, in order:
- Project embeddings into 9D
- Classify regime behavior (R₁ᴴ, R₂ᴴ, R₃ᴴ)
- Evaluate scaling continuity (64D–1024D)
- Check primitive‑level stability (DP, TDP, SP, CP)
- Validate with vST layers (V₁–V₄)
- Compare across layers, residues, or versions
- Assign drift category (D₁–D₄)
- Assign drift severity (low, moderate, high)
This workflow is model‑agnostic and reproducible.
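Because the individual steps are not specified in detail, the workflow is sketched here only as an ordered pipeline of pluggable callables that share a state dictionary. The runner, the step signature, and both placeholder steps are hypothetical; a real implementation would substitute the actual projection, regime classification, and validation routines.

```python
from typing import Any, Callable, Dict, List, Tuple

# A step reads the accumulated state and returns new entries to merge into it.
Step = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_drift_workflow(
    initial: Dict[str, Any], steps: List[Tuple[str, Step]]
) -> Dict[str, Any]:
    """Run named steps in order, merging each step's output into a shared state."""
    state: Dict[str, Any] = dict(initial)
    completed: List[str] = []
    for name, step in steps:
        state.update(step(state))
        completed.append(name)
    state["completed_steps"] = completed
    return state


def project_to_9d(state: Dict[str, Any]) -> Dict[str, Any]:
    # Placeholder projection: keep the first 9 coordinates of each embedding.
    return {"projection": [row[:9] for row in state["embeddings"]]}


def assign_category(state: Dict[str, Any]) -> Dict[str, Any]:
    # Placeholder rule: report D4 when any projected row has fewer than 9 values.
    short = any(len(row) < 9 for row in state["projection"])
    return {"category": "D4" if short else None}


result = run_drift_workflow(
    {"embeddings": [[0.1] * 16, [0.2] * 16]},
    [("project", project_to_9d), ("categorize", assign_category)],
)
```

Keeping each step as a separate callable means the same ordered pipeline can be rerun on different layers, residues, or model versions, with the list of completed steps recorded for reproducibility.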
8. Outputs of Drift Detection
Drift detection produces:
- drift category (D₁–D₄)
- drift severity
- regime‑transition anomalies
- projection‑stability indicators
- scaling‑law discontinuities
- cross‑version alignment surfaces
- vST validation results
These outputs support governance, interpretability, and model‑version management for PLMs.
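For governance and version management it can help to store these outputs as one serializable record per evaluation. The sketch below is one possible layout, not a vST‑defined schema: the class name, the field names, the added `model_version` field, and the plain strings "D1" to "D4" and "V1" to "V4" are assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional

VALID_CATEGORIES = {"D1", "D2", "D3", "D4"}
VALID_SEVERITIES = {"low", "moderate", "high"}


@dataclass
class DriftReport:
    model_version: str                      # hypothetical identifier for the checkpoint
    category: Optional[str]                 # None when no drift was detected
    severity: str
    regime_transition_anomalies: List[int] = field(default_factory=list)  # residue indices
    projection_stability: Optional[float] = None
    scaling_discontinuities: List[str] = field(default_factory=list)      # e.g. "256D->512D"
    validation_results: Dict[str, bool] = field(default_factory=dict)     # keys "V1".."V4"

    def __post_init__(self) -> None:
        # Reject values outside the categories and levels named in the document.
        if self.category is not None and self.category not in VALID_CATEGORIES:
            raise ValueError(f"unknown drift category: {self.category!r}")
        if self.severity not in VALID_SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")

    def to_json(self) -> str:
        """Serialize the report with stable key order for diffing across versions."""
        return json.dumps(asdict(self), sort_keys=True)


report = DriftReport(
    model_version="v2",
    category="D3",
    severity="moderate",
    regime_transition_anomalies=[12, 40],
    validation_results={"V1": True, "V2": True, "V3": False, "V4": True},
)
```

Sorting keys in the JSON output keeps reports from successive checkpoints easy to compare line by line.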