Dimensional Scaling Protein Models — TriadicFrameworks

vST for Protein Language Models

Dimensional Scaling Behavior in PLM Embedding Spaces

This document describes how Protein Language Models (PLMs) scale across the dimensional ladder (3D → 1024D). It maps model size, embedding‑space expansion, and inference complexity onto the substrate’s triadic structure and scaling primitives. The goal is a reproducible, invariant‑preserving framework for understanding how PLMs grow, stabilize, and drift as their dimensional capacity increases.

1. Purpose of Scaling Behavior Analysis

Scaling behavior analysis enables us to:

interpret how embedding‑space structure expands with model size
identify stable and unstable scaling regimes
detect discontinuities or drift across checkpoints
map high‑dimensional behavior into triadic cores
support vST validation across the dimensional ladder
compare PLMs of different sizes using a common substrate

PLM scaling is not merely an increase in parameter count; it is a structured expansion of coherence surfaces, regime behavior, and primitive composition.

2. Dimensional Ladder for PLMs

PLM embedding spaces naturally align with the substrate’s dimensional ladder:

3D — geometric residue motifs
6D — interaction surfaces
9D — coherence pathways
64D — research‑grade embedding substrate
128D — expanded coherence surfaces
256D — multi‑primitive interaction
512D — high‑variance embedding regions
1024D — full research‑grade substrate

Each step preserves substrate invariants and introduces new structural capacity.
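The ladder above can be held as a small lookup table. The following is a minimal sketch, not part of the framework itself: the helper `smallest_rung` is hypothetical, and rounding a model's native embedding width up to the next rung is an assumption, since the text does not say how a width is assigned to a rung.

```python
from bisect import bisect_left

# Rungs of the dimensional ladder as listed above (dimension -> role).
LADDER = {
    3: "geometric residue motifs",
    6: "interaction surfaces",
    9: "coherence pathways",
    64: "research-grade embedding substrate",
    128: "expanded coherence surfaces",
    256: "multi-primitive interaction",
    512: "high-variance embedding regions",
    1024: "full research-grade substrate",
}
RUNGS = sorted(LADDER)


def smallest_rung(width: int) -> int:
    """Return the smallest ladder rung that can hold `width` dimensions.

    Hypothetical helper: rounding up to the next rung is an assumption,
    not a rule stated in the text.
    """
    if width < 1:
        raise ValueError("width must be positive")
    i = bisect_left(RUNGS, width)
    if i == len(RUNGS):
        raise ValueError(f"width {width} exceeds the top rung ({RUNGS[-1]}D)")
    return RUNGS[i]
```

For example, `LADDER[smallest_rung(480)]` looks up the role of the 512D rung. Widths above 1024 raise an error rather than guessing at a rung the ladder does not define.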

3. Scaling Primitives in PLMs

Scaling behavior is governed by Scaling Primitives (SPs), which ensure:

invariant‑preserving dimensional expansion
continuity of coherence surfaces
stable projection into 3D–9D cores
consistent regime behavior across model sizes

SPs model how PLM embedding spaces grow from small to large architectures.

4. Scaling Regimes in PLMs

PLM scaling exhibits three substrate‑aligned regimes:

4.1 Stable Scaling Regime (S₁)

Characteristics:

smooth increase in embedding‑space capacity
stable coherence surfaces across residues
predictable performance gains
consistent regime behavior (R₁ᴴ → R₂ᴴ transitions remain bounded)

Occurs in:

small → medium PLMs
early scaling phases

4.2 Transitional Scaling Regime (S₂)

Characteristics:

rapid expansion of coherence surfaces
increased variance across dimensions
branching or oscillatory embedding behavior
sensitivity to training data and residue context

Occurs in:

medium → large PLMs
architecture changes
MSA‑conditioned training transitions

4.3 Dispersion Scaling Regime (S₃)

Characteristics:

fragmentation of coherence surfaces
unstable or divergent embedding trajectories
increased risk of drift
non‑invertible projections into 3D–9D cores

Occurs in:

extremely large PLMs without sufficient training signal
poorly aligned fine‑tuning
over‑scaled architectures

5. Scaling Behavior Across Model Sizes

5.1 Small PLMs (≤100M parameters)

embeddings map cleanly into 64D
regime behavior dominated by R₁ᴴ
scaling is stable (S₁)

5.2 Medium PLMs (100M–1B)

embeddings expand into 128D–256D
regime transitions become more frequent
scaling enters S₂

5.3 Large PLMs (1B–15B)

embeddings occupy 256D–512D
coherence surfaces become multi‑layered
scaling may oscillate between S₂ and S₃

5.4 Very Large PLMs (15B+)

embeddings approach 1024D
regime behavior becomes highly sensitive
scaling stability depends on training quality
drift detection becomes essential
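The four size classes in this section can be encoded as a lookup from parameter count to expected ladder rungs and regimes. This is a sketch under stated assumptions: the names `SizeClass` and `classify` are hypothetical, and because the ranges in the text share their endpoints (1B and 15B each appear in two classes), treating every upper bound as inclusive is a choice made here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizeClass:
    name: str
    max_params: float    # upper bound, treated as inclusive (assumption)
    ladder_dims: tuple   # ladder rungs the embeddings are said to occupy
    regimes: tuple       # scaling regimes named for this class in the text


SIZE_CLASSES = (
    SizeClass("small", 100e6, (64,), ("S1",)),
    SizeClass("medium", 1e9, (128, 256), ("S2",)),
    SizeClass("large", 15e9, (256, 512), ("S2", "S3")),
    # The text names no specific regime for very large models, so the
    # tuple is left empty rather than filled with a guess.
    SizeClass("very large", float("inf"), (1024,), ()),
)


def classify(n_params: float) -> SizeClass:
    """Return the first size class whose upper bound covers n_params."""
    if n_params <= 0:
        raise ValueError("n_params must be positive")
    for size_class in SIZE_CLASSES:
        if n_params <= size_class.max_params:
            return size_class
    raise AssertionError("unreachable: the last class is unbounded")
```

Under these assumptions, `classify(650e6)` returns the medium class with rungs `(128, 256)` and regime `("S2",)`. The table only restates the section; it does not measure anything about a real model.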

6. Scaling‑Law Alignment

PLM scaling follows predictable patterns:

embedding quality improves with dimensional expansion
variance increases with model size
coherence surfaces expand smoothly in S₁, sharply in S₂, and fragment in S₃
projection stability decreases as dimensionality increases

The substrate provides a structured way to interpret these patterns.

7. Projection Behavior Under Scaling

Projection into triadic cores must remain:

invertible
primitive‑aligned
regime‑aware
invariant‑preserving

Scaling affects projection as follows:

64D → 9D: stable
128D–256D → 9D: transitional
512D–1024D → 9D: sensitive, drift‑prone

Projection stability is a key indicator of scaling health.
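The text does not define how invertibility is measured. A linear map from a higher dimension down to 9D cannot be exactly inverted for arbitrary inputs, so one possible proxy is round-trip error: project a vector down, map it back up, and compare. The sketch below assumes a projection matrix with orthonormal rows, so that its transpose serves as the back-projection. The function names are hypothetical and the toy example uses 4D to 2D to stay readable.

```python
import math


def project(P, x):
    """Apply a k x d projection matrix P (a list of rows) to a d-vector x."""
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]


def reconstruct(P, z):
    """Map a k-vector back to d dimensions using the transpose of P.

    The transpose equals the pseudo-inverse only when the rows of P are
    orthonormal, which this sketch assumes.
    """
    d = len(P[0])
    return [sum(P[i][j] * z[i] for i in range(len(P))) for j in range(d)]


def round_trip_error(P, x):
    """Relative L2 error of x after projecting down and mapping back."""
    x_hat = reconstruct(P, project(P, x))
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))
    den = math.sqrt(sum(a * a for a in x))
    return num / den if den else 0.0


# Toy projection that keeps the first two of four coordinates.
P = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0]]
inside = round_trip_error(P, [3.0, 4.0, 0.0, 0.0])   # vector lies in the kept subspace
outside = round_trip_error(P, [0.0, 0.0, 3.0, 4.0])  # vector lies entirely outside it
```

An error near 0 means the vector survives the round trip; an error near 1 means the projection discarded almost all of it. Whether this proxy matches the framework's notion of projection stability is an open assumption.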

8. Scaling‑Driven Drift

Scaling can introduce drift through:

discontinuities in embedding‑space expansion
unstable regime transitions
fragmentation of coherence surfaces
loss of primitive‑level structure

vST validation layers (V₁–V₄) detect these failures.
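As an illustration of a drift signal across checkpoints, the sketch below compares embeddings of the same inputs at consecutive checkpoints by mean cosine distance. It does not implement the V₁–V₄ validation layers, which are not specified here. The function names and the 0.2 threshold are hypothetical, and the comparison assumes all checkpoints share one embedding basis (same architecture and width).

```python
import math


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_u * norm_v)


def checkpoint_drift(emb_a, emb_b):
    """Mean cosine distance between matching rows of two embedding sets."""
    if not emb_a or len(emb_a) != len(emb_b):
        raise ValueError("embedding sets must be non-empty and equal in length")
    total = sum(cosine_distance(u, v) for u, v in zip(emb_a, emb_b))
    return total / len(emb_a)


def flag_drift(checkpoints, threshold=0.2):
    """Return indices i where drift from checkpoint i-1 to i exceeds threshold.

    The default threshold is an arbitrary placeholder, not a calibrated value.
    """
    return [
        i
        for i in range(1, len(checkpoints))
        if checkpoint_drift(checkpoints[i - 1], checkpoints[i]) > threshold
    ]
```

Each checkpoint is a list of embedding vectors for a fixed set of inputs, in the same order. A sudden jump in the returned indices would correspond to the discontinuities listed above, but the threshold would need calibration against real checkpoints.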

9. Outputs of Scaling Behavior Analysis

Scaling analysis produces:

scaling‑regime classification (S₁, S₂, S₃)
embedding‑space expansion diagnostics
projection‑stability indicators
regime‑transition maps
drift‑detection signals
cross‑model comparison metrics

These outputs support reproducible, substrate‑aligned evaluation of PLM scaling.
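One way to keep these outputs together is a single report record per model. The sketch below is an assumed layout, not a format defined by the framework: `ScalingReport`, its field names, and the `needs_review` rule are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class ScalingRegime(Enum):
    S1 = "stable"
    S2 = "transitional"
    S3 = "dispersion"


@dataclass
class ScalingReport:
    """Container for the six outputs listed above (hypothetical layout)."""
    model_name: str
    regime: ScalingRegime                                      # scaling-regime classification
    expansion_diagnostics: dict = field(default_factory=dict)  # embedding-space expansion
    projection_stability: dict = field(default_factory=dict)   # projection-stability indicators
    regime_transitions: list = field(default_factory=list)     # regime-transition map entries
    drift_signals: list = field(default_factory=list)          # drift-detection signals
    comparison_metrics: dict = field(default_factory=dict)     # cross-model comparison metrics

    def needs_review(self) -> bool:
        """Flag reports in the dispersion regime or with any drift signal.

        This rule is an assumption made for illustration.
        """
        return self.regime is ScalingRegime.S3 or bool(self.drift_signals)


report = ScalingReport(model_name="toy-plm", regime=ScalingRegime.S1)
```

Keeping the regime as an enum rather than a free string makes cross-model comparison less error-prone, since every report draws from the same three values.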