Context
Problem Statement
We want to understand when a disentangled latent factorization of the form

p(y | x) = ∫ p(y | x, z) p(z | x) dz

is not only useful, but identifiable in the sense that latent values correspond to stable high-level strategies.
The central tension is:
- a latent can be predictive and useful without meaning the same thing across examples,
- but the research goal is for z to represent a strategy that is consistent enough to be interpreted, intervened on, and compared across inputs.
Why This Matters
The disentanglement notes already motivate a CVAE-style recovery procedure, but
they also expose the main obstacle: multiple predictive factorizations can
explain the same p(y|x). A model can therefore achieve good reconstruction and
still fail to recover semantically stable strategy variables.
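The non-uniqueness obstacle can be made concrete with a discrete toy example (the numbers below are illustrative, not taken from the notes): two latent splits, neither a relabeling of the other, that induce exactly the same marginal p(y|x).

```python
import numpy as np

# Toy setting: one fixed input x, latent z in {0, 1}, output y in {0, 1, 2}.
# Both models factorize as p(y|x) = sum_z p(y|x, z) * p(z|x).

# Factorization A.
p_z_A = np.array([0.5, 0.5])                  # p(z|x) under model A
p_y_given_z_A = np.array([[0.7, 0.2, 0.1],    # p(y|x, z=0)
                          [0.1, 0.2, 0.7]])   # p(y|x, z=1)

# Factorization B: a different latent split, not a permutation of A's.
p_z_B = np.array([0.25, 0.75])
p_y_given_z_B = np.array([[1.0, 0.0, 0.0],
                          [0.2, 4 / 15, 8 / 15]])

marginal_A = p_z_A @ p_y_given_z_A
marginal_B = p_z_B @ p_y_given_z_B

# Both factorizations explain the same observable distribution.
assert np.allclose(marginal_A, marginal_B)    # p(y|x) = [0.4, 0.2, 0.4]
```

Since only p(y|x) is observed, reconstruction loss alone cannot distinguish A from B; any preference for one latent split must come from additional assumptions or constraints.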
This makes identifiability a distinct question from optimization success. Learnability asks whether training can find a good factorization; identifiability asks whether the target factorization is uniquely recoverable, up to the expected symmetries.
Relevant Context
- methodology/cvae-disentangling
- methodology/concepts/token_weighted_excess_reconstruction
- methodology/cvae-objective-and-losses
- ideation/concepts/abstract-strategy-vs-solution
Working Hypothesis
The useful notion may not be a single identifiability definition. It may instead be a hierarchy:
- task-wise identifiability for a fixed x,
- problem-wise identifiability across a task family,
- globally consistent identifiability across many x.
The theory thread should clarify which of these is attainable under the current modeling assumptions and which metrics can detect the difference.
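As a starting point for such metrics, a crude probe of the globally consistent level might look like the following sketch. Everything here is an assumption for illustration: the function name, the availability of ground-truth strategy labels, and the nearest-centroid criterion.

```python
import numpy as np

def cross_input_consistency(z: np.ndarray, strategy_labels: np.ndarray) -> float:
    """Crude probe of globally consistent identifiability.

    Given inferred latents z (one row per input) and hypothetical ground-truth
    strategy labels, check whether latents for the same strategy cluster
    together across inputs: return the fraction of latents whose nearest
    strategy centroid is their own strategy's centroid (1.0 = fully consistent).
    """
    labels = np.unique(strategy_labels)
    # Mean latent per strategy across all inputs.
    centroids = np.stack([z[strategy_labels == s].mean(axis=0) for s in labels])
    # Distance from every latent to every strategy centroid.
    dists = np.linalg.norm(z[:, None, :] - centroids[None, :, :], axis=-1)
    nearest = labels[dists.argmin(axis=1)]
    return float((nearest == strategy_labels).mean())
```

A score near 1 for held-out inputs would suggest the same latent region encodes the same strategy everywhere; a score near chance would indicate that z is at best task-wise identifiable even if reconstruction is good.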