Context
Problem Statement
We want to understand when a disentangled latent factorization of the form

p(y | x) = ∫ p(y | x, z) p(z | x) dz

is not only useful, but identifiable in the sense that latent values correspond to stable high-level strategies.
The central tension is:
- a latent can be predictive and useful without meaning the same thing across examples,
- but the research goal is for z to represent a strategy that is consistent enough to be interpreted, intervened on, and compared across inputs.
Why This Matters
The disentanglement notes already motivate a CVAE-style recovery procedure, but
they also expose the main obstacle: multiple predictive factorizations can
explain the same p(y|x). A model can therefore achieve good reconstruction and
still fail to recover semantically stable strategy variables.
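The non-uniqueness obstacle can be made concrete with a discrete toy example (the numbers below are illustrative, not taken from the notes): two latent splits, neither a relabeling of the other, that induce exactly the same marginal p(y|x).

```python
import numpy as np

# Toy setting: one fixed input x, latent z in {0, 1}, output y in {0, 1, 2}.
# Both models factorize as p(y|x) = sum_z p(y|x, z) * p(z|x).

# Factorization A.
p_z_A = np.array([0.5, 0.5])                  # p(z|x) under model A
p_y_given_z_A = np.array([[0.7, 0.2, 0.1],    # p(y|x, z=0)
                          [0.1, 0.2, 0.7]])   # p(y|x, z=1)

# Factorization B: a different latent split, not a permutation of A's.
p_z_B = np.array([0.25, 0.75])
p_y_given_z_B = np.array([[1.0, 0.0, 0.0],
                          [0.2, 4 / 15, 8 / 15]])

marginal_A = p_z_A @ p_y_given_z_A
marginal_B = p_z_B @ p_y_given_z_B

# Both factorizations explain the same observable distribution.
assert np.allclose(marginal_A, marginal_B)    # p(y|x) = [0.4, 0.2, 0.4]
```

Since only p(y|x) is observed, reconstruction loss alone cannot distinguish A from B; any preference for one latent split must come from additional assumptions or constraints.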
This makes identifiability a distinct question from optimization success. Learnability asks whether training can find a good factorization; identifiability asks whether the target factorization is uniquely recoverable, up to the expected symmetries.
Relevant Context
- methodology/cvae-disentangling
- methodology/concepts/token_weighted_excess_reconstruction
- methodology/cvae-objective-and-losses
- ideation/concepts/abstract-strategy-vs-solution
Working Hypothesis
The useful notion may not be a single identifiability definition. It may instead be a hierarchy:
- task-wise identifiability for a fixed x,
- problem-wise identifiability across a task family,
- globally consistent identifiability across many x.
The theory thread should clarify which of these is attainable under the current modeling assumptions and which metrics can detect the difference.
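As a starting point for such metrics, a crude probe of the globally consistent level might look like the following sketch. Everything here is an assumption for illustration: the function name, the availability of ground-truth strategy labels, and the nearest-centroid criterion.

```python
import numpy as np

def cross_input_consistency(z: np.ndarray, strategy_labels: np.ndarray) -> float:
    """Crude probe of globally consistent identifiability.

    Given inferred latents z (one row per input) and hypothetical ground-truth
    strategy labels, check whether latents for the same strategy cluster
    together across inputs: return the fraction of latents whose nearest
    strategy centroid is their own strategy's centroid (1.0 = fully consistent).
    """
    labels = np.unique(strategy_labels)
    # Mean latent per strategy across all inputs.
    centroids = np.stack([z[strategy_labels == s].mean(axis=0) for s in labels])
    # Distance from every latent to every strategy centroid.
    dists = np.linalg.norm(z[:, None, :] - centroids[None, :, :], axis=-1)
    nearest = labels[dists.argmin(axis=1)]
    return float((nearest == strategy_labels).mean())
```

A score near 1 for held-out inputs would suggest the same latent region encodes the same strategy everywhere; a score near chance would indicate that z is at best task-wise identifiable even if reconstruction is good.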