Identifiability and Learnability of Disentangled Latent Strategies

This workspace is the persistent scratchpad for the theory thread on when a disentangled latent-strategy factorization is actually identifiable or learnable.

The key issue is not just whether a latent z helps predict y, but whether the same z can be interpreted as the same high-level strategy across examples. That distinction is central to deciding whether the model has learned strategy semantics or only a locally useful predictive partition.
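
One hedged way to make that distinction concrete is sketched below. The notation is assumed for the sketch only and is not fixed anywhere in this workspace: tasks t with data distribution P_t, input x, learned latent z = q(x), ground-truth strategy s.

```latex
% Sketch under assumed notation: tasks t, inputs x ~ P_t, learned latent z = q(x),
% ground-truth strategy s. Illustrative only, not a definition fixed elsewhere here.
\[
\text{task-wise recovery:}\quad
  \forall t\;\exists h_t:\ s = h_t\big(q(x)\big)\ \text{a.s. for } x \sim P_t,
\]
\[
\text{globally consistent recovery:}\quad
  \exists h\;\forall t:\ s = h\big(q(x)\big)\ \text{a.s. for } x \sim P_t.
\]
% The quantifier order is the whole difference: swapping the exists and forall
% is what separates a locally useful predictive partition from shared strategy semantics.
```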

Current Framing

This thread is related to the disentanglement methodology notes and to the active exploration of why latent usefulness does not automatically imply global strategy semantics.

Relevant context:

Update Rules

  • Keep the core question explicit: learnability versus identifiability.
  • Separate task-wise recovery from globally consistent recovery.
  • Prefer definitions that can later be tied to empirical diagnostics in the synthetic exp2 setting (see the toy diagnostic sketch after this list).
  • Put theorem-level ideas in 04-proof-sketches.md only when they clarify the definitions, not as a substitute for defining the problem.
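
The following is a minimal, hypothetical sketch of such a diagnostic. The synthetic data, the `make_task` and `rotation_matrix` helpers, and the logistic-regression probe are all placeholders chosen for illustration, not the actual exp2 setup: fitting one strategy decoder per task probes task-wise recovery, while transferring a single decoder across tasks probes globally consistent recovery.

```python
# Hedged sketch of a diagnostic separating task-wise from globally consistent recovery.
# The synthetic setup below is a stand-in, not the exp2 data or code.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_task(n, n_strategies, rot):
    """Hypothetical task: the true strategy s sets the mean of a 2-D latent z,
    but each task applies its own rotation to z (a task-specific nuisance)."""
    s = rng.integers(0, n_strategies, size=n)
    means = np.stack([np.cos(2 * np.pi * s / n_strategies),
                      np.sin(2 * np.pi * s / n_strategies)], axis=1)
    z = means + 0.1 * rng.standard_normal((n, 2))
    return z @ rot.T, s

def rotation_matrix(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

n_strategies = 4
tasks = [make_task(500, n_strategies, rotation_matrix(theta))
         for theta in (0.0, 1.2, 2.4)]

# Task-wise recovery: a separate decoder h_t per task can read s off z.
per_task_acc = []
for z, s in tasks:
    clf = LogisticRegression(max_iter=1000).fit(z, s)
    per_task_acc.append(clf.score(z, s))

# Globally consistent recovery: one decoder h, fit on one task, has to transfer.
z0, s0 = tasks[0]
global_clf = LogisticRegression(max_iter=1000).fit(z0, s0)
transfer_acc = [global_clf.score(z, s) for z, s in tasks[1:]]

print("per-task accuracy:", np.round(per_task_acc, 3))   # high: z is locally useful
print("transfer accuracy:", np.round(transfer_acc, 3))   # low: no shared strategy semantics
```

The signature the diagnostic looks for is high per-task accuracy combined with low transfer accuracy; in the real exp2 setting the decoder family and the notion of "same strategy" would come from the definitions above rather than this toy construction.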

