Toy Models
The goal is not a full theorem suite yet. The goal is to isolate the smallest settings where the identifiability distinction becomes obvious.
Two-Strategy Mixture
Let z in {z_1, z_2} denote two strategies for a fixed task x.
Questions:
- When is the decomposition recoverable up to swapping the two labels?
- When can a predictive latent still fail to correspond to the intended strategy split?
Useful abstraction:
- task-specific success depends on which strategy is used,
- but the same latent label might mean something different for different x unless an additional consistency condition is imposed (see the sketch below).
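A minimal sketch of the recoverable case, assuming the two strategies are observable only through a D-dimensional binary success profile (the whole setup, including names like `true_theta`, is hypothetical, not part of the model above). EM on a two-component product-Bernoulli mixture recovers the decomposition, but only up to swapping the two labels:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: two strategies, each inducing a distinct
# success profile over D observable checks for the same task x.
D, N = 8, 2000
true_pi = np.array([0.4, 0.6])                    # strategy mixture weights
true_theta = rng.uniform(0.1, 0.9, size=(2, D))   # per-strategy success rates

z = rng.choice(2, size=N, p=true_pi)              # latent strategy per sample
X = (rng.random((N, D)) < true_theta[z]).astype(float)

# EM for a 2-component product-Bernoulli mixture.
pi = np.array([0.5, 0.5])
theta = rng.uniform(0.3, 0.7, size=(2, D))
for _ in range(200):
    # E-step: posterior responsibility of each strategy label.
    log_post = (X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
                + np.log(pi))
    log_post -= log_post.max(axis=1, keepdims=True)
    resp = np.exp(log_post)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted parameter estimates.
    pi = resp.mean(axis=0)
    theta = (resp.T @ X) / resp.sum(axis=0)[:, None]
    theta = theta.clip(1e-6, 1 - 1e-6)

# Recovery holds only up to swapping the two labels: compare both
# permutations and keep the better one.
err_id = np.abs(theta - true_theta).max()
err_swap = np.abs(theta[::-1] - true_theta).max()
print("param error (best of identity/swap):", min(err_id, err_swap))
```

The final comparison against both permutations is the point: the likelihood is invariant under the swap, so nothing in the data distinguishes the two labelings. That swap is exactly the ambiguity the additional consistency condition would have to resolve.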
Sparse Multi-Strategy Family
Let each x admit only a subset of strategies as meaningfully distinct.
Questions:
- Can the model identify the active strategy subset for each x?
- Can it align strategy labels across problems that share the same strategy family?
- What happens when the same z is reused for different tasks with different semantics?
This model should help separate task-wise identifiability from global consistency.
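A sketch of the subset-recovery half of that separation, under the assumption that strategies behave like well-separated clusters in some behavior space (the 2-D Gaussian geometry and the `tasks` dictionary are illustrative choices, not part of the model). Per-task BIC model selection can recover the size of the active subset while saying nothing about which global labels the recovered components carry:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)

# Hypothetical global family of 4 strategies, each a well-separated
# point in a 2-D "behavior" space; every task activates only a subset.
means = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0], [6.0, 6.0]])
tasks = {"A": [0, 1], "B": [1, 2, 3], "C": [0, 3]}   # active subsets

for name, active in tasks.items():
    z = rng.choice(active, size=300)                 # latent strategy per sample
    X = means[z] + rng.normal(0.0, 0.5, size=(300, 2))
    # Task-wise subset identification: choose the component count by BIC.
    bic = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
           for k in range(1, 5)}
    k_hat = min(bic, key=bic.get)
    # The fitted component indices are arbitrary per task: k_hat says how
    # many strategies are active, not which global labels they carry.
    print(f"task {name}: true subset size {len(active)}, estimated {k_hat}")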
Informative-Prefix / Teacher-Forcing Abstraction
Use a simplified sequence model where only an early prefix is strategy-sensitive and the remainder becomes nearly deterministic once the prefix is observed.
Questions:
- Does local usefulness of z only identify the prefix-level split?
- Does that imply anything about global semantics across examples?
- Can a latent be identifiable on the informative prefix while still being globally inconsistent? (See the sketch below.)
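A numerical sketch of the crudest version of this abstraction: z fixes the distribution of a short prefix, and the suffix is a deterministic function of the prefix (its parity repeated, in this hypothetical construction). A plug-in majority-vote predictor then shows that all information about z lives at the prefix level:

```python
import numpy as np
from collections import Counter, defaultdict

rng = np.random.default_rng(2)

# Hypothetical toy sequence: latent z in {0, 1} only shapes the first
# P tokens; the suffix is deterministic given the prefix (its parity,
# repeated), so it carries nothing about z beyond the prefix.
N, P, T = 5000, 3, 12
z = rng.choice(2, size=N)
p_one = np.where(z[:, None] == 0, 0.2, 0.8)
prefix = (rng.random((N, P)) < p_one).astype(int)
suffix = np.tile((prefix.sum(axis=1) % 2)[:, None], (1, T - P))

def majority_acc(features, labels):
    """Plug-in Bayes accuracy: majority label within each distinct pattern."""
    groups = defaultdict(list)
    for row, y in zip(map(tuple, features), labels):
        groups[row].append(y)
    correct = sum(max(Counter(v).values()) for v in groups.values())
    return correct / len(labels)

print("z from prefix:       ", majority_acc(prefix, z))  # high, ~0.9
print("z from suffix alone: ", majority_acc(suffix, z))  # weak, ~0.6
print("z from full sequence:", majority_acc(np.hstack([prefix, suffix]), z))  # equals prefix
```

The full-sequence accuracy matches the prefix accuracy exactly, so identifiability on the informative prefix exhausts the local signal; nothing in this construction constrains what z means on other examples.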
Cross-Task Permutation Ambiguity
Construct examples where two tasks each have identifiable latent decompositions, but the latent labels are permuted differently across tasks.
This is the cleanest way to show:
- local/task-wise identifiability can hold,
- while globally consistent identifiability fails (see the sketch below).
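A minimal numerical version of this construction (all constants hypothetical): two tasks, each a well-separated symmetric two-component mixture, so each task's decomposition is task-wise identifiable; yet a labeling that swaps the components in task B only achieves exactly the same likelihood:

```python
import numpy as np

rng = np.random.default_rng(3)

def loglik(x, mus, pi=0.5, sigma=0.5):
    # Mixture log-likelihood of 1-D data under a given labeling of the
    # two Gaussian components (component 0 has mean mus[0], etc.).
    comp = (np.exp(-0.5 * ((x[:, None] - mus) / sigma) ** 2)
            / (sigma * np.sqrt(2 * np.pi)))
    return np.log((comp * np.array([pi, 1 - pi])).sum(axis=1)).sum()

# Two tasks, each a cleanly identifiable two-mode mixture.
x_a = np.concatenate([rng.normal(-3, 0.5, 200), rng.normal(3, 0.5, 200)])
x_b = np.concatenate([rng.normal(-3, 0.5, 200), rng.normal(3, 0.5, 200)])

# Labeling 1: z = 0 means "left mode" in both tasks.
# Labeling 2: z = 0 means "left mode" in task A but "right mode" in task B.
ll_consistent = loglik(x_a, np.array([-3.0, 3.0])) + loglik(x_b, np.array([-3.0, 3.0]))
ll_permuted = loglik(x_a, np.array([-3.0, 3.0])) + loglik(x_b, np.array([3.0, -3.0]))
print(ll_consistent, ll_permuted)  # identical: the data cannot tell them apart
```

Because the two labelings are observationally equivalent, any estimator can at best recover a per-task permutation; with T tasks there are 2^T equally good global labelings unless some cross-task tie is imposed.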