continuous_cvae-Apr1-7loss_20260401


Summary

This report includes figures associated with the continuous_cvae-Apr1-7loss_20260401 experiment collection.

Planned design: 6 tasks x 7 loss settings x 3 adaptation modes x 1 seed = 126 jobs.

Observed completion:

completed runs: 120/126
missing runs: 6
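The planned grid and the coverage count can be reproduced with a short sketch. The task names, adaptation-mode names, and seed value below are hypothetical placeholders (the report states only their counts); the loss settings are the seven named in this report.

```python
from itertools import product

# Hypothetical placeholder names: the report gives only the counts.
tasks = [f"task_{i}" for i in range(1, 7)]             # 6 tasks
adaptation_modes = [f"mode_{i}" for i in range(1, 4)]  # 3 adaptation modes
seeds = [0]                                            # 1 seed (value assumed)

# Loss settings as named in this report.
loss_settings = [
    "beta0p1_norm_pv0_twr0",
    "beta0p1_base_pv0_twr0",
    "beta0_norm_pv0_twr0",
    "beta1_norm_pv0_twr0",
    "beta0p1_norm_pv0p1_twr0",
    "beta0p1_norm_pv0_twr1",
    "beta0p1_norm_pv0p1_twr1",
]

# Full factorial design: one job per (task, loss, mode, seed) combination.
planned = set(product(tasks, loss_settings, adaptation_modes, seeds))


def coverage(planned, completed):
    """Return (number of planned jobs completed, sorted list of missing jobs)."""
    missing = sorted(planned - set(completed))
    return len(planned) - len(missing), missing


print(len(planned))  # 6 * 7 * 3 * 1 = 126
```

Passing the set of finished job keys as `completed` yields the completed count and the list of missing combinations in one call.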

Important Caveat. In this Apr1 collection, token-weighted reconstruction was implemented incorrectly: the runs used uniform token weights instead of weights from the reference model's token-level NLL. The twr1 settings therefore should not be interpreted as true token-weighted reconstruction and should be deprioritized when drawing mechanism conclusions.
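To illustrate why the two weightings differ, here is a minimal sketch of a weighted reconstruction loss. It assumes one plausible form of token weighting (weights proportional to the reference model's per-token NLL, normalized to sum to one); the report does not specify the exact formula, and the function name and numbers are hypothetical.

```python
def weighted_reconstruction_loss(token_losses, weights):
    """Weighted mean of per-token losses, with weights normalized to sum to 1."""
    if len(token_losses) != len(weights):
        raise ValueError("token_losses and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(w * l for w, l in zip(weights, token_losses)) / total


# Toy per-token reconstruction losses and reference-model NLLs (made-up values).
token_losses = [0.5, 2.0, 0.5]
reference_nll = [0.1, 3.0, 0.1]

# Uniform weights reduce to the plain mean (the behavior described in the caveat).
uniform = weighted_reconstruction_loss(token_losses, [1.0] * len(token_losses))

# Assumed intended behavior: tokens the reference model finds hard get more weight.
weighted = weighted_reconstruction_loss(token_losses, reference_nll)

print(uniform, weighted)
```

Under this assumed form, uniform weights make the twr term an ordinary average of the token losses, so it carries no extra signal about which tokens the reference model finds hard.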

Loss Settings Legend

The plots use compact loss tags. This legend maps full setting names to plot tags.

beta0p1_norm_pv0_twr0 -> b0p1_norm_pv0_twr0: beta 0.1, normalized reconstruction, posterior variance weight 0.0, token-weighted reconstruction weight 0.0.
beta0p1_base_pv0_twr0 -> b0p1_base_pv0_twr0: beta 0.1, unnormalized reconstruction, posterior variance weight 0.0, token-weighted reconstruction weight 0.0.
beta0_norm_pv0_twr0 -> b0_norm_pv0_twr0: beta 0.0, normalized reconstruction, posterior variance weight 0.0, token-weighted reconstruction weight 0.0.
beta1_norm_pv0_twr0 -> b1_norm_pv0_twr0: beta 1.0, normalized reconstruction, posterior variance weight 0.0, token-weighted reconstruction weight 0.0.
beta0p1_norm_pv0p1_twr0 -> b0p1_norm_pv0p1_twr0: beta 0.1, normalized reconstruction, posterior variance weight 0.1, token-weighted reconstruction weight 0.0.
beta0p1_norm_pv0_twr1 -> b0p1_norm_pv0_twr1: beta 0.1, normalized reconstruction, posterior variance weight 0.0, token-weighted reconstruction weight 1.0 (interpret with caveat above).
beta0p1_norm_pv0p1_twr1 -> b0p1_norm_pv0p1_twr1: beta 0.1, normalized reconstruction, posterior variance weight 0.1, token-weighted reconstruction weight 1.0 (interpret with caveat above).
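The naming scheme in the legend is regular enough to parse mechanically. The sketch below assumes that `p` stands for a decimal point and that `norm`/`base` mark normalized and unnormalized reconstruction, as the legend entries suggest; the function and field names are hypothetical.

```python
import re

# beta<value>_<norm|base>_pv<value>_twr<value>, with "p" used as the decimal point.
_PATTERN = re.compile(
    r"^beta(?P<beta>[0-9p]+)_(?P<recon>norm|base)"
    r"_pv(?P<pv>[0-9p]+)_twr(?P<twr>[0-9p]+)$"
)


def _to_float(text):
    """Convert an encoded number such as '0p1' to 0.1."""
    return float(text.replace("p", "."))


def parse_loss_setting(name):
    """Decode a full loss-setting name into its hyperparameters."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized loss setting: {name!r}")
    return {
        "beta": _to_float(match["beta"]),
        "normalized_reconstruction": match["recon"] == "norm",
        "posterior_variance_weight": _to_float(match["pv"]),
        "token_weighted_reconstruction_weight": _to_float(match["twr"]),
    }


def plot_tag(name):
    """Shorten the leading 'beta' to 'b' to obtain the compact plot tag."""
    parse_loss_setting(name)  # validate the name before shortening it
    return "b" + name[len("beta"):]


print(plot_tag("beta0p1_norm_pv0p1_twr1"))
print(parse_loss_setting("beta0p1_norm_pv0p1_twr1"))
```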

Key Metrics

Final Answer Accuracy: Rate at which a generated response $y$, sampled via $z \sim p(z | x)$ and $y \sim p(y | x, z)$, is "correct" in the sense of matching some ground truth strategy.
Router Probe Accuracy: Validation accuracy of a linear probe trained to predict the strategy of the generated output $y \sim p(y | x, z)$ from the router-sampled latent $z \sim p(z | x)$. This measures how much strategy information the router's latent space encodes in a linearly decodable way.
- Posterior probe accuracy is the analogous metric with $z$ sampled from the posterior $q(z | x, y)$ instead of the router prior $p(z | x)$; it likewise measures whether the latent linearly encodes strategy information.
Analogical Agreement Rate: Sample a pair of inputs $(x_1, x_2)$, then sample a latent variable $z \sim p(z | x_1)$ for the first input via the router. Generate $y_1 \sim p(y | x_1, z)$ and $y_2 \sim p(y | x_2, z)$. The pair is an "analogical agreement" if $y_1$ and $y_2$ share the same strategy. The analogical agreement rate is the proportion of sampled pairs that are analogical agreements. This represents the degree to which the router's latent space captures strategy in a way that generalizes across inputs and enables transfer.
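The analogical agreement procedure can be sketched as a Monte Carlo estimate. Everything here is a hypothetical stand-in: `sample_z`, `generate`, and `strategy_of` represent the router, the decoder, and a strategy labeler, and drawing two distinct inputs per pair is an assumption the definition does not state.

```python
import random


def analogical_agreement_rate(inputs, sample_z, generate, strategy_of,
                              n_pairs, seed=0):
    """Estimate the fraction of input pairs whose outputs share a strategy
    when both are generated from a latent sampled for the first input."""
    rng = random.Random(seed)
    agreements = 0
    for _ in range(n_pairs):
        x1, x2 = rng.sample(inputs, 2)   # assumed: two distinct inputs
        z = sample_z(x1, rng)            # z ~ p(z | x1), router only sees x1
        y1 = generate(x1, z, rng)        # y1 ~ p(y | x1, z)
        y2 = generate(x2, z, rng)        # y2 ~ p(y | x2, z), same z reused
        agreements += strategy_of(y1) == strategy_of(y2)
    return agreements / n_pairs


# Toy stand-ins: outputs look like "<strategy>:<input>".
inputs = ["q1", "q2", "q3", "q4"]
toy_router = lambda x, rng: rng.choice(["A", "B"])
strategy_of = lambda y: y.split(":")[0]

# Decoder whose strategy is fully determined by z: every pair agrees.
z_driven = lambda x, z, rng: f"{z}:{x}"
# Decoder that ignores z and uses an input-specific strategy: no pair agrees.
x_driven = lambda x, z, rng: f"{x}:{x}"

rate_z = analogical_agreement_rate(inputs, toy_router, z_driven, strategy_of, 200)
rate_x = analogical_agreement_rate(inputs, toy_router, x_driven, strategy_of, 200)
print(rate_z, rate_x)
```

The two toy decoders mark the extremes: a latent that fully controls strategy gives a rate of 1, while a decoder that ignores the latent gives a rate set only by how often two inputs happen to share a strategy (0 in this toy setup).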

Key Figures

Caption: Coverage heatmap across task, loss setting, and adaptation mode. Most cells are complete, with a small number of missing combinations.

Caption: Final answer accuracy heatmap by task and loss setting, split by adaptation mode. Accuracy is broadly high with a few localized low settings.

Caption: Router linear probe accuracy (linear predictability of generated-output strategy from router-sampled latents) heatmap by task and loss setting, split by adaptation mode. Probe quality varies more strongly than final accuracy, and the balance between loss terms appears important.

Caption: Task-level router probe distribution on the left and best configuration per task on the right. No task is uniformly low under all configurations.

Caption: Router probe versus posterior probe, faceted by adaptation mode with a $y = x$ reference line. Posterior probe is often at or above router probe.

Caption: Normalization contrast plot for paired comparisons, showing the effect of normalizing the reconstruction loss by the entangled baseline loss. Normalization appears to improve router probe accuracy.

Caption: Analogical agreement rate (pair_set_overlap_rate) heatmap by task and loss setting, split by adaptation mode. Interpret twr1 rows with caution due to the token-weighting caveat above.

Caption: Router probe versus analogical agreement, faceted by task and adaptation mode. The association is strongly positive overall.