CVAE Parameterization
This note isolates the parameterization choices that are methodology-relevant rather than loss-specific.
General Parameterization
Let $\theta$ denote the base model parameters. In the most general case, the three components have separate parameter sets:

$$\Theta = \{\theta_{\mathrm{route}},\; \theta_{\mathrm{gen}},\; \phi\}.$$

Conceptually:
- $\theta_{\mathrm{route}}$ parameterizes the strategy router $\pi_{\theta_{\mathrm{route}}}(z \mid x)$,
- $\theta_{\mathrm{gen}}$ parameterizes the strategy-conditioned decoder $p_{\theta_{\mathrm{gen}}}(y \mid x, z)$,
- $\phi$ parameterizes the inference network $q_{\phi}(z \mid x, y)$.
For small models, one valid implementation is to update all parameters of each component directly.
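As a deliberately toy sketch of the three components with fully separate, directly updated parameter sets — the shapes, the linear heads, and the strategy-embedding conditioning are illustrative assumptions, not the method's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
d, K = 16, 4  # hidden size and number of discrete strategies (illustrative)

# Three disjoint parameter sets, each updated directly (full fine-tuning).
theta_route = {"W": rng.normal(size=(K, d))}           # strategy router head
theta_gen   = {"W": rng.normal(size=(d, d)),           # strategy-conditioned decoder
               "E_z": rng.normal(size=(K, d))}         # strategy embeddings
phi         = {"W": rng.normal(size=(K, d))}           # inference network head

def softmax(u):
    e = np.exp(u - u.max())
    return e / e.sum()

def router(h_x):
    # pi_{theta_route}(z | x): distribution over discrete strategies
    return softmax(theta_route["W"] @ h_x)

def decoder(h_x, z):
    # p_{theta_gen}(y | x, z): here, conditioning = adding a strategy embedding
    return theta_gen["W"] @ (h_x + theta_gen["E_z"][z])

def inference(h_xy):
    # q_phi(z | x, y): approximate posterior over strategies
    return softmax(phi["W"] @ h_xy)

h = rng.normal(size=d)
print(router(h).shape, inference(h).shape, decoder(h, 2).shape)
```

The point of the sketch is only the parameter separation: gradients from each objective term flow to exactly one of `theta_route`, `theta_gen`, or `phi`.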
LoRA Parameterization
When parameter-efficient adaptation is preferred, each component may be written as a LoRA update of the frozen base backbone: for each adapted weight matrix $W$,

$$W' = W + \frac{\alpha}{r} B A, \qquad B \in \mathbb{R}^{d_{\mathrm{out}} \times r},\; A \in \mathbb{R}^{r \times d_{\mathrm{in}}}.$$

Trainable parameters are then the low-rank adapters and small heads:
- $\{A, B\}_{\mathrm{route}}$ and the router head,
- $\{A, B\}_{\mathrm{gen}}$ and the decoder conditioning components,
- $\{A, B\}_{\mathrm{inf}}$ and the inference head.
Backbone weights in $\theta$ remain frozen.
Rank $r$, scale $\alpha$, dropout, and target modules are implementation knobs rather than method-defining commitments.
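A minimal numpy sketch of a single LoRA-adapted weight matrix, assuming the standard initialization from the LoRA paper (Gaussian $A$, zero $B$, so the adapter starts as a no-op); the dimensions are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
d_out, d_in, r, alpha = 8, 8, 2, 16   # rank r and scale alpha are tuning knobs

W0 = rng.normal(size=(d_out, d_in))   # frozen backbone weight: never updated
A = rng.normal(size=(r, d_in))        # trainable low-rank factor (Gaussian init)
B = np.zeros((d_out, r))              # trainable low-rank factor (zero init),
                                      # so B @ A = 0 and training starts from W0

def lora_forward(x):
    # Effective weight: W0 + (alpha / r) * B @ A; only A and B receive gradients.
    return W0 @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# At initialization the adapted output equals the frozen backbone output.
assert np.allclose(lora_forward(x), W0 @ x)
```

In practice one would apply this per target module (e.g. attention projections) via a library such as PEFT, with one adapter set per component; the scalar forward pass above is just the parameterization in miniature.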
Unified Shared-Parameter Variant
For discrete latent strategies, one alternative is to share one parameter set between the router and the strategy-conditioned generator, while keeping a separate inference network.
Shared parameters: $\theta_{\mathrm{route}} = \theta_{\mathrm{gen}} = \theta_{\mathrm{shared}}$.
Separate inference network: $q_{\phi}(z \mid x, y)$, with $\phi$ disjoint from $\theta_{\mathrm{shared}}$.
Under this parameterization, the same causal LM can expose both a routing interface (a distribution over discrete strategies) and a generation interface (decoding conditioned on the selected strategy).
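The shared-parameter variant can be sketched as one parameter set backing two interfaces, with a separate inference network on the side; the tanh backbone, head shapes, and embedding-based conditioning are hypothetical stand-ins for the actual LM:

```python
import numpy as np

rng = np.random.default_rng(2)
d, K, V = 16, 4, 32   # hidden size, strategies, vocab size (illustrative)

theta_shared = {                        # one parameter set: router + generator
    "backbone": rng.normal(size=(d, d)),
    "route_head": rng.normal(size=(K, d)),
    "lm_head": rng.normal(size=(V, d)),
    "E_z": rng.normal(size=(K, d)),     # strategy embeddings for conditioning
}
phi = {"W": rng.normal(size=(K, d))}    # separate inference network

def softmax(u):
    e = np.exp(u - u.max())
    return e / e.sum()

def route(x_h):
    # Routing interface: pi(z | x) from the shared backbone
    h = np.tanh(theta_shared["backbone"] @ x_h)
    return softmax(theta_shared["route_head"] @ h)

def generate_logits(x_h, z):
    # Generation interface: p(y | x, z) from the SAME backbone, conditioned on z
    h = np.tanh(theta_shared["backbone"] @ (x_h + theta_shared["E_z"][z]))
    return theta_shared["lm_head"] @ h

def infer(xy_h):
    # Inference network q_phi(z | x, y): parameters disjoint from theta_shared
    return softmax(phi["W"] @ xy_h)

x_h = rng.normal(size=d)
z = int(np.argmax(route(x_h)))
logits = generate_logits(x_h, z)
print(z, logits.shape, infer(x_h).shape)
```

Both `route` and `generate_logits` read (and, in training, would write) the same `theta_shared`, while `infer` touches only `phi` — which is exactly the split the unified variant commits to.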