
Feature Request: Implement trainable probability vectors for mixture distributions

Open prasanthcakewalk opened this issue 4 years ago • 1 comment

Let's say we have a mixture model of the form P(x) = w_1 P_1(x) + w_2 P_2(x) + ..., where the weights w_i sum to 1.

Right now,

  • We can create trainable distributions P1, P2, ..., e.g. using bijector-based networks.
  • We can combine them into a mixture easily using
import tensorflow_probability as tfp
tfd = tfp.distributions

mixture_dist = tfd.Mixture(
    cat=tfd.Categorical(probs=[w1, w2, ...]),
    components=[P1, P2, ...]
)

and fit the model to data. Making the w_i trainable, however, is a bit complicated.

The way I got it to work was to create a custom model TrainableProbVector with one layer of trainable parameters. The model ignores its input and simply outputs the softmax of the parameters. But since Keras models cannot be built without inputs, it took some hacky coding to create a distribution that could both be trained and be used like a regular distribution after training.
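
For illustration, a minimal sketch of that idea (the layer name and initialization here are my own choices, not an official API):

import tensorflow as tf

# Sketch of the approach described above: a layer that ignores its
# input and returns the softmax of its own trainable parameters.
class TrainableProbVector(tf.keras.layers.Layer):
    def __init__(self, num_components, **kwargs):
        super().__init__(**kwargs)
        self.logits = self.add_weight(
            name="logits", shape=(num_components,),
            initializer="zeros", trainable=True)

    def call(self, inputs):
        # `inputs` is ignored; the output depends only on the logits.
        return tf.nn.softmax(self.logits)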

Being able to create layers and/or models which don't have inputs would make this easier. A solution specific to tfd.Categorical and/or tfd.Mixture would also be great.
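
For what it's worth, outside of a Keras model the weights can already be optimized directly, since TFP distributions track tf.Variable parameters. A minimal sketch (the component distributions here are arbitrary placeholders):

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

# A tf.Variable passed as the Categorical's logits is tracked by the
# distribution and shows up in mixture_dist.trainable_variables.
logits = tf.Variable(tf.zeros(2), name="mixture_logits")
mixture_dist = tfd.Mixture(
    cat=tfd.Categorical(logits=logits),
    components=[tfd.Normal(loc=-1., scale=1.),
                tfd.Normal(loc=1., scale=1.)],
)
print(mixture_dist.trainable_variables)  # contains `logits`

This doesn't address the Keras integration described above, but it does make the w_i trainable under a plain tf.GradientTape loop.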

Thanks!

prasanthcakewalk avatar Mar 02 '21 03:03 prasanthcakewalk

As a hack, you can create a constant input and pass it to a Dense layer with a softmax activation; the layer's output then serves as the trainable probability vector.
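
A minimal sketch of that workaround (sizes and names are illustrative):

import tensorflow as tf

# A bias-free Dense layer with softmax activation, fed a constant
# input, yields a trainable probability vector.
num_components = 3
inp = tf.keras.Input(shape=(1,))
probs = tf.keras.layers.Dense(num_components, activation="softmax",
                              use_bias=False)(inp)
model = tf.keras.Model(inputs=inp, outputs=probs)

# With an input of ones, the output depends only on the Dense kernel,
# so the probabilities train with the rest of the model.
w = model(tf.ones((1, 1)))  # shape (1, num_components); rows sum to 1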

Strategy24 avatar Feb 14 '24 17:02 Strategy24