
[FR] Adding Gaussian Process Latent Variable Model Examples

InfProbSciX opened this issue 4 years ago · 1 comment

Summary:

I'd like to add examples of GPLVMs to the docs. A classical GPLVM (as described in Lawrence, N. D., 2003, "Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data") should be quite easy to code up based on the existing GP examples in the Stan docs.
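
For comparison, here is a minimal sketch of what that classical full-rank model might look like, marginalising the GP function values so that each observed dimension is a draw from a multivariate normal whose kernel is built on the latent inputs. This is my own sketch (assuming, as in the sparse example below, an exponentiated-quadratic kernel with unit lengthscale), not an existing docs example:

data {
    int<lower=1> n; int<lower=1> d; int<lower=1> q;
    matrix[n, d] Y;
}
parameters {
    array[n] vector[q] X;
    real<lower=1e-6> noise_var;
    real<lower=1e-6> funcn_var;
}
model {
    // marginal GP covariance over the n latent inputs, plus observation noise
    matrix[n, n] K = gp_exp_quad_cov(X, sqrt(funcn_var), 1.0)
                     + diag_matrix(rep_vector(noise_var, n));
    matrix[n, n] L = cholesky_decompose(K);
    for (i in 1:q) { X[, i] ~ std_normal(); }
    // each output dimension is an independent draw from the same GP
    for (j in 1:d) { Y[, j] ~ multi_normal_cholesky(rep_vector(0.0, n), L); }
}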

Below is an example of a GPLVM with inducing points, using the collapsed bound in which the inducing variables are integrated out analytically. The lower bound is coded without explicit variational parameters, so that when the model is fit with mean-field VI the objective becomes the standard ELBO.
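
For context, the quantity the function below computes is (up to the prior on X) Titsias' collapsed sparse-GP bound, applied independently to each of the d output dimensions; eqn 3.25 of the Damianou thesis referenced in the code specialises, for a point estimate of X, to roughly

$$
\mathcal{F} = \log \mathcal{N}\!\left(\mathbf{y} \,\middle|\, \mathbf{0},\; \mathbf{K}_{fu}\mathbf{K}_{uu}^{-1}\mathbf{K}_{uf} + \sigma^2 \mathbf{I}\right) - \frac{1}{2\sigma^2}\,\operatorname{tr}\!\left(\mathbf{K}_{ff} - \mathbf{K}_{fu}\mathbf{K}_{uu}^{-1}\mathbf{K}_{uf}\right),
$$

where $\sigma^{-2}$ is `noise_inv`, $\operatorname{tr}(\mathbf{K}_{ff}) = n \cdot$ `k_diag`, and the determinant and quadratic-form terms in the code come from expanding the Gaussian with the matrix inversion and determinant lemmas.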

functions {
    real lower_bound(
            matrix K_fu,
            matrix K_uu,
            matrix Y,
            real noise_inv,
            real k_diag
        ) {
        int n = dims(Y)[1];
        int d = dims(Y)[2];
        int m = dims(K_uu)[1];

        real elbo;
        real eta;

        matrix[m, m] phi;
        matrix[n, m] psi;
        matrix[d, d] W;

        matrix[m, m] jit = diag_matrix(rep_vector(1.0, m)) * 1e-6;

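        // With a point estimate of X (no variational distribution over the
        // latents), the psi-statistics of the bound reduce to plain kernel
        // quantities: Psi_2 = K_fu' * K_fu, Psi_1 = K_fu, psi_0 = n * k_diag.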
        phi = K_fu' * K_fu;
        psi = K_fu;
        eta = n * k_diag;

        // Based on eqn 3.25 Damianou, Andreas (2015),
        // Deep Gaussian Processes and Variational Propagation of Uncertainty,
        // PhD thesis, University of Sheffield.

        W = (noise_inv^2) * mdivide_right_spd(Y' * psi, noise_inv*phi + K_uu + jit) * (Y' * psi)';

        elbo  = -0.5 * (noise_inv*trace(Y' * Y) - trace(W)) / d;
        elbo += log(noise_inv)*n*0.5 + log_determinant(K_uu + jit)*0.5;
        elbo -= log(2 * pi()) * (n/2.0);
        elbo -= log_determinant(K_uu + jit + phi*noise_inv) / 2.0;
        elbo -= noise_inv * eta / 2.0;
        elbo += noise_inv * trace(mdivide_left_spd(K_uu + jit, phi)) / 2.0;
        return elbo * d;
    }
}
data {
    int<lower=1> n; int<lower=1> d; int<lower=1> q; int<lower=1> m;
    matrix[n, d] Y;
    array[m] vector[q] Z; // inducing inputs
}
parameters {
    array[n] vector[q] X; // latent inputs
    real<lower=1e-6> prior_var;
    real<lower=1e-6> noise_var;
    real<lower=1e-6> funcn_var;
}
model {
    matrix[m, m] K_uu = gp_exp_quad_cov(Z, sqrt(funcn_var), 1.0); // magnitude argument is an sd, hence sqrt
    matrix[n, m] K_fu = gp_exp_quad_cov(X, Z, sqrt(funcn_var), 1.0);
    real k_diag = K_uu[1, 1]; // diagonal of the exp-quad kernel, i.e. funcn_var

    for(i in 1:q) { X[, i] ~ normal(0, sqrt(prior_var)); }
    target += lower_bound(K_fu, K_uu, Y, 1/noise_var, k_diag);
}
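
Note that the kernel and noise hyperparameters above are left with improper flat priors on their constrained scale. If that proves unstable under ADVI, something like the following weakly-informative priors could be added to the model block (my suggestion, not part of the original example):

    // hypothetical additions to the model block
    noise_var ~ inv_gamma(2, 0.5);
    funcn_var ~ inv_gamma(2, 0.5);
    prior_var ~ inv_gamma(2, 0.5);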

The data, the classic three-phase oil-flow dataset (1000 training points in 12 dimensions, with one-hot phase labels), can be obtained as:

wget http://staffwww.dcs.shef.ac.uk/people/N.Lawrence/resources/3PhData.tar.gz
tar xvf 3PhData.tar.gz
rm DataTst* DataVdn* 3PhData.tar.gz

The R script to run it:

library(ggplot2)
library(cmdstanr)
library(data.table)

set.seed(42)

Y = as.matrix(fread('DataTrn.txt'))
labels = as.character(as.matrix(fread('DataTrnLbls.txt')) %*% 1:3)

n = nrow(Y); q=2; d=ncol(Y); m=36
X = matrix(rnorm(n*q), n, q) # random starting latents (not passed to Stan below: X is a model parameter, not data)
Z = matrix(rnorm(m*q), m, q) # inducing inputs

model = cmdstan_model('model.stan', cpp_options=list(stan_opencl=TRUE))

results = model$variational(
    data=list(n=n, q=q, m=m, d=d, Y=Y, Z=Z),
    seed=42,
    iter=2000,
    opencl_ids=c(0, 0),
    algorithm='meanfield',
    eta=1,
    adapt_engaged=FALSE
)

X_recon = results$summary('X')[c('variable', 'mean')]
X_recon = matrix(X_recon$mean, n, q, byrow=F)

ggplot(data=data.frame(x=X_recon[, 1], y=X_recon[, 2], lb=labels)) +
    geom_point(aes(x=x, y=y, color=lb), alpha=0.2)

The latent variables should be as follows.

[Screenshot (2021-11-21): the two recovered latent dimensions plotted against each other, coloured by the three class labels]

— InfProbSciX, Nov 21 '21

This is an interesting model. Can you make an argument for this being in the docs vs a case study?

— spinkney, Jan 05 '22