API Documentation¶

Gaussian Proposal¶

class gmmmc.proposals.gaussian_proposals.GaussianStepCovarProposal(step_sizes=(0.001, ))¶

Bases: gmmmc.proposals.proposals.Proposal

Methods

`get_acceptance`()	Calculate and return the acceptance rate of the proposal.
`get_illegal`()	Calculate and return the illegal proposal rate of this proposal.
`propose`(X, gmm, target[, n_jobs])	Propose a new set of GMM covariances (diagonal only).

propose(X, gmm, target, n_jobs=1)¶

Propose a new set of GMM covariances (diagonal only).

Parameters:

X : 2-D array like of shape (n_samples, n_features)

The observed data or evidence.

gmm : GMM object

The current state (set of gmm parameters) in the Markov Chain

target : GMMPosteriorTarget object

The target distribution to be found.

n_jobs : int

Number of cpu cores to use in the calculation of log probabilities.

Returns:

: GMM

A new GMM object initialised with new covariance parameters.

class gmmmc.proposals.gaussian_proposals.GaussianStepMeansProposal(step_sizes=(0.001, ))¶

Bases: gmmmc.proposals.proposals.Proposal

Gaussian Proposal distribution for means of a GMM

Methods

`get_acceptance`()	Calculate and return the acceptance rate of the proposal.
`get_illegal`()	Calculate and return the illegal proposal rate of this proposal.
`propose`(X, gmm, target[, n_jobs])	Propose a new set of GMM means.

propose(X, gmm, target, n_jobs=1)¶

Propose a new set of GMM means.

Parameters:

X : 2-D array like of shape (n_samples, n_features)

The observed data or evidence.

gmm : GMM object

The current state (set of gmm parameters) in the Markov Chain

target : GMMPosteriorTarget object

The target distribution to be found.

n_jobs : int

Number of cpu cores to use in the calculation of log probabilities.

Returns:

: GMM

A new GMM object initialised with new mean parameters.

class gmmmc.proposals.gaussian_proposals.GaussianStepWeightsProposal(n_mixtures, step_sizes=(0.001, ), threshold=0.001)¶

Bases: gmmmc.proposals.proposals.Proposal

Methods

`get_acceptance`()	Calculate and return the acceptance rate of the proposal.
`get_illegal`()	Calculate and return the illegal proposal rate of this proposal.
`invTransformSimplex`(simplex_coords)	Transforms a point on the simplex to the original vector space.
`propose`(X, gmm, target[, n_jobs])	Propose a new set of weight vectors.
`transformSimplex`(weights)	Project weight vector onto the normal simplex.

invTransformSimplex(simplex_coords)¶

Transforms a point on the simplex to the original vector space.

Parameters:

simplex_coords : array_like of shape (n_mixtures - 1,)

Coordinates of a weight vector on the simplex.

Returns:

: array_like of shape(n_mixtures,)

vector of weights.

propose(X, gmm, target, n_jobs=1)¶

Propose a new set of weight vectors. Parameters ———- X : 2-D array like of shape (n_samples, n_features)

The observed data or evidence.

gmm : GMM object: The current state (set of gmm parameters) in the Markov Chain
target : GMMPosteriorTarget object: The target distribution to be found.
n_jobs : int: Number of cpu cores to use in the calculation of log probabilities.

: GMM A new GMM object initialised with new covariance parameters.

transformSimplex(weights)¶

Project weight vector onto the normal simplex.

Parameters:

weights : array_like of shape (n_mixtures,)

vector of weights for each gaussian component

Returns:

: array_like of shape (n_mixtures-1,)

vector of weights projected onto the simplex plane

class gmmmc.proposals.gaussian_proposals.GaussianTuningStepMeansProposal(step_sizes=(0.001, ), limit=200)¶

Bases: gmmmc.proposals.proposals.Proposal

Gaussian Proposal distribution for means of a GMM

Methods

`get_acceptance`()	Calculate and return the acceptance rate of the proposal.
`get_illegal`()	Calculate and return the illegal proposal rate of this proposal.
`propose`(X, gmm, target[, n_jobs])	Propose a new set of GMM means.

propose(X, gmm, target, n_jobs=1)¶

Propose a new set of GMM means.

Parameters:

X : 2-D array like of shape (n_samples, n_features)

The observed data or evidence.

gmm : GMM object

The current state (set of gmm parameters) in the Markov Chain

target : GMMPosteriorTarget object

The target distribution to be found.

n_jobs : int

Number of cpu cores to use in the calculation of log probabilities.

Returns:

: GMM

A new GMM object initialised with new mean parameters.

Generic Proposals¶

class gmmmc.proposals.proposals.GMMBlockMetropolisProposal(propose_mean=None, propose_covars=None, propose_weights=None, propose_iterations=1)¶

Bases: gmmmc.proposals.proposals.Proposal

Methods

`get_acceptance`()	Calculate and return the acceptance rate of the proposal.
`get_illegal`()	Calculate and return the illegal proposal rate of this proposal.
`propose`(X, gmm, target[, n_jobs])	Propose a new set of gmm parameters.

propose(X, gmm, target, n_jobs=1)¶

Propose a new set of gmm parameters. Calls each proposal function one after another.

Parameters:

X : 2-D array_like of shape (n_samples, n_features)

Feature vectors

gmm : GMM object

Current GMM parameters in the markov chain

target : GMMPosteriorTarget object

Target distribution

n_jobs : int

Number of cpus to use. -1 to use all available cores.

Returns:

: GMM Object

The next state in the Markov Chain.

class gmmmc.proposals.proposals.Proposal¶

Bases: object

Methods

get_acceptance() Calculate and return the acceptance rate of the proposal.

get_illegal() Calculate and return the illegal proposal rate of this proposal.

propose(X, gmm, target[, n_jobs])

Parameters:

get_acceptance()¶

Calculate and return the acceptance rate of the proposal.

Returns:

: double

The acceptance rate of the proposal function

get_illegal()¶

Calculate and return the illegal proposal rate of this proposal. (Proposing values outside the support of the parameter space e.g covariances < 0)

Returns:

: double

The illegal proposal rate of the proposal function

propose(X, gmm, target, n_jobs=1)¶

Parameters:

X : 2-D array_like

Observed data or evidence.

gmm : GMM object

target

n_jobs

Priors¶

class gmmmc.priors.prior.CovarsStaticPrior(prior_covars)¶

Bases: gmmmc.priors.prior.GMMParameterPrior

Methods

`log_prob`(covariances)	Log probability of a covariance given we know what its true value ‘should’ be.
`log_prob_single`(covariance, mixture_num)	Log probability of a covariance matrix of a single mixture given we know what its true value should be.
`sample`()

log_prob(covariances)¶

Log probability of a covariance given we know what its true value ‘should’ be.

Parameters:

covariances : covariance matrices of GMM distribution

Returns:

: double

0 if covariances are identical to their true values, -inf otherwise.

log_prob_single(covariance, mixture_num)¶

Log probability of a covariance matrix of a single mixture given we know what its true value should be.

Parameters:

covariance : 1-D array_like of shape (n_features)

covariance matrix for a specific mixture in GMM

mixture_num : int

Index of mixture in GMM

Returns:

: double

0 if covariance is identical to its true value, -inf otherwise

sample()¶

class gmmmc.priors.prior.DiagCovarsUniformPrior(low, high, n_mixtures, n_features)¶

Bases: gmmmc.priors.prior.GMMParameterPrior

Methods

`log_prob`(covars)	Compute the log prior probability of the covariances of a GMM.
`log_prob_single`(covar, mixture_num)	Compute the log probability of the covariances for a specific mixture.
`sample`()	Draw a sample for the diagonals of a covariance matrix assuming a uniform prior over each individual element.

log_prob(covars)¶

Compute the log prior probability of the covariances of a GMM. Since this will be used for monte carlo simulations, we care only that it is proportional to the true probability. For a uniform distribution we can use any value.

Parameters:

covars : 2-D array_like, of shape (n_mixtures, n_features)

covariance vectors for the GMM

Returns:

: double

Proportional to the log probability of the covariance of the GMM. 0.0 if the means are within the bounds of the uniform prior, -inf otherwise.eans

log_prob_single(covar, mixture_num)¶

Compute the log probability of the covariances for a specific mixture.

Parameters:

covar : 1-D array_like of length n_features

Single diagonal covariance from a single mixture of the GMM.

mixture_num : int

Index of the mixture for the covariance matrix.

Returns:

double

Proportional to the log prior probability for the covariance. 0.0 if the means are wimeansthin the bounds of the uniform prior, -inf otherwise.

sample()¶

Draw a sample for the diagonals of a covariance matrix assuming a uniform prior over each individual element.

Returns:

: 2-D array_like of shape (n_mixtures, n_features)

array of diagonal covariances for a GMM.

class gmmmc.priors.prior.DiagCovarsWishartPrior(df, scale_matrices)¶

Bases: gmmmc.priors.prior.GMMParameterPrior

Methods

`log_prob`(covars)	Compute the log prior probability of the covariances of a GMM.
`log_prob_single`(covar, mixture_num)	Compute the log probability of the means for a specific mixture.
`sample`()	Draw a sample from the inverse wishart prior distribution.

log_prob(covars)¶

Compute the log prior probability of the covariances of a GMM. Since this will be used for monte carlo simulations, we care only that it is proportional to the true probability.

Parameters:

covars : 2-D array_like, of shape (n_mixtures, n_features)

covariance vectors for the GMM

Returns:

: double

Proportional to the log probability of the covariance of the GMM. w.r.t an inverse wishart distribution

log_prob_single(covar, mixture_num)¶

Compute the log probability of the means for a specific mixture.

Parameters:

mean : 1-D array_like of length n_features

Single mean vector from a single mixture of the GMM.

mixture_num : int

Index of the mixture for the mean.

Returns:

: double

Proportional to the log prior probability for means

sample()¶

Draw a sample from the inverse wishart prior distribution.

Returns:

: 2-D array_like of shape (n_mixtures, n_features)

Return a complete set of mean vectors for a GMM

class gmmmc.priors.prior.GMMParameterPrior¶

Methods

`log_prob`(params)	Calculate the log prior probability.
`log_prob_single`(param, mixture_num)	Compute log probability of a single set of parameters (mean/covariance/weight) vector

log_prob(params)¶

Calculate the log prior probability.

Parameters:

params : array_like of varying shape

parameters for a GMM e.g means weights covariances

Returns:

: double

log prior probability of the parameters

log_prob_single(param, mixture_num)¶

Compute log probability of a single set of parameters (mean/covariance/weight) vector

Parameters:

param : 1-D array_like vector of parameters for a single mixture

mixture_num : Mixture index for the parameters.

Returns:

: double

log prior probability of parameters

class gmmmc.priors.prior.GMMPrior(means_prior, covars_prior, weights_prior)¶

Methods

log_prob(gmm)

Parameters:

sample() Compute a sample from the prior distribution of the GMM’s parameters.

log_prob(gmm)¶

Parameters:

gmm : GMM

object containing parameters

Returns:

: double

log prior probability of the parameters of the input GMM

sample()¶

Compute a sample from the prior distribution of the GMM’s parameters.

Returns:

: GMM

GMM object containing the sampled parameters

class gmmmc.priors.prior.MeansGaussianPrior(prior_means, covariances)¶

Bases: gmmmc.priors.prior.GMMParameterPrior

Methods

`log_prob`(means)	Compute the log prior probability of the means of a GMM according to Gaussian priors.
`log_prob_single`(mean, mixture_num)	Compute the log probability of the means for a specific mixture.
`sample`()	Draw a sample from the Gaussian prior distributions of the mean vectors.

log_prob(means)¶

Compute the log prior probability of the means of a GMM according to Gaussian priors.

Parameters:

means : 2-D array_like, of shape (n_mixtures, n_features)

mean vectors for the GMM

Returns:

: double

Proportional to the log probability of the means of the GMM

log_prob_single(mean, mixture_num)¶

Compute the log probability of the means for a specific mixture.

Parameters:

mean : 1-D array_like of length n_features

Single mean vector from a single mixture of the GMM.

mixture_num : int

Index of the mixture for the mean.

Returns:

: double

Proportional to the log prior probability for means

sample()¶

Draw a sample from the Gaussian prior distributions of the mean vectors.

Returns:

: 2-D array_like of shape (n_mixtures, n_features)

Return a complete set of mean vectors for a GMM

class gmmmc.priors.prior.MeansUniformPrior(low, high, n_mixtures, n_features)¶

Bases: gmmmc.priors.prior.GMMParameterPrior

Methods

`log_prob`(means)	Compute the log prior probability of the means of a GMM.
`log_prob_single`(mean, mixture)	Compute the log probability of the means for a specific mixture.
`sample`()

log_prob(means)¶

Compute the log prior probability of the means of a GMM. Since this will be used for monte carlo simulations, we care only that it is proportional to the true probability. For a uniform distribution we can use any value.

Parameters:

means : 2-D array_like, of shape (n_mixtures, n_features)

mean vectors for the GMM

Returns:

: double

Proportional to the log probability of the means of the GMM. 0.0 if the means are within the bounds of the uniform prior, -inf otherwise.

log_prob_single(mean, mixture)¶

Compute the log probability of the means for a specific mixture.

Parameters:

mean : 1-D array_like of length n_features

Single mean vector from a single mixture of the GMM.

mixture_num : int

Index of the mixture for the mean.

Returns:

: double

Proportional to the log prior probability for means 0.0 if the means are within the bounds of the uniform prior, -inf otherwise.

sample()¶

class gmmmc.priors.prior.WeightsDirichletPrior(alpha)¶

Bases: gmmmc.priors.prior.GMMParameterPrior

Methods

`log_prob`(weights)	Calculate log probability of weight vector according to dirichlet prior.
`log_prob_single`(weights, mixture_num)	Identical to log_prob
`sample`()	Sample from dirichlet distribution.

log_prob(weights)¶

Calculate log probability of weight vector according to dirichlet prior.

Parameters:

weights : 1-D array_like of shape (n_mixtures)

Returns:

: double

log probability under distribution

log_prob_single(weights, mixture_num)¶

Identical to log_prob

Parameters:

weights : 1-D array_like of shape (n_mixtures)

mixture_num : unused

Returns:

: double

log probability under distribution

sample()¶

Sample from dirichlet distribution.

Returns:

: 1-D array like of shape (n_mixtures)

Sample weights from dirichlet distribution.

class gmmmc.priors.prior.WeightsStaticPrior(prior_weights)¶

Bases: gmmmc.priors.prior.GMMParameterPrior

Methods

`log_prob`(weights)	Returns a log probability assuming weights have a true fixed value.
`log_prob_single`(weights, mixture_num)	Functionally the same as log_prob
`sample`()	Sample true weight vector

log_prob(weights)¶

Returns a log probability assuming weights have a true fixed value.

Parameters:

weights : 1-D array_like of shape (n_mixtures)

Weight vector for mixture. Must lie on the normal simplex.

Returns:

: double

0 if weights are close to true values, -inf otherwise

log_prob_single(weights, mixture_num)¶

Functionally the same as log_prob

Parameters:

weights : 1-D array_like of shape (n_mixtures)

Weight vector for mixture. Must lie on the normal simplex.

mixture_num : int

Not used.

Returns:

: double

0 if weights are close to true values, -inf otherwise

sample()¶

Sample true weight vector

Returns:

: 1-D array_like with shape (n_mixtures)

true weights.

class gmmmc.priors.prior.WeightsUniformPrior(n_mixtures)¶

Bases: gmmmc.priors.prior.GMMParameterPrior

Methods

`log_prob`(weights)	Returns a log probability according to uniform dirichlet prior.
`log_prob_single`(weights, mixture_num)	Functionally the same as log_prob
`sample`()	Draw sample from dirichlet distribution

log_prob(weights)¶

Returns a log probability according to uniform dirichlet prior.

Parameters:

weights : 1-D array_like of shape (n_mixtures)

Weight vector for mixture. Must lie on the normal simplex.

Returns:

: double

log probability under uniform prior.

log_prob_single(weights, mixture_num)¶

Functionally the same as log_prob

Parameters:

weights : 1-D array_like of shape (n_mixtures)

Weight vector for mixture. Must lie on the normal simplex.

mixture_num : not used in this context

Returns:

: double

log probability under uniform prior.

sample()¶

Draw sample from dirichlet distribution

Returns:

: 1-D array_like of shape (n_mixtures)

Set of weight parameters for a GMM.

GMM Representation¶

class gmmmc.gmm.GMM(means, covariances, weights)¶

Attributes

`covars`
`means`
`weights`

Methods

`log_likelihood`(X[, n_jobs])	Calculate the average log likelihood of the data given the GMM parameters
`sample`(n_samples)	Sample from the GMM.

covars¶

log_likelihood(X, n_jobs=1)¶

Calculate the average log likelihood of the data given the GMM parameters

Parameters:

X : 2-D array_like of shape (n_samples, n_features)

Data to be used.

n_jobs : int

Number of CPU cores to use in the calculation

Returns:

: float

average log likelihood of the data given the GMM parameters

Notes

For GMMs with small numbers of mixtures (<10) the use of more than 1 core can slow down the function.

means¶

sample(n_samples)¶

Sample from the GMM.

Parameters:

n_samples : int

Number of samples to draw.

Returns:

: 2-D array_like of shape (n_samples, n_features)

Samples drawn from the GMM distribution

weights¶

Monte Carlo Algorithms¶

class gmmmc.monte_carlo.AnnealedImportanceSampling(proposal, priors, betas)¶

Bases: gmmmc.monte_carlo.MonteCarloBase

Methods

`anneal`(X, n_jobs[, diagnostics])	A single annealing run from AIS.
`sample`(X, n_samples[, n_jobs, diagnostics])	Generate samples from the posterior distribution of the parameters.

anneal(X, n_jobs, diagnostics=None)¶

A single annealing run from AIS.

Parameters:

X : 2-D array_like of shape (n_feature_vectors, n_features)

Feature vectors used as the data for the underlying model.

n_jobs : int

Number of cpu cores to utilise during the Monte Carlo simulation.

diagnostics : empty dictionary, optional

If included, the dictionary passed to the function will contain diagnostic information from each annealing run of AIS. TODO: expand on this

Returns:

: tuple (GMM Object, double)

A single GMM sample from AIS and its corresponding weight.

sample(X, n_samples, n_jobs=1, diagnostics=None)¶

Generate samples from the posterior distribution of the parameters.

Parameters:

X : 2-D array_like of shape (n_feature_vectors, n_features)

Feature vectors used as the data for the underlying model.

n_samples : int

Number of Monte Carlo samples to be drawn from the posterior distribution.

n_jobs : int

Number of cpu cores to utilise during the Monte Carlo simulation.

diagnostics : empty dictionary, optional

If included, the dictionary passed to the function will contain diagnostic information from each annealing run of AIS. TODO: expand on this

Returns:

: list of tuples (GMM, double)

A list of GMM samples with their corresponding weight.

class gmmmc.monte_carlo.MarkovChain(proposal, prior, initial_gmm)¶

Bases: gmmmc.monte_carlo.MonteCarloBase

Methods

sample(X, n_samples[, n_jobs]) Sample from the posterior distribution of the GMM parameters

sample(X, n_samples, n_jobs=1)¶

Sample from the posterior distribution of the GMM parameters

Parameters:

X : 2-D array_like of shape (n_feature_vectors, n_features)

Feature vectors used as the data for the underlying model.

n_samples : int

Number of Monte Carlo samples to be drawn from the posterior distribution.

n_jobs : int

Number of cpu cores to utilise during the Monte Carlo simulation.

Returns:

samples : List of GMM Objects

Returns a list of GMMs (GMM parameters) which are the samples drawn from the distribution.

Notes

It is beneficial to discard a number of samples from the beginning of the chain (burn-in) as well as to use only every nth sample (lag) to reduce the correlation between succesive samples of the posterior.

class gmmmc.monte_carlo.MonteCarloBase¶

Bases: object

Methods

sample(X, n_samples)

sample(X, n_samples)¶

Target Distributions¶

class gmmmc.posterior.GMMPosteriorTarget(prior, beta=1)¶

Posterior distribution (targets distribution)

Methods

log_prob(X, gmm, n_jobs)

Parameters:

log_prob(X, gmm, n_jobs)¶

Parameters:

X : 2-D array_like of shape (n_samples, n_features)

Feature vectors

gmm : GMM object

GMM parameters for the calculation of the prior probability

n_jobs : int

Number of cores to use in the calculation.

Returns : double

log probability of the posterior up to a constant factor.

——-