Gaussian Process Library

This section describes a library for Gaussian process time series models. A technical overview of key concepts can be found in the following references.

Roberts S, Osborne M, Ebden M, Reece S, Gibson N, Aigrain S. 2013. Gaussian processes for time-series modelling. Phil Trans R Soc A 371: 20110550. http://dx.doi.org/10.1098/rsta.2011.0550

Rasmussen C, Williams C. 2006. Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA. http://gaussianprocess.org/gpml/chapters/

Covariance Kernels

AutoGP.GP.BinaryOpNode - Type
abstract type BinaryOpNode <: Node end

Abstract class for composite covariance kernels.

Base.size - Function
Base.size(node::Node)
Base.size(node::LeafNode) = 1
Base.size(node::BinaryOpNode) = 1 + size(node.left) + size(node.right)

Return the total number of subexpressions in a Node, as defined above.

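The recursion can be sketched with stand-in types (these are illustrative, not the library's actual Node hierarchy):

```julia
# Stand-in kernel expression tree; size counts all subexpressions.
abstract type Node end
struct Leaf <: Node end              # e.g. a primitive kernel
struct BinaryOp <: Node              # e.g. Plus or Times
    left::Node
    right::Node
end

Base.size(::Leaf) = 1
Base.size(n::BinaryOp) = 1 + size(n.left) + size(n.right)

k = BinaryOp(Leaf(), BinaryOp(Leaf(), Leaf()))
size(k)  # 5 subexpressions: 2 operators + 3 leaves
```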
AutoGP.GP.eval_cov - Function
eval_cov(node::Node, t1::Real, t2::Real)
eval_cov(node::Node, ts::Vector{Float64})

Evaluate the covariance function node at the given time indexes. The first form returns a Real number and the second form returns a covariance Matrix.

AutoGP.GP.compute_cov_matrix_vectorized - Function
compute_cov_matrix_vectorized(node::Node, noise, ts)

Compute the covariance matrix obtained by evaluating node on all pairs of ts. The noise is added to the diagonal of the covariance matrix, so that if ts[i] == ts[j], then X[ts[i]] and X[ts[j]] are i.i.d. samples of the true function value at ts[i] plus mean-zero Gaussian noise.

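The noise handling can be sketched as follows; the kernel and cov_matrix here are stand-ins for illustration, not AutoGP's implementation:

```julia
using LinearAlgebra: I

# Squared exponential kernel with unit parameters (stand-in example).
k(t1, t2) = exp(-0.5 * (t1 - t2)^2)

# Covariance matrix over time points ts, with observation noise on the diagonal.
cov_matrix(k, noise, ts) = [k(a, b) for a in ts, b in ts] + noise * I

ts = [0.0, 0.5, 0.5, 1.0]      # time index 0.5 is measured twice
C = cov_matrix(k, 0.1, ts)
# C[2, 3] == 1.0 while C[2, 2] == 1.1: the two measurements at t = 0.5 share
# the same latent function value, plus independent mean-zero noise.
```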

Primitive Kernels

Notation. In this section, generic parameters (e.g., $\theta$, $\theta_1$, $\theta_2$) are used to denote the fieldnames of the corresponding Julia structs, in the same order as they appear in the constructors.

AutoGP.GP.WhiteNoise - Type
WhiteNoise(value)

White noise covariance kernel.

\[k(t, t') = \mathbf{I}[t = t'] \theta\]

The random variables $X[t]$ and $X[t']$ are perfectly correlated whenever $t = t'$ and independent otherwise. This kernel cannot be used to represent the joint distribution of multiple i.i.d. measurements of $X[t]$; see compute_cov_matrix_vectorized instead.

AutoGP.GP.Constant - Type
Constant(value)

Constant covariance kernel.

\[k(t,t') = \theta\]

Draws from this kernel are horizontal lines, where $\theta$ determines the variance of the constant value around the mean (typically zero).

AutoGP.GP.Linear - Type
Linear(intercept[, bias=1, amplitude=1])

Linear covariance kernel.

\[k(t, t') = \theta_2 + \theta_3 (t - \theta_1)(t'-\theta_1)\]

Draws from this kernel are sloped lines in the 2D plane. The time intercept is $\theta_1$. The variance around the time intercept is $\theta_2$. The scale factor, which dictates the slope, is $\theta_3$.

AutoGP.GP.SquaredExponential - Type
SquaredExponential(lengthscale[, amplitude=1])

Squared Exponential covariance kernel.

\[k(t,t') = \theta_2 \exp\left(-(1/2)(|t-t'|/\theta_1)^2\right)\]

Draws from this kernel are smooth functions.

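A stand-in sketch of this kernel (with $\theta_1$ the lengthscale and $\theta_2$ the amplitude, as in the constructor), showing how the lengthscale governs how quickly correlation decays with lag:

```julia
# Squared exponential: amplitude * exp(-(1/2) * (|t1 - t2| / lengthscale)^2).
k_se(t1, t2; lengthscale=1.0, amplitude=1.0) =
    amplitude * exp(-0.5 * (abs(t1 - t2) / lengthscale)^2)

k_se(0.0, 0.0)                    # amplitude at zero lag
k_se(0.0, 1.0; lengthscale=0.1)   # short lengthscale: correlation decays quickly
k_se(0.0, 1.0; lengthscale=10.0)  # long lengthscale: correlation decays slowly
```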
AutoGP.GP.GammaExponential - Type
GammaExponential(lengthscale, gamma[, amplitude=1])

Gamma Exponential covariance kernel.

\[k(t,t') = \theta_3 \exp(-(|t-t'|/\theta_1)^{\theta_2})\]

Requires 0 < gamma <= 2. Setting gamma = 2 recovers the SquaredExponential kernel, up to a reparameterization of the lengthscale.

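A stand-in sketch of the formula (with $\theta_1$ the lengthscale, $\theta_2$ gamma, and $\theta_3$ the amplitude, following the constructor order):

```julia
# Gamma exponential: amplitude * exp(-(|t1 - t2| / lengthscale)^gamma).
function k_gammaexp(t1, t2; lengthscale=1.0, gamma=1.0, amplitude=1.0)
    0 < gamma <= 2 || throw(DomainError(gamma, "gamma must satisfy 0 < gamma <= 2"))
    amplitude * exp(-(abs(t1 - t2) / lengthscale)^gamma)
end

k_gammaexp(0.0, 2.0; gamma=1.0)  # exponential decay: rough sample paths
k_gammaexp(0.0, 2.0; gamma=2.0)  # squared exponential shape: smooth sample paths
```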
AutoGP.GP.Periodic - Type
Periodic(lengthscale, period[, amplitude=1])

Periodic covariance kernel.

\[k(t,t') = \theta_3 \exp\left( (-2/\theta_1^2) \sin^2((\pi/\theta_2) |t-t'|) \right)\]

The lengthscale $\theta_1$ determines how smooth the periodic function is within each period $p = \theta_2$. Heuristically, the periodic kernel can be understood as:

  1. Sampling $[X(t), t \in [0,p]] \sim \mathrm{GP}(0, \mathrm{SE}(\theta_1))$.
  2. Repeating this fragment for all intervals $[jp, (j+1)p], j \in \mathbb{Z}$.
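A stand-in sketch of the formula (amplitude omitted for brevity), showing that shifting either argument by a whole period leaves the covariance unchanged:

```julia
# Periodic kernel: exp((-2 / lengthscale^2) * sin(pi * |t1 - t2| / period)^2).
k_per(t1, t2; lengthscale=1.0, period=1.0) =
    exp((-2 / lengthscale^2) * sin((pi / period) * abs(t1 - t2))^2)

# Covariance depends only on the lag modulo the period:
k_per(0.3, 0.7; period=2.0) ≈ k_per(0.3, 0.7 + 2.0; period=2.0)  # true
```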

Composite Kernels

AutoGP.GP.Times - Type
Times(left::Node, right::Node)
Base.:*(left::Node, right::Node)

Covariance kernel obtained by multiplying two covariance kernels pointwise.

\[k(t,t') = k_{\rm left}(t,t') \times k_{\rm right}(t,t')\]

AutoGP.GP.Plus - Type
Plus(left::Node, right::Node)
Base.:+(left::Node, right::Node)

Covariance kernel obtained by summing two covariance kernels pointwise.

\[k(t,t') = k_{\rm left}(t,t') + k_{\rm right}(t,t')\]

AutoGP.GP.ChangePoint - Type
ChangePoint(left::Node, right::Node, location::Real, scale::Real)

Covariance kernel obtained by switching between two kernels at location.

\[\begin{aligned} k(t,t') &= [\sigma_1 \cdot k_{\rm left}(t, t') \cdot \sigma_2] + [(1 - \sigma_1) \cdot k_{\rm right}(t, t') \cdot (1-\sigma_2)] \\ \mathrm{where}\, \sigma_1 &= (1 + \tanh((t - \theta_1) / \theta_2))/2, \\ \sigma_2 &= (1 + \tanh((t' - \theta_1) / \theta_2))/2. \end{aligned}\]

The location parameter $\theta_1$ denotes the time point at which the change occurs. The scale parameter $\theta_2$ is a nonnegative number that controls the rate of change; its behavior can be understood by analyzing the two extreme values:

  • If scale=0, then $k_{\rm left}$ is active and $k_{\rm right}$ is inactive for all times less than location; $k_{\rm right}$ is active and $k_{\rm left}$ is inactive for all times greater than location; and $X[t] \perp X[t']$ for all $t$ and $t'$ on opposite sides of location.

  • If scale=Inf, then $k_{\rm left}$ and $k_{\rm right}$ have equal effect at all time points, and $k(t,t') = \frac{1}{4}(k_{\rm left}(t,t') + k_{\rm right}(t,t'))$, which is equivalent to a Plus kernel scaled by a factor of $1/4$.

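The two extremes can be checked numerically with a stand-in implementation of the mixing weights (using constant unit kernels so the weights themselves are visible, and finite stand-ins for scale=0 and scale=Inf):

```julia
# Sigmoid mixing weight: 0 well before the changepoint, 1 well after it.
sigmoid(t; location, scale) = (1 + tanh((t - location) / scale)) / 2

function k_cp(kl, kr, t1, t2; location, scale)
    s1 = sigmoid(t1; location, scale)
    s2 = sigmoid(t2; location, scale)
    s1 * kl(t1, t2) * s2 + (1 - s1) * kr(t1, t2) * (1 - s2)
end

kl(t1, t2) = 1.0   # constant kernels expose the mixing weights
kr(t1, t2) = 1.0

k_cp(kl, kr, -1.0, -1.0; location=0.0, scale=1e-9)  # ≈ 1: one kernel fully active
k_cp(kl, kr, -1.0,  1.0; location=0.0, scale=1e-9)  # ≈ 0: opposite sides independent
k_cp(kl, kr, -1.0,  1.0; location=0.0, scale=1e9)   # ≈ 0.5: (kl + kr)/4 with kl = kr = 1
```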

Prediction Utilities

Distributions.MvNormal - Type
dist = Distributions.MvNormal(
        node::Node,
        noise::Float64,
        ts::Vector{Float64},
        xs::Vector{Float64},
        ts_pred::Vector{Float64};
        noise_pred::Union{Nothing,Float64}=nothing)

Return the MvNormal posterior predictive distribution over new values at time indexes ts_pred, given noisy observations xs at time indexes ts and the covariance function node with the given level of observation noise.

By default, the observation noise (noise_pred) of the new data is equal to the noise of the observed data; use noise_pred = 0. to obtain the predictive distribution over noiseless future values.

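The underlying computation is standard Gaussian process conditioning; a self-contained sketch (not AutoGP's implementation) using only the LinearAlgebra standard library:

```julia
using LinearAlgebra

k(t1, t2) = exp(-0.5 * (t1 - t2)^2)   # any covariance function

# Closed-form GP posterior predictive: mean and covariance at ts_pred,
# given noisy observations (ts, xs). noise_pred defaults to the training noise.
function posterior_predictive(k, noise, ts, xs, ts_pred; noise_pred=noise)
    K   = [k(a, b) for a in ts, b in ts] + noise * I
    Ks  = [k(a, b) for a in ts_pred, b in ts]
    Kss = [k(a, b) for a in ts_pred, b in ts_pred] + noise_pred * I
    mu    = Ks * (K \ xs)           # predictive mean
    Sigma = Kss - Ks * (K \ Ks')    # predictive covariance
    return mu, Sigma
end

mu, Sigma = posterior_predictive(k, 0.1, [0.0, 1.0], [1.0, 2.0], [0.5])
# The prediction at t = 0.5 interpolates between the observations at t = 0 and t = 1.
```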
Statistics.quantile - Function
Distributions.quantile(dist::Distributions.MvNormal, p)

Compute quantiles of marginal distributions of dist.

Examples

Distributions.quantile(Distributions.MvNormal([0,1,2,3], LinearAlgebra.I(4)), .5)
Distributions.quantile(Distributions.MvNormal([0,1,2,3], LinearAlgebra.I(4)), [[.1, .5, .9]])

Prior Configuration

AutoGP.GP.GPConfig - Type
config = GPConfig(kwargs...)

Configuration of prior distribution over Gaussian process kernels, i.e., an instance of Node. The main kwargs (all optional) are:

  • node_dist_leaf::Vector{Real}: Prior distribution over LeafNode kernels; default is uniform.

  • node_dist_nocp::Vector{Real}: Prior distribution over BinaryOpNode kernels; only used if changepoints=false.

  • node_dist_cp::Vector{Real}: Prior distribution over BinaryOpNode kernels; only used if changepoints=true.

  • max_depth::Integer: Maximum depth of covariance node; default is -1 for unbounded.

  • changepoints::Bool: Whether to permit ChangePoint compositions; default is true.

  • noise::Union{Nothing,Float64}: Whether to use a fixed observation noise; default is nothing to infer automatically.

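For example, a prior that disallows ChangePoint compositions and bounds the depth of kernel expressions could be configured as follows (a sketch using only the kwargs documented above):

```julia
config = GPConfig(changepoints=false, max_depth=3)
```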