AutoGP API
The purpose of this package is to automatically infer Gaussian process (GP) models of time series data, hence the name AutoGP.
At a high level, a Gaussian process $\{X(t) \mid t \in T \}$ is a family of random variables indexed by some set $T$. For time series data, the index set $T = \mathbb{R}$ is typically (a subset) of the real numbers. For any list of $n$ time points $[t_1, \dots, t_n]$, the prior distribution of the random vector $[X(t_1), \dots, X(t_n)]$ is a multivariate Gaussian,
\[\begin{aligned} \begin{bmatrix} X(t_1)\\ \vdots \\ X(t_n)\\ \end{bmatrix} \sim \mathrm{MultivariteNormal} \left( \begin{bmatrix} m(t_1)\\ \vdots \\ m(t_n)\\ \end{bmatrix}, \begin{bmatrix} k_\theta(t_1, t_1) & \dots & k_\theta(t_1, t_n) \\ \vdots & \ddots & \vdots \\ k_\theta(t_n, t_1) & \dots & k_\theta(t_n, t_n) \\ \end{bmatrix} \right). \end{aligned}\]
In the equation above $m : T \to \mathbb{R}$ is the mean function and $k_\theta: T \times T \to \mathbb{R}_{\ge 0}$ is the covariance (kernel) function parameterized by $\theta$. We typically assume (without loss of generality) that $m(t) = 0$ for all $t$ and focus our modeling efforts on the covariance kernel $k_\theta$. The structure of the kernel dictates the qualitative properties of $X$, such as the existence of linear trends, seasonal components, changepoints, etc.; refer to the kernel cookbook for an overview of these concepts.
Gaussian processes can be used as priors in Bayesian nonparametric models of time series data $[Y(t_1), \dots, Y(t_n)]$ as follows. We assume that each $Y(t_i) = g(X(t_i), \epsilon(t_i))$ for some known function $g$ and noise process $\epsilon(t_i)$. A common choice is to let $Y$ be a copy of $X$ corrupted with i.i.d. Gaussian innovations, which gives $Y(t_i) = X(t_i) + \epsilon(t_i)$ where $\epsilon(t_i) \sim \mathrm{Normal}(0, \eta)$ for some variance $\eta > 0$.
Writing out the full model
\[\begin{aligned} k &\sim \mathrm{PCFG} &&\textrm{(prior over covariance kernel)} \\ [\theta_1, \dots, \theta_{d(k)}] &\sim p_{k} && \textrm{(parameter prior)}\\ \eta &\sim p_{\eta} &&\textrm{(noise prior)} \\ [X(t_1), \dots, X(t_n)] &\sim \mathrm{MultivariteNormal}(\mathbf{0}, [k_\theta(t_i,t_j)]_{i,j=1}^n) &&\textrm{(Gaussian process)} \\ Y(t_i) &\sim \mathrm{Normal}(X(t_i), \eta), i=1,\dots,n &&\textrm{(noisy observations)} \end{aligned}\]
where PCFG denotes a probabilistic-context free grammar that defines a language of covariance kernel expressions $k$,
\[\begin{aligned} B &:= \texttt{Linear} \mid \texttt{Periodic} \mid \texttt{GammaExponential} \mid \dots \\ \oplus &:= \texttt{+} \mid \texttt{*} \mid \texttt{ChangePoint} \\ k &:= B \mid \texttt{(}k_1 \oplus k_2\texttt{)}. \end{aligned}\]
and $d(k)$ is the number of parameters in expression $k$. Given data $\{(t_i,y_i)\}_{i=1}^n$, AutoGP infers likely values of the symbolic structure of the covariance kernel $k$, the kernel parameters $\theta$, and the observation noise variance $\eta$.
The ability to automatically synthesize covariance kernels in AutoGP is in contrast to existing Gaussian process libraries such as GaussianProcess.jl sklearn.gaussian_process, GPy, and GPyTorch, which all require users to manually specify $k$.
After model fitting, users can query the models to generate forecasts, compute probabilities, and inspect qualitative structure.
Model Initialization
AutoGP.GPModel — Type
struct GPModelA GPModel contains covariance kernel structures and parameters for modeling data.
Fields
pf_state::Gen.ParticleFilterState: Internal particle set.config::GP.GPConfig=GP.GPConfig(): User-specific customization, refer toGP.GPConfig.ds::IndexType: Observed time points.y::Vector{<:Real}: Observed time series values.ds_transform::Transforms.LinearTransform: Transformation of time to direct space.y_transform::Transforms.LinearTransform: Transformation of observations to direct space.
Constructors
model = GPModel(
ds::IndexType,
y::Vector{<:Real};
n_particles::Integer=8,
config::GP.GPConfig=GP.GPConfig())See also
To perform learning given the data, refer to
sourceAutoGP.num_particles — Function
AutoGP.seed! — Function
AutoGP.IndexType — Type
IndexType = Union{Vector{<:Real}, Vector{<:Date}, Vector{<:DateTime}}Permitted Julia types for Gaussian process time points. Real numbers are ingested directly, treated as time points. Instances of the Dates types are converted to numeric time points by using Dates.datetime2unix.
End-to-End Model Fitting
AutoGP.fit_smc! — Function
fit_smc!(
model::GPModel;
schedule::Vector{<:Integer},
n_mcmc::Int,
n_hmc::Int,
shuffle::Bool=true,
biased::Bool=false,
adaptive_resampling::Bool=true,
adaptive_rejuvenation::Bool=false,
hmc_config::Dict=Dict(),
verbose::Bool=false,
check::Bool=false,
callback_fn::Function=(; kwargs...) -> nothing)Infer the structure and parameters of an appropriate Gaussian process covariance kernel for modeling the observed data. Inference is performed using sequential Monte Carlo.
Arguments
model::GPModel: Instance of theGPModelto use.schedule::Vector{<:Integer}: Schedule for incorporating data for SMC, refer toSchedule.n_mcmc::Union{Integer,Vector{<:Integer}}: Number of involutive MCMC rejuvenation steps. If vector, must have same length asschedule.n_hmc::Union{Integer,Vector{<:Integer}}: Number of HMC steps per accepted involutive MCMC step. If vector, must have same length asschedule.biased::Bool: Whether to bias the proposal to produce "short" structures.shuffle::Bool=true: Whether to shuffle indexesdsor incorporate data in the given order.adaptive_resampling::Bool=true: Iftrueresamples based on ESS threshold, else at each step.adaptive_rejuvenation::Bool=false: Iftruerejuvenates only if resampled, else at each step.hmc_config::Dict: Configuration for HMC inference on numeric parameters. Allowable keys are:n_exit::Integer=1: Number of successive rejections after which HMC loop is terminated.L_param::Integer=10Number of leapfrog steps for kernel parameters.L_noise::Integer=10: Number of leapfrog steps for noise parameter.eps_param::Float64=0.02: Step size for kernel parameters.eps_noise::Float64=0.02: Step size for noise parameter.
verbose::Bool=false: Report progress to stdout.check::Bool=false: Perform dynamic correctness checks during inference.config::GP.GPConfig=GP.GPConfig(): User-specific customization, refer toGP.GPConfig.callback_fn: A callback for monitoring inference, must be generated byAutoGP.Callbacks.make_smc_callback.
AutoGP.fit_mcmc! — Function
fit_mcmc!(
model::GPModel;
n_mcmc::Integer,
n_hmc::Integer,
biased::Bool=false,
verbose::Bool=false,
check::Bool=false,
callback_fn::Function=(; kwargs...) -> nothing)Perform n_mcmc steps of involutive MCMC on the structure, with n_hmc steps of Hamiltonian Monte Carlo sampling on the parameters per accepted involutive MCMC move.
A callback_fn can be provided to monitor the progress each MCMC step for which at least one particle (i.e, chain) accepted a transition. Its signature must contain a single varargs specifier, which will be populated with keys :model, :step, :elapsed.
The callback_fn imposes a roughly 2x runtime overhead as compared to the equivalent mcmc_structure! method, because parallel execution must be synchronized across the particles to invoke the callback at each step. The :elapsed variable provided to the callback function will still reflect an accurate estimate of the inference runtime without this overhead. If no callback is required, use mcmc_structure! instead.
AutoGP.fit_greedy! — Function
function fit_greedy!(
model::GPModel;
max_depth::Integer=model.config.max_depth,
verbose::Bool=false,
check::Bool=false,
callback_fn::Function=(; kwargs...) -> nothing)Infer the structure and parameters of an appropriate Gaussian process covariance kernel for modeling the observed data. Inference is performed using greedy search, as described in Algorithm 2 of Kim and Teh, 2018. It is an error if max_depth is not a finite positive number.
A callback_fn can be provided to monitor the search progress at each stage. Its signature must contain a single varargs specifier, which will be populated with keys :model, :step, :elapsed at each step of the greedy search.
Incremental Model Fitting
AutoGP.add_data! — Function
add_data!(model::GPModel, ds::IndexType, y::Vector{<:Real})Incorporate new observations (ds, y) into model.
AutoGP.remove_data! — Function
remove_data!(model::GPModel, ds::IndexType, y::Vector{<:Real})Remove existing observations ds from model.
AutoGP.maybe_resample! — Function
maybe_resample!(model::GPModel, ess_threshold::Real)Resample the particle collection in model if ESS is below ess_threshold. Setting ess_threshold = AutoGP.num_particles(model) + 1 will ensure that resampling always takes place, since the ESS is upper bounded by the number of particles.
AutoGP.mcmc_structure! — Function
mcmc_structure!(model::GPModel, n_mcmc::Integer, n_hmc::Integer;
biased::Bool=false, verbose::Bool=false, check::Bool=false)Perform n_mcmc steps of involutive MCMC on the structure, with n_hmc steps of Hamiltonian Monte Carlo sampling on the parameters per accepted involutive MCMC move.
AutoGP.mcmc_parameters! — Function
mcmc_parameters!(model::GPModel, n_hmc::Integer; verbose::Bool=false, check::Bool=false)Perform n_hmc steps of Hamiltonian Monte Carlo sampling on the parameters.
Model Querying
AutoGP.predict — Function
predictions = predict(
model::GPModel,
ds::IndexType;
quantiles::Vector{Float64}=Float64[],
noise_pred::Union{Nothing,Float64}=nothing)Return predictions for new index point ds, and optionally quantiles corresponding to the provided quantiles (numbers between 0 and 1, inclusive). By default, the noise_pred of the new data is equal to the inferred noise of the observed data within each particle in model; using noise_pred=0. returns the posterior distribution over the noiseless function values.
The returned DataFrames.DataFrame has columns ["ds", "particle", "weight", "y_mean"], as well as any additional columns for the requested quantiles.
Example
julia> ds = [Dates.Date(2020,1,1), Dates.Date(2020,1,2)] # Dates to query
julia> GPModel.predict(model, ds; quantiles=[.025, 0.975])
16×6 DataFrame
Row │ ds particle weight y_0.025 y_0.975 y_mean
│ Date Int64 Float64 Float64 Float64 Float64
─────┼────────────────────────────────────────────────────────────────────
1 │ 2020-01-01 1 4.97761e-22 -13510.0 14070.6 280.299
2 │ 2020-01-02 1 4.97761e-22 -13511.0 14071.6 280.299
3 │ 2020-01-01 2 0.279887 4504.73 8211.43 6358.08
4 │ 2020-01-02 2 0.279887 4448.06 8154.3 6301.18
5 │ 2020-01-01 3 0.0748059 -43638.6 65083.0 10722.2
6 │ 2020-01-02 3 0.0748059 -43662.0 65074.7 10706.4
7 │ 2020-01-01 4 0.60809 -17582.2 30762.4 6590.06
8 │ 2020-01-02 4 0.60809 -17588.0 30771.5 6591.78sourceAutoGP.predict_proba — Function
function predict_proba(model::GPModel, ds::IndexType, y::Vector{<:Real})Compute predictive probability of data y at time points ds under model.
Example
julia> ds = [Dates.Date(2020,1,1), Dates.Date(2020,1,2)] # Dates to query
julia> y = [0.1, .0.5] # values to query
julia> GPModel.predict(model, ds, y)
7×3 DataFrame
Row │ particle weight logp
│ Int64 Float64 Float64
─────┼─────────────────────────────────
1 │ 1 0.0287155 -64.2388
2 │ 2 0.0437349 -59.7672
3 │ 3 0.576247 -62.6499
4 │ 4 0.00164846 -59.5311
5 │ 5 0.215255 -61.066
6 │ 6 0.134198 -64.4041
7 │ 7 0.000201078 -68.462sourceAutoGP.predict_mvn — Function
dist = predict_mvn(model::GPModel, ds::IndexType; noise_pred::Union{Nothing,Float64}=nothing)Return an instance of Distributions.MixtureModel representing the overall posterior predictive distribution for data at index points ds. By default, the noise_pred of the new data is equal to the inferred noise of the observed data within each particle in model; using noise_pred=0. returns the posterior distribution over the noiseless function values.
The returned dist has precisely num_particles(model) components, each of type Distributions.MvNormal, with weights particle_weights(model). These objects can be retrieved using Distributions.components and Distributions.probs, respectively.
AutoGP.predict_quantile — Function
(x::Vector, success::Bool) = predict_quantile(
model::GPModel, ds::IndexType, q::Real;
noise_pred::Union{Nothing,Float64}=nothing, tol=1e-5, max_iter=1e6)Evaluates the inverse cumulative distribution function (CDF) of the multivariate Gaussian mixture model returned by predict_mvn at q (between 0 and 1, exclusive) separately for each dimension. The returned vector x has the same length as the index points ds.
Note
The inverse CDF is numerically estimated using a binary search algorithm. The keyword arguments tol and max_iter correspond to the desired absolute tolerance of the estimate and the maximum number of binary search iterations, respectively. The returned Boolean variable success indicates whether the returned value x has been located to within the specified error tolerance.
See also
sourceAutoGP.log_marginal_likelihood_estimate — Function
log_marginal_likelihood_estimate(model::GPModel)Return estimate of marginal likelihood of data (in log space).
sourceAutoGP.particle_weights — Function
AutoGP.effective_sample_size — Function
effective_sample_size(model::GPModel)Return effective sample size (ESS) of weighted particle collection.
sourceAutoGP.covariance_kernels — Function
covariance_kernels(model::GPModel; reparameterize::Bool=true)Return Gaussian process covariance kernels in model. If reparameterize is true (default), then the kernel parameters are given in the original data space (more interpretable); otherwise they are given in the transformed space over which parameter inference is performed (useful for debugging).
AutoGP.observation_noise_variances — Function
observation_noise_variances(model::GPModel; reparameterize::Bool=true)Return list of observation noise variances for each particle in model. If reparameterize is true (default), then the kernel parameters are given in the original data space (more interpretable); otherwise they are given in the transformed space over which parameter inference is performed (useful for debugging).
AutoGP.decompose — Function
function decompose(model::GPModel)Decompose each particle within model into its constituent kernels. Supposing that num_particles(model) equals $k$, the return value models::Vector{GPModel} of decompose is a length-$k$ vector of GPModel instances.
Therefore, models[i] is a GPModel that represents the decomposition of particle i in model into its constituent kernels. Each particle in models[i] corresponds to a kernel fragment in the covariance for particle i of model (i.e., one particle for each GP.Node in the covariance kernel).
The weights of models[i] are arbitrary and have no meaningful value.
This function is particularly useful for visualizing the individual time series structures that make up each particle of model.
See also
sourceAutoGP.extract_kernel — Function
extract_kernel(model::GPModel, ::Type{T}; retain::Bool=true) where T <: GP.LeafNodeRetain all primitive kernels of type T <:AutoGP.GP.LeafNode from each particle, while erasing the others. In particular, the erasure rules are as follows, where G != T:
AutoGP.GP.Plus:k1::T + k2::Gbecomesk1 + Constant(0).AutoGP.GP.Times:k1::T * k2::Gbecomesk1 * Constant(1).AutoGP.GP.ChangePoint:CP(k1::T, k2::G)becomesCP(k1, Constant(0)).
If all the primitive kernels have type T, then the returned kernel is GP.Constant(0).
If retain=false, then the above behavior is flipped: each primitive kernel of type T is erased while all other primitive kernels are retained.
See also
sourceAutoGP.split_kernel_sop — Function
split_kernel_sop(model::GPModel, ::Type{T}) where {T <: GP.LeafNode}Decompose the kernels in model through a sum-of-products interpretation. Each kernel k in model is split as k1 + k2, where k1 contains all the factors with subkernel of type T and the rest of the terms are in k2. The sentinel addend of Constant(0) is used whenever k1 or k2 is empty.
The return value is a tuple (model_a::GPModel, model_b::GPModel) where model_a contains all the k1 kernels and model_b contains all the k2 kernels.
See also
sourceAutoGP.predict_sum — Function
predictions = predict_sum(
model::GPModel,
ds::IndexType,
::Type{T};
quantiles::Vector{Float64}=Float64[],
noise_pred::Union{Nothing,Float64}=nothing) where {T <: GP.LeafNode}Same as predict, except with an additional required parameter T <:AutoGP.GP.LeafNode that specifies which base kernel to use to split the predictions according to a sum-of-products decomposition.
See AutoGP.GP.split_kernel_sop for details on the SOP decomposition of a composite kernel.
The "component" column in the returned data frame takes three values for each time point and particle in the ensemble: 0 for the overall prediction, 1 for the subkernel containing base kernel type T, and 2 for the remaining kernel fragment.
Example
Row │ ds y_mean component particle weight y_0.025 y_0.975
│ Float64 Float64 Integer Integer Float64 Float64 Float64
─────┼───────────────────────────────────────────────────────────────────────
1 │ 3.0 1.50602 0 1 0.270087 -3.74135 6.75339
2 │ 4.0 1.51511 0 1 0.270087 -12.6729 15.7031
3 │ 3.0 1.50602 1 1 0.270087 -3.19708 6.20912
4 │ 4.0 1.51511 1 1 0.270087 -12.4808 15.511
5 │ 3.0 1.5 2 1 0.270087 1.4998 1.5002
6 │ 4.0 1.5 2 1 0.270087 1.4998 1.5002
7 │ 3.0 1.52347 0 2 0.729913 0.436285 2.61066
8 │ 4.0 1.5027 0 2 0.729913 0.414135 2.59126
9 │ 3.0 1.5 1 2 0.729913 1.4998 1.5002
10 │ 4.0 1.5 1 2 0.729913 1.4998 1.5002
11 │ 3.0 1.52347 2 2 0.729913 0.671908 2.37504
12 │ 4.0 1.5027 2 2 0.729913 0.649378 2.35601See also
sourceAutoGP.predict_mvn_sum — Function
dist = predict_mvn_sum(
model::GPModel,
ds::IndexType,
::Type{T};
noise_pred::Union{Nothing,Float64}=nothing) where {T <: GP.LeafNode}Same as predict_mvn, except with an additional required parameter T <:AutoGP.GP.LeafNode that specifies which base kernel to use to split the predictions according to a sum-of-products decomposition.
See AutoGP.GP.split_kernel_sop and AutoGP.GP.infer_gp_sum for details on the SOP decomposition of a composite kernel.
The function returns a pair (mvn, indexes) where mvn is an instances of Distributions.MvNormal.
If ds has length $n$, then each component of the returned mvn has dimension $3n$ for the mean and dimension $3n \times 3n$ for the covariance kernel representing the joint distribution of the observable data and the two latent GPs.
The indexes return value specifies how to extract dimensions for the sum and the two latent components: it is is a named tuple with fields
indexes.Y: indexes of the overall prediction.indexes.F[1]: indexes of prediction for kernels in the decomposition that include a base kernel of typeT.indexes.F[2]: indexes of prediction for kernels in the decomposition that do not include a base kernel of typeT.
See also
sourceSerialization
Base.Dict — Method
Base.Dict(model::GPModel)Convert a GPModel into a dictionary that can be saved and loaded from disk, as shown in the following example.
Example
using AutoGP, Dates, Serialization
model = AutoGP.GPModel([Date("2025-01-01"), Date("2025-01-02")], [1.0, 2.0])
serialize("model.autogp", Dict(model))
loaded_model = AutoGP.GPModel(deserialize("model.autogp"))Users should not directly serialize GPModel instances, e.g.,
serialize("model.autogp", model)
model = deserialize("model.autogp")The first line will throw ArgumentError.