API Reference

The functions documented in this reference are primarily in generate.h. It will be helpful to review the main Example for an overview of the interface.

Defining A Target Distribution

To generate random variates, librvg requires a specification of the target as a cumulative distribution function (CDF) and/or survival function (SF). These functions must adhere to the following type.

typedef float (*cdf32_t)(double x);

A function type that takes as input a double x and returns a float.

Use for a Cumulative Distribution Function

When used as a CDF, the function returns a float representing \(\Pr(X \le x)\) for the generated random variate \(X\). To be a valid specification, F must satisfy the following properties:

For all floats x, F(x) is a float in the unit interval [0, 1].
F(x) = 1, whenever x is NaN (i.e., if x != x).
F is nondecreasing, i.e., if x’ < x then F(x’) <= F(x).

Use for a Survival Function

When used as a SF, the function returns a float representing \(\Pr(X > x)\) for the generated random variate \(X\). To be a valid specification, S must satisfy the following properties:

For all floats x, S(x) is a float in the unit interval [0, 1].
S(x) = 0, whenever x is NaN (i.e., if x != x).
S is nonincreasing, i.e., if x’ < x then S(x) <= S(x’).

Note

According to the above specification, probability distributions are permitted to include atoms at special floating-point values -INFINITY, INFINITY, and NAN. Using a CDF F, this behavior can be achieved as follows.

Atom at -INFINITY: let F(-INFINITY) > 0.

Atom at -INFINITY: let F(nextafterf(INFINITY, 0)) < F(INFINITY).

Atom at NAN: let F(-INFINITY) < 1.

It is also possible to specify separate atoms at the negative and positive zeros (0+, 0-) of floating-point.

An advanced feature of librvg is the notion of a “dual distribution function” (DDF), which combines a numerical CDF and SF to enable extended accuracy sampling. This representation is described in the Guarantees and in [SL25, §6]. The type of a DDF is given in as

typedef void (*ddf32_t)(double x, bool *b, float *p);

This function takes as input a double x, and writes the result of evaluating a DDF at x to the pointers b and p. The interpretation of b and p are as follows:

If b is set to 0, then \(\Pr(X \le x) = p\)
If b is set to 1, then \(\Pr(X \le x) = 1 - p\)

Functions that operate using a DDF have an _ext suffix (meaning “extended”).

Wrapping a Distribution

librvg can interporate with existing CDF or SF implementations from the GNU Scientific Library. The following examples show how to wrap a Normal(0,1) and Poisson(5) distributions to create new functions that are compatible with with cdf32_t and ddf32_t.

// Gaussian distribution from the GNU library (continuous).
MAKE_CDF_P(gaussian_cdf, gsl_cdf_gaussian_P, 1);
MAKE_CDF_Q(gaussian_sf, gsl_cdf_gaussian_Q, 1);
MAKE_DDF(gaussian_ddf, gaussian_cdf, gaussian_sf);

// EXAMPLE 2: Poisson distribution from the GNU library (discrete).
MAKE_CDF_UINT_P(poisson_cdf, gsl_cdf_poisson_P, 5);
MAKE_CDF_UINT_Q(poisson_sf, gsl_cdf_poisson_Q, 5);
MAKE_DDF(poisson_ddf, poisson_cdf, poisson_sf);

The following macros are available, where name should be a fresh C identifier (not a string) for the name of the new function created from func. The suffixes _P and _Q correspond to a CDF and SF, respectively, a naming convention inherited from the GSL. The ... varargs are passed directly to func. The MAKE_DDF() macro is not specific to a CDF or SF created from the GSL, it can be used for any such pairing.

MAKE_CDF_P(name, func, ...): Make a cumulative distribution over doubles from the GSL.

MAKE_CDF_Q(name, func, ...): Make a survival distribution over doubles from the GSL.

MAKE_CDF_UINT_P(name, func, ...): Make a cumulative distribution over unsigned integers from the GSL.

MAKE_CDF_UINT_Q(name, func, ...): Make a survival distribution over unsigned integers from the GSL.

MAKE_DDF(name, func_cdf, func_sf): Make a dual distribution function.

float cdf_discrete(double x, const float *P, size_t K);: A cumulative distribution function (cdf32_t) for arbitrary discrete distributions over nonnegative integers. It is parameterized by a length-K input array P of cumulative probabilities, where P[i] is the cumulative probability of integer i. Available in discrete.h.

Generating Random Variates

Entropy-Optimal Generation

The generators are the entropy-optimal algorithms from [SL25]. They take as input the target cdf as described in the previous section as well as a prng, which is described in Pseudorandom Number Generators.

double generate_opt(cdf32_t cdf, struct flip_state *prng): Generate random variables optimally from cdf.

double generate_opt_ext(ddf32_t ddf, struct flip_state *prng): Generate random variables optimally from ddf.

Conditional-Bit Generation

librvg also includes implementations of the Conditional Bit Sampling (CBS) method. These algorithms are described in Sobolewski and Payne [SP72, Section II]. The performance of these functions is generally inferior to that of generate_opt() and generate_opt_ext() in terms of entropy consumption and runtime, as they use arbitrary precision arithmetic via the GNU MP Bignum Library (libgmp).

double generate_cbs(cdf32_t cdf, struct flip_state *prng): Generate random variables from cdf using Conditional Bit Sampling.

double generate_cbs_ext(ddf32_t ddf, struct flip_state *prng): Generate random variables from ddf using Conditional Bit Sampling.

Querying a CDF

double quantile(cdf32_t cdf, float q)

double quantile_sf(cdf32_t sf, float q)

double quantile_ext(ddf32_t ddf, bool d, float q)

These functions are used to compute the exact quantiles of a CDF, SF, or DDF. Recall that the quantile \(Q\) of a cumulative distribution function \(F\) is its is a generalized inverse:

\[\begin{aligned} Q(u) = \inf\{ x \in \mathbb{R} \mid u \le F(x) \} && (u \in [0,1]). \end{aligned}\]

For a numerical function \(\texttt{cdf}: \mathbb{F} \to \mathbb{F} \cap [0,1]\), the exact quantile is

\[\begin{aligned} \texttt{quantile}(u) = \min\{ f \in \mathbb{F} \mid f \le \texttt{cdf}(x) \} && (u \in \mathbb{F} \cap [0,1]). \end{aligned}\]

void bounds_quantile(cdf32_t cdf, double *xlo, double *xhi);
void bounds_quantile_sf(cdf32_t sf, double *xlo, double *xhi);
void bounds_quantile_ext(ddf32_t ddf, double *xlo, double *xhi);: These functions compute the values of the smallest and largest atoms of a distribution. The results are stored in the output parameters xlo and xhi.

Pseudorandom Number Generators

Available in prng.h and flip.h

Flip States

The generators described above take as input a pseudorandom number generator called prng, which must be obtained using make_flip_state(). The usual pattern is to make a flip state using a gsl_rng pointer from the GNU Scientific Library, as follows.

// Prepare the random number generator.
gsl_rng * rng = gsl_rng_alloc(gsl_rng_default);
struct flip_state prng = make_flip_state(rng);

struct flip_state: A struct that maintains the state of a sequence of flips.

struct flip_state make_flip_state(gsl_rng *rng): Initialize a flip_state using an rng.

Random bits are obtained from a flip_state using the following functions.

unsigned char flip(struct flip_state *s): Generate a single bit from a flip_state.

unsigned long long flip_k(struct flip_state *s, int k): Generate a random k-bit number from a flip_state.

unsigned long long randint(struct flip_state *s, int k): Generate a random k-bit number from a flip_state.

Additional PRNGs

There are a large number of pseudorandom number generators in the GSL, which can be used out of the box. librvg also provides two additional PRNG types.

extern const gsl_rng_type *gsl_rng_urandom: This generator draws system-level entropy using the getrandom syscall, which is typically /dev/urandom. It maintains no state itself and cannot be seeded by the user, so calls such as gsl_rng_set have no effect. Used primarily for workflows that require cryptographically secure random bits, see [FederalOfISecurity23] for a detailed description of the system entropy source.

extern const gsl_rng_type *gsl_rng_deterministic: This generator deterministically returns its seed. Its state consists of a single unsigned long int value (typically 32 bits) which is always returned. It is useful for debugging and characterizing the properties of generators.