API Reference
The functions documented in this reference are primarily in
generate.h. It will be helpful to review the main
Example for an overview of the interface.
Defining A Target Distribution
To generate random variates, librvg requires a specification of the target as a cumulative distribution function (CDF) and/or survival function (SF). These functions must adhere to the following type.
-
typedef float (*cdf32_t)(double x);
A function type that takes as input a double x and returns a float.
- Use for a Cumulative Distribution Function
When used as a CDF, the function returns a float representing \(\Pr(X \le x)\) for the generated random variate \(X\). To be a valid specification, F must satisfy the following properties:
For all floats x, F(x) is a float in the unit interval [0, 1].
F(x) = 1, whenever x is NaN (i.e., if x != x).
F is nondecreasing, i.e., if x’ < x then F(x’) <= F(x).
- Use for a Survival Function
When used as a SF, the function returns a float representing \(\Pr(X > x)\) for the generated random variate \(X\). To be a valid specification, S must satisfy the following properties:
For all floats x, S(x) is a float in the unit interval [0, 1].
S(x) = 0, whenever x is NaN (i.e., if x != x).
S is nonincreasing, i.e., if x’ < x then S(x) <= S(x’).
Note
According to the above specification, probability distributions are permitted to include atoms at special floating-point values
-INFINITY,INFINITY, andNAN. Using a CDF F, this behavior can be achieved as follows.Atom at
-INFINITY: letF(-INFINITY) > 0.Atom at
-INFINITY: letF(nextafterf(INFINITY, 0)) < F(INFINITY).Atom at
NAN: letF(-INFINITY) < 1.
It is also possible to specify separate atoms at the negative and positive zeros (
0+,0-) of floating-point.
An advanced feature of librvg is the notion of a “dual distribution function” (DDF), which combines a numerical CDF and SF to enable extended accuracy sampling. This representation is described in the Guarantees and in [SL25, §6]. The type of a DDF is given in as
-
typedef void (*ddf32_t)(double x, bool *b, float *p);
This function takes as input a double
x, and writes the result of evaluating a DDF atxto the pointersbandp. The interpretation ofbandpare as follows:If
bis set to 0, then \(\Pr(X \le x) = p\)If
bis set to 1, then \(\Pr(X \le x) = 1 - p\)
Functions that operate using a DDF have an _ext suffix (meaning “extended”).
Wrapping a Distribution
librvg can interporate with existing CDF or SF implementations from the
GNU Scientific Library.
The following examples show how to wrap a Normal(0,1) and Poisson(5)
distributions to create new functions that are compatible with
with cdf32_t and ddf32_t.
// Gaussian distribution from the GNU library (continuous).
MAKE_CDF_P(gaussian_cdf, gsl_cdf_gaussian_P, 1);
MAKE_CDF_Q(gaussian_sf, gsl_cdf_gaussian_Q, 1);
MAKE_DDF(gaussian_ddf, gaussian_cdf, gaussian_sf);
// EXAMPLE 2: Poisson distribution from the GNU library (discrete).
MAKE_CDF_UINT_P(poisson_cdf, gsl_cdf_poisson_P, 5);
MAKE_CDF_UINT_Q(poisson_sf, gsl_cdf_poisson_Q, 5);
MAKE_DDF(poisson_ddf, poisson_cdf, poisson_sf);
The following macros are available, where
name should be a fresh C identifier (not a string)
for the name of the new function created from func. The suffixes
_P and _Q correspond to a CDF and SF, respectively, a naming
convention inherited from the GSL. The ... varargs are passed
directly to func. The MAKE_DDF() macro is not specific to a
CDF or SF created from the GSL, it can be used for any such pairing.
-
MAKE_CDF_P(name, func, ...)
Make a cumulative distribution over doubles from the GSL.
-
MAKE_CDF_Q(name, func, ...)
Make a survival distribution over doubles from the GSL.
-
MAKE_CDF_UINT_P(name, func, ...)
Make a cumulative distribution over unsigned integers from the GSL.
-
MAKE_CDF_UINT_Q(name, func, ...)
Make a survival distribution over unsigned integers from the GSL.
-
MAKE_DDF(name, func_cdf, func_sf)
Make a dual distribution function.
-
float cdf_discrete(double x, const float *P, size_t K);
A cumulative distribution function (
cdf32_t) for arbitrary discrete distributions over nonnegative integers. It is parameterized by a length-Kinput arrayPof cumulative probabilities, whereP[i]is the cumulative probability of integeri. Available indiscrete.h.
Generating Random Variates
Entropy-Optimal Generation
The generators are the entropy-optimal algorithms from [SL25].
They take as input the target cdf as described in the
previous section as well as a
prng, which is described in Pseudorandom Number Generators.
-
double generate_opt(cdf32_t cdf, struct flip_state *prng)
Generate random variables optimally from
cdf.
-
double generate_opt_ext(ddf32_t ddf, struct flip_state *prng)
Generate random variables optimally from
ddf.
Conditional-Bit Generation
librvg also includes implementations of the Conditional Bit Sampling (CBS)
method. These algorithms are described in Sobolewski and Payne [SP72, Section II].
The performance of these functions is generally inferior to that of
generate_opt() and generate_opt_ext() in terms
of entropy consumption and runtime, as they use arbitrary precision
arithmetic via the GNU MP Bignum Library (libgmp).
-
double generate_cbs(cdf32_t cdf, struct flip_state *prng)
Generate random variables from
cdfusing Conditional Bit Sampling.
-
double generate_cbs_ext(ddf32_t ddf, struct flip_state *prng)
Generate random variables from
ddfusing Conditional Bit Sampling.
Querying a CDF
-
double quantile(cdf32_t cdf, float q)
-
double quantile_sf(cdf32_t sf, float q)
-
double quantile_ext(ddf32_t ddf, bool d, float q)
These functions are used to compute the exact quantiles of a CDF, SF, or DDF. Recall that the quantile \(Q\) of a cumulative distribution function \(F\) is its is a generalized inverse:
\[\begin{aligned} Q(u) = \inf\{ x \in \mathbb{R} \mid u \le F(x) \} && (u \in [0,1]). \end{aligned}\]For a numerical function \(\texttt{cdf}: \mathbb{F} \to \mathbb{F} \cap [0,1]\), the exact quantile is
\[\begin{aligned} \texttt{quantile}(u) = \min\{ f \in \mathbb{F} \mid f \le \texttt{cdf}(x) \} && (u \in \mathbb{F} \cap [0,1]). \end{aligned}\]
-
void bounds_quantile(cdf32_t cdf, double *xlo, double *xhi);
-
void bounds_quantile_sf(cdf32_t sf, double *xlo, double *xhi);
-
void bounds_quantile_ext(ddf32_t ddf, double *xlo, double *xhi);
These functions compute the values of the smallest and largest atoms of a distribution. The results are stored in the output parameters
xloandxhi.
Pseudorandom Number Generators
Available in prng.h and flip.h
Flip States
The generators described above
take as input a pseudorandom number generator called prng, which
must be obtained using make_flip_state(). The usual pattern is to
make a flip state using a gsl_rng pointer from the
GNU Scientific Library,
as follows.
// Prepare the random number generator.
gsl_rng * rng = gsl_rng_alloc(gsl_rng_default);
struct flip_state prng = make_flip_state(rng);
-
struct flip_state
A struct that maintains the state of a sequence of flips.
-
struct flip_state make_flip_state(gsl_rng *rng)
Initialize a
flip_stateusing anrng.
Random bits are obtained from a flip_state using the following
functions.
-
unsigned char flip(struct flip_state *s)
Generate a single bit from a
flip_state.
-
unsigned long long flip_k(struct flip_state *s, int k)
Generate a random
k-bit number from aflip_state.
-
unsigned long long randint(struct flip_state *s, int k)
Generate a random
k-bit number from aflip_state.
Additional PRNGs
There are a large number of pseudorandom number generators in the GSL, which can be used out of the box. librvg also provides two additional PRNG types.
-
extern const gsl_rng_type *gsl_rng_urandom
This generator draws system-level entropy using the getrandom syscall, which is typically
/dev/urandom. It maintains no state itself and cannot be seeded by the user, so calls such asgsl_rng_sethave no effect. Used primarily for workflows that require cryptographically secure random bits, see [FederalOfISecurity23] for a detailed description of the system entropy source.
-
extern const gsl_rng_type *gsl_rng_deterministic
This generator deterministically returns its seed. Its state consists of a single
unsigned long intvalue (typically 32 bits) which is always returned. It is useful for debugging and characterizing the properties of generators.