API Reference
The functions documented in this reference are primarily in
generate.h
. It will be helpful to review the main
Example for an overview of the interface.
Defining A Target Distribution
To generate random variates, librvg requires a specification of the target as a cumulative distribution function (CDF) and/or survival function (SF). These functions must adhere to the following type.
-
typedef float (*cdf32_t)(double x);
A function type that takes as input a double x and returns a float.
- Use for a Cumulative Distribution Function
When used as a CDF, the function returns a float representing \(\Pr(X \le x)\) for the generated random variate \(X\). To be a valid specification, F must satisfy the following properties:
For all floats x, F(x) is a float in the unit interval [0, 1].
F(x) = 1, whenever x is NaN (i.e., if x != x).
F is nondecreasing, i.e., if x’ < x then F(x’) <= F(x).
- Use for a Survival Function
When used as a SF, the function returns a float representing \(\Pr(X > x)\) for the generated random variate \(X\). To be a valid specification, S must satisfy the following properties:
For all floats x, S(x) is a float in the unit interval [0, 1].
S(x) = 0, whenever x is NaN (i.e., if x != x).
S is nonincreasing, i.e., if x’ < x then S(x) <= S(x’).
Note
According to the above specification, probability distributions are permitted to include atoms at special floating-point values
-INFINITY
,INFINITY
, andNAN
. Using a CDF F, this behavior can be achieved as follows.Atom at
-INFINITY
: letF(-INFINITY) > 0
.Atom at
-INFINITY
: letF(nextafterf(INFINITY, 0)) < F(INFINITY)
.Atom at
NAN
: letF(-INFINITY) < 1
.
It is also possible to specify separate atoms at the negative and positive zeros (
0+
,0-
) of floating-point.
An advanced feature of librvg is the notion of a “dual distribution function” (DDF), which combines a numerical CDF and SF to enable extended accuracy sampling. This representation is described in the Guarantees and in [SL25, §6]. The type of a DDF is given in as
-
typedef void (*ddf32_t)(double x, bool *b, float *p);
This function takes as input a double
x
, and writes the result of evaluating a DDF atx
to the pointersb
andp
. The interpretation ofb
andp
are as follows:If
b
is set to 0, then \(\Pr(X \le x) = p\)If
b
is set to 1, then \(\Pr(X \le x) = 1 - p\)
Functions that operate using a DDF have an _ext
suffix (meaning “extended”).
Wrapping a Distribution
librvg can interporate with existing CDF or SF implementations from the
GNU Scientific Library.
The following examples show how to wrap a Normal(0,1) and Poisson(5)
distributions to create new functions that are compatible with
with cdf32_t
and ddf32_t
.
// Gaussian distribution from the GNU library (continuous).
MAKE_CDF_P(gaussian_cdf, gsl_cdf_gaussian_P, 1);
MAKE_CDF_Q(gaussian_sf, gsl_cdf_gaussian_Q, 1);
MAKE_DDF(gaussian_ddf, gaussian_cdf, gaussian_sf);
// EXAMPLE 2: Poisson distribution from the GNU library (discrete).
MAKE_CDF_UINT_P(poisson_cdf, gsl_cdf_poisson_P, 5);
MAKE_CDF_UINT_Q(poisson_sf, gsl_cdf_poisson_Q, 5);
MAKE_DDF(poisson_ddf, poisson_cdf, poisson_sf);
The following macros are available, where
name
should be a fresh C identifier (not a string)
for the name of the new function created from func
. The suffixes
_P
and _Q
correspond to a CDF and SF, respectively, a naming
convention inherited from the GSL. The ...
varargs are passed
directly to func
. The MAKE_DDF()
macro is not specific to a
CDF or SF created from the GSL, it can be used for any such pairing.
-
MAKE_CDF_P(name, func, ...)
Make a cumulative distribution over doubles from the GSL.
-
MAKE_CDF_Q(name, func, ...)
Make a survival distribution over doubles from the GSL.
-
MAKE_CDF_UINT_P(name, func, ...)
Make a cumulative distribution over unsigned integers from the GSL.
-
MAKE_CDF_UINT_Q(name, func, ...)
Make a survival distribution over unsigned integers from the GSL.
-
MAKE_DDF(name, func_cdf, func_sf)
Make a dual distribution function.
-
float cdf_discrete(double x, const float *P, size_t K);
A cumulative distribution function (
cdf32_t
) for arbitrary discrete distributions over nonnegative integers. It is parameterized by a length-K
input arrayP
of cumulative probabilities, whereP[i]
is the cumulative probability of integeri
. Available indiscrete.h
.
Generating Random Variates
Entropy-Optimal Generation
The generators are the entropy-optimal algorithms from [SL25].
They take as input the target cdf
as described in the
previous section as well as a
prng
, which is described in Pseudorandom Number Generators.
-
double generate_opt(cdf32_t cdf, struct flip_state *prng)
Generate random variables optimally from
cdf
.
-
double generate_opt_ext(ddf32_t ddf, struct flip_state *prng)
Generate random variables optimally from
ddf
.
Conditional-Bit Generation
librvg also includes implementations of the Conditional Bit Sampling (CBS)
method. These algorithms are described in Sobolewski and Payne [SP72, Section II].
The performance of these functions is generally inferior to that of
generate_opt()
and generate_opt_ext()
in terms
of entropy consumption and runtime, as they use arbitrary precision
arithmetic via the GNU MP Bignum Library (libgmp).
-
double generate_cbs(cdf32_t cdf, struct flip_state *prng)
Generate random variables from
cdf
using Conditional Bit Sampling.
-
double generate_cbs_ext(ddf32_t ddf, struct flip_state *prng)
Generate random variables from
ddf
using Conditional Bit Sampling.
Querying a CDF
-
double quantile(cdf32_t cdf, float q)
-
double quantile_sf(cdf32_t sf, float q)
-
double quantile_ext(ddf32_t ddf, bool d, float q)
These functions are used to compute the exact quantiles of a CDF, SF, or DDF. Recall that the quantile \(Q\) of a cumulative distribution function \(F\) is its is a generalized inverse:
\[\begin{aligned} Q(u) = \inf\{ x \in \mathbb{R} \mid u \le F(x) \} && (u \in [0,1]). \end{aligned}\]For a numerical function \(\texttt{cdf}: \mathbb{F} \to \mathbb{F} \cap [0,1]\), the exact quantile is
\[\begin{aligned} \texttt{quantile}(u) = \min\{ f \in \mathbb{F} \mid f \le \texttt{cdf}(x) \} && (u \in \mathbb{F} \cap [0,1]). \end{aligned}\]
-
void bounds_quantile(cdf32_t cdf, double *xlo, double *xhi);
-
void bounds_quantile_sf(cdf32_t sf, double *xlo, double *xhi);
-
void bounds_quantile_ext(ddf32_t ddf, double *xlo, double *xhi);
These functions compute the values of the smallest and largest atoms of a distribution. The results are stored in the output parameters
xlo
andxhi
.
Pseudorandom Number Generators
Available in prng.h
and flip.h
Flip States
The generators described above
take as input a pseudorandom number generator called prng
, which
must be obtained using make_flip_state()
. The usual pattern is to
make a flip state using a gsl_rng
pointer from the
GNU Scientific Library,
as follows.
// Prepare the random number generator.
gsl_rng * rng = gsl_rng_alloc(gsl_rng_default);
struct flip_state prng = make_flip_state(rng);
-
struct flip_state
A struct that maintains the state of a sequence of flips.
-
struct flip_state make_flip_state(gsl_rng *rng)
Initialize a
flip_state
using anrng
.
Random bits are obtained from a flip_state
using the following
functions.
-
unsigned char flip(struct flip_state *s)
Generate a single bit from a
flip_state
.
-
unsigned long long flip_k(struct flip_state *s, int k)
Generate a random
k
-bit number from aflip_state
.
-
unsigned long long randint(struct flip_state *s, int k)
Generate a random
k
-bit number from aflip_state
.
Additional PRNGs
There are a large number of pseudorandom number generators in the GSL, which can be used out of the box. librvg also provides two additional PRNG types.
-
extern const gsl_rng_type *gsl_rng_urandom
This generator draws system-level entropy using the getrandom syscall, which is typically
/dev/urandom
. It maintains no state itself and cannot be seeded by the user, so calls such asgsl_rng_set
have no effect. Used primarily for workflows that require cryptographically secure random bits, see [FederalOfISecurity23] for a detailed description of the system entropy source.
-
extern const gsl_rng_type *gsl_rng_deterministic
This generator deterministically returns its seed. Its state consists of a single
unsigned long int
value (typically 32 bits) which is always returned. It is useful for debugging and characterizing the properties of generators.