Overview

librvg is an open-source C library for generating random variables. It is released under the Apache-2.0 License. The methods implemented in librvg are described in the following publication.

  • Feras A. Saad and Wonyeol Lee. Random variate generation with formal guarantees. Proceedings of the ACM on Programming Languages, June 2025. doi:10.1145/3729251.

For the latest version, please visit https://github.com/probsys/librvg.

Attention

This library is research-grade software and is intended for research and educational purposes.

  • It may contain errors, inefficiencies, or incomplete features.

  • It has not been tested for production use or under real-world workloads.

Users should take appropriate precautions when using this library. Please report questions and errors at https://github.com/probsys/librvg/issues/

Purpose

The purpose of librvg is to efficiently and automatically generate exact random variates from a given probability distribution, which is formally specified using a numerical implementation of one or both of its cumulative distribution function (CDF) or survival function (SF).

By working with a formal specification of the desired probability distribution of the random variates, the generators are equipped with a precise guarantee on their output distributions. This feature can be contrasted with existing C libraries for random variate generation:

GNU Scientific Library (GSL). This library provides CDF/SF implementations of various probability distributions as well as separate implementation of generators for these distributions. In general, there is no precise relationship between the CDF, the SF, and the generator―which can lead to unpredictable results. In addition, the true output distributions of generators are often difficult or intractable to quantify exactly.

UNU.RAN. This library generates random variables given a CDF and includes universal (automated) algorithms that apply to any user-specified distribution. However, it does not guarantee exact samples.

The exact and automated algorithms in librvg generally operate more slowly than more specialized, approximate algorithms from the above libraries. This library is most suitable for situations where rigorous guarantees are needed on the exact output distribution and runtime characteristics of the generator. It is also suitable when the entropy (or amount of randomness) consumed by the algorithm is an important resource to be minimized; see Optimality.

Guarantees

The algorithms in librvg cohere with the specified CDF (or SF) interpreted as a floating-point function. In particular, given a numerical implementation \(\texttt{cdf}: \mathbb{F} \to \mathbb{F} \cap [0,1]\) of a CDF that maps a floating-point number to a float between 0 and 1, a random variate \(X\) is generated that satisfies

\[\begin{aligned} \Pr(X \le f) = \texttt{cdf}(f) && (f \in \mathbb{F}). \end{aligned}\]

If a numerical implementation \(\texttt{sf}: \mathbb{F} \to \mathbb{F} \cap [0,1]\) of an SF is given instead, then the guarantee is

\[\begin{aligned} \Pr(X > f) = \texttt{sf}(f) && (f \in \mathbb{F}). \end{aligned}\]

An advanced feature of librvg is the notion of a dual distribution function (DDF), which combines a CDF and SF to obtain a more accurate representation of the target distribution than can be achieved using either a CDF or SF in isolation. The DDF addresses the limitation of floating-point probabilities, which are very dense near 0 but sparse near 1. Informally, the property that a DDF guarantees is

\[\begin{split}\begin{aligned} \Pr(X \le f) = \begin{cases} \texttt{cdf}(f) & \mbox{if } f \le f^* \\ 1 -_{\mathbb{R}} \texttt{sf}(f) & \mbox{if } f^* < f \end{cases} && (f \in \mathbb{F}), \end{aligned}\end{split}\]

where \(f^*\) is a cutoff-point corresponding to the exact floating-point median of \(cdf\); and the subtraction in the second case is (i.e., not subject to floating-point error). Additional details of this representation are available in [SL25, §6].

Optimality

The library includes generators that are entropy optimal in the sense of Knuth and Yao [KY76], meaning that they consume the least amount of random bits (on average) to produce a random variate, among the class of all exact generators for the given CDF.

Programming with librvg

The following program gives a brief synopsis of how to get started with the main tasks supported by librvg. The API Reference contains more detailed information the functions.

 1#include <gsl/gsl_rng.h>
 2#include "rvg/generate.h"
 3
 4int main(int argc, char * argv[]) {
 5
 6  // Prepare the random number generator.
 7  gsl_rng * rng = gsl_rng_alloc(gsl_rng_default);
 8  struct flip_state prng = make_flip_state(rng);
 9
10  // Implement A custom continuous distribution, F(x) = x^2 over [0,1].
11  float cdf_x2(double x) {
12      if      (x != x)        { return 1; }       // nan
13      else if (signbit(x))    { return 0; }       // <= -0.0
14      else if (1 <= x)        { return 1; }       // >= 1
15      else                    { return x*x; }     // x^2 over [0,1]
16  }
17
18  // Draw a random variate from cdf_x2.
19  double sample = generate_opt(cdf_x2, &prng);
20  printf("The sample is: %f\n", sample);
21
22  // Compute some quantiles of cdf_x2.
23  double median = quantile(cdf_x2, 0.5);
24  double quantile_25 = quantile(cdf_x2, 0.25);
25  double quantile_75 = quantile(cdf_x2, 0.75);
26  printf("The quantiles are: %f %f %f\n", quantile_25, median, quantile_75);
27
28  // Free the random number generator.
29  gsl_rng_free(rng);
30}

The primary function provided by librvg is generate_opt(), which takes as input a CDF implementation and a pseudorandom number generator (prng), and returns as output an exact random variate drawn from that CDF.