distributions
All of the distributions that are provided in the Apache Commons Math project are supported here, in multiple forms.
Continuous or Discrete
These distributions break down into two main categories:
Continuous Distributions
These are distributions over real numbers like 23.4323, where values vary continuously. Each continuous distribution provides samples that fall on an interval of the real number line. Continuous probability distributions include the Normal distribution and the Exponential distribution, among many others.
Discrete Distributions
Discrete distributions, also known as integer distributions, have only whole-number valued samples. These distributions include the Binomial distribution, the Zipf distribution, and the Poisson distribution, among others.
Hashed or Mapped
hashed samples
Generally, you will want to “randomly sample” from a probability distribution.
This is handled automatically by the functions below if you do not override the
defaults. The hash mode is the default sampling mode for probability
distributions. This is accomplished by computing an internal hash on the unit
interval variate input before using the resulting value to map into the sampling
curve. This is called the hash sampling mode by VirtData. You can put hash
into the modifiers, as explained below, if you want to document it explicitly.
mapped samples
The method used to sample from these distributions depends on a mathematical
function called the cumulative distribution function, or more specifically
the inverse of it. Having this function computed over some interval allows
one to sample the shape of a distribution progressively if desired. In
other words, it allows for a percentile-like view of values within
a given probability distribution. This mode of using the inverse cumulative
distribution function is known as the map mode in VirtData, as it allows one
to map a unit interval variate in a deterministic way to a density
sampling curve. To enable this mode, simply pass map as one of the
function modifiers for any function in this category.
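The difference between the two modes can be sketched in plain Java. This is only an illustration under stated assumptions, not VirtData's implementation: the class SamplingModes, its method names, and the bit mixer (a SplitMix64-style finalizer used as a stand-in for whatever hash VirtData applies) are all hypothetical. The Exponential distribution is used because its inverse cumulative distribution function has the simple closed form -mean * ln(1 - u).

```java
final class SamplingModes {

    // Scale a non-negative long onto the unit interval. The result is clamped
    // just below 1.0 so an unbounded distribution never yields infinity.
    static double toUnit(long input) {
        double u = (double) input / (double) Long.MAX_VALUE;
        return Math.min(u, Math.nextDown(1.0));
    }

    // Stand-in 64-bit mixer (SplitMix64-style finalizer). An assumption for
    // illustration only; it is not the hash that VirtData uses.
    static long mix(long x) {
        x += 0x9E3779B97F4A7C15L;
        x = (x ^ (x >>> 30)) * 0xBF58476D1CE4E5B9L;
        x = (x ^ (x >>> 27)) * 0x94D049BB133111EBL;
        x = x ^ (x >>> 31);
        return x & Long.MAX_VALUE; // keep the result non-negative
    }

    // Inverse CDF of the Exponential distribution with the given mean.
    static double exponentialIcdf(double mean, double u) {
        return -mean * Math.log1p(-u);
    }

    // map mode: the input is scaled directly, so increasing inputs walk
    // through the distribution from low percentiles to high ones.
    static double mapSample(double mean, long input) {
        return exponentialIcdf(mean, toUnit(input));
    }

    // hash mode: the input is scrambled first, so consecutive inputs land
    // at unrelated points on the curve while staying deterministic.
    static double hashSample(double mean, long input) {
        return exponentialIcdf(mean, toUnit(mix(input)));
    }

    public static void main(String[] args) {
        for (long i = 0; i < 5; i++) {
            long input = i * (Long.MAX_VALUE / 5);
            System.out.printf("input=%d map=%.4f hash=%.4f%n",
                    input, mapSample(1.0, input), hashSample(1.0, input));
        }
    }
}
```

In this sketch the map column rises steadily with the input, while the hash column jumps around but repeats exactly on every run.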
Interpolated or Computed Samples
When sampling from mathematical models of probability densities, performance between different densities can vary drastically. This means that you may end up perturbing the results of your test in an unexpected way simply by changing parameters of your testing distributions. Even worse, some densities have painful corner cases in performance, like ‘Zipf’, which can make tests unbearably slow and flawed as they chew up CPU resources.
Interpolated Samples
For this reason, interpolation is built into these sampling functions.
The default mode is interpolate. This means that the sampling
function is pre-computed over 1000 equidistant points in the unit interval,
and the result is shared among all threads as a lookup table for
interpolation. This makes all statistical sampling functions perform nearly
identically at runtime (after initialization, a one-time cost).
This does have the minor side effect of a little loss in accuracy, but
the difference is generally negligible for nearly all performance testing
cases.
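The lookup-table idea can be sketched as follows. This is a minimal illustration, assuming plain linear interpolation between neighboring table entries; the class InterpolatedSampler and its methods are hypothetical names, and the actual table layout and interpolation scheme used by VirtData may differ.

```java
import java.util.function.DoubleUnaryOperator;

final class InterpolatedSampler {

    private final double[] table;

    // Pre-compute the sampling function at `points` equidistant positions
    // on the unit interval, including both endpoints.
    InterpolatedSampler(DoubleUnaryOperator icdf, int points) {
        table = new double[points];
        for (int i = 0; i < points; i++) {
            double u = (double) i / (points - 1);
            table[i] = icdf.applyAsDouble(u);
        }
    }

    // Hypothetical factory: an Exponential inverse CDF, with the argument
    // clamped just below 1.0 so the last table entry stays finite.
    static InterpolatedSampler exponential(double mean, int points) {
        return new InterpolatedSampler(
                u -> -mean * Math.log1p(-Math.min(u, Math.nextDown(1.0))), points);
    }

    // Look up the two nearest table entries and interpolate linearly.
    double sample(double u) {
        double clamped = Math.max(0.0, Math.min(1.0, u));
        double pos = clamped * (table.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = Math.min(lo + 1, table.length - 1);
        double frac = pos - lo;
        return table[lo] + frac * (table[hi] - table[lo]);
    }

    public static void main(String[] args) {
        InterpolatedSampler sampler = exponential(1.0, 1000);
        double exact = -Math.log1p(-0.5);
        System.out.printf("interpolated=%.8f computed=%.8f%n",
                sampler.sample(0.5), exact);
    }
}
```

After construction, every sample costs one multiplication and two array reads regardless of how expensive the underlying function is, which is why the choice of distribution stops affecting runtime. In this sketch the interpolation error is small in the body of the curve and grows in the steep tail near 1.0.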
Computed Samples
Conversely, compute mode sampling calls the sampling function every
time a sample is needed. This affords a little more accuracy, but is generally
not preferable to the default interpolated mode. You’ll know if you need
computed samples. Otherwise, it’s best to stick with interpolation so that
you spend more time testing your target system and less time testing
your data generation functions.
Input Range
All of these functions take a long as the input value for sampling. This is similar to how the unit interval (0.0,1.0) is used in mathematics and statistics, but more tailored to modern system capabilities. Instead of using the unit interval, we simply use the interval of all positive longs. This provides more compatibility with other functions in VirtData, including hashing functions.
Beta
See Wikipedia: Beta distribution
See Commons JavaDoc: BetaDistribution
- int -> Beta(double: alpha, double: beta, String… mods) -> double
- long -> Beta(double: alpha, double: beta, String… mods) -> double
Binomial
See Wikipedia: Binomial distribution
See Commons JavaDoc: BinomialDistribution
- int -> Binomial(int: trials, double: p, String… modslist) -> int
- int -> Binomial(int: trials, double: p, String… modslist) -> long
- long -> Binomial(int: trials, double: p, String… modslist) -> int
- long -> Binomial(int: trials, double: p, String… modslist) -> long
Cauchy
See Wikipedia: Cauchy distribution
See Commons JavaDoc: CauchyDistribution
- int -> Cauchy(double: median, double: scale, String… mods) -> double
- long -> Cauchy(double: median, double: scale, String… mods) -> double
ChiSquared
See Wikipedia: Chi-squared distribution
See Commons JavaDoc: ChiSquaredDistribution
- int -> ChiSquared(double: degreesOfFreedom, String… mods) -> double
- long -> ChiSquared(double: degreesOfFreedom, String… mods) -> double
ConstantContinuous
Always yields the same value.
See Commons JavaDoc: ConstantContinuousDistribution
- int -> ConstantContinuous(double: value, String… mods) -> double
- long -> ConstantContinuous(double: value, String… mods) -> double
Enumerated
Creates a probability density given the values and optional weights provided, in “value:weight value:weight …” form. The weight can be elided for any value to use the default weight of 1.0d.
See Commons JavaDoc: EnumeratedRealDistribution
- int -> Enumerated(String: data, String… mods) -> double
  - ex: Enumerated('1 2 3 4 5 6') - a fair six-sided die roll
  - ex: Enumerated('1:2.0 2 3 4 5 6') - an unfair six-sided die roll, where 1 has weight 2.0, and everything else has only 1.0
- long -> Enumerated(String: data, String… mods) -> double
  - ex: Enumerated('1 2 3 4 5 6') - a fair six-sided die roll
  - ex: Enumerated('1:2.0 2 3 4 5:0.5 6:0.5') - an unfair six-sided die roll, where ones are twice as likely, and fives and sixes are half as likely
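To make the “value:weight” form concrete, here is a small sketch of how such a spec string could be turned into normalized probabilities. It assumes whitespace-separated tokens, a default weight of 1.0 when the weight is elided, and normalization by the sum of all weights; the class EnumeratedSpec and its parse method are hypothetical and are not part of VirtData or Commons Math.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class EnumeratedSpec {

    // Parse "value:weight value:weight ..." into a map of value -> probability.
    // A token without ":weight" gets the default weight of 1.0.
    static Map<Double, Double> parse(String data) {
        Map<Double, Double> weights = new LinkedHashMap<>();
        double total = 0.0;
        for (String token : data.trim().split("\\s+")) {
            String[] parts = token.split(":", 2);
            double value = Double.parseDouble(parts[0]);
            double weight = parts.length == 2 ? Double.parseDouble(parts[1]) : 1.0;
            weights.merge(value, weight, Double::sum); // repeated values accumulate
            total += weight;
        }
        // Normalize so that the probabilities sum to 1.0.
        final double sum = total;
        weights.replaceAll((value, weight) -> weight / sum);
        return weights;
    }

    public static void main(String[] args) {
        System.out.println(parse("1:2.0 2 3 4 5 6"));
    }
}
```

With the spec '1:2.0 2 3 4 5 6' the weights total 7.0, so the value 1 ends up with probability 2/7 and each other face with 1/7.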
Exponential
See Wikipedia: Exponential distribution
See Commons JavaDoc: ExponentialDistribution
- int -> Exponential(double: mean, String… mods) -> double
- long -> Exponential(double: mean, String… mods) -> double
F
See Commons JavaDoc: FDistribution
- int -> F(double: numeratorDegreesOfFreedom, double: denominatorDegreesOfFreedom, String… mods) -> double
- long -> F(double: numeratorDegreesOfFreedom, double: denominatorDegreesOfFreedom, String… mods) -> double
Gamma
See Wikipedia: Gamma distribution
See Commons JavaDoc: GammaDistribution
- int -> Gamma(double: shape, double: scale, String… mods) -> double
- long -> Gamma(double: shape, double: scale, String… mods) -> double
Geometric
See Wikipedia: Geometric distribution
See Commons JavaDoc: GeometricDistribution
- int -> Geometric(double: p, String… modslist) -> int
- int -> Geometric(double: p, String… modslist) -> long
- long -> Geometric(double: p, String… modslist) -> int
- long -> Geometric(double: p, String… modslist) -> long
Gumbel
See Wikipedia: Gumbel distribution
See Commons JavaDoc: GumbelDistribution
- int -> Gumbel(double: mu, double: beta, String… mods) -> double
- long -> Gumbel(double: mu, double: beta, String… mods) -> double
Hypergeometric
See Wikipedia: Hypergeometric distribution
See Commons JavaDoc: HypergeometricDistribution
- int -> Hypergeometric(int: populationSize, int: numberOfSuccesses, int: sampleSize, String… modslist) -> int
- int -> Hypergeometric(int: populationSize, int: numberOfSuccesses, int: sampleSize, String… modslist) -> long
- long -> Hypergeometric(int: populationSize, int: numberOfSuccesses, int: sampleSize, String… modslist) -> int
- long -> Hypergeometric(int: populationSize, int: numberOfSuccesses, int: sampleSize, String… modslist) -> long
Laplace
See Wikipedia: Laplace distribution
See Commons JavaDoc: LaplaceDistribution
- int -> Laplace(double: mu, double: beta, String… mods) -> double
- long -> Laplace(double: mu, double: beta, String… mods) -> double
Levy
See Wikipedia: Lévy distribution
See Commons JavaDoc: LevyDistribution
- int -> Levy(double: mu, double: c, String… mods) -> double
- long -> Levy(double: mu, double: c, String… mods) -> double
LogNormal
See Wikipedia: Log-normal distribution
See Commons JavaDoc: LogNormalDistribution
- int -> LogNormal(double: scale, double: shape, String… mods) -> double
- long -> LogNormal(double: scale, double: shape, String… mods) -> double
Logistic
See Wikipedia: Logistic distribution
See Commons JavaDoc: LogisticDistribution
- int -> Logistic(double: mu, double: scale, String… mods) -> double
- long -> Logistic(double: mu, double: scale, String… mods) -> double
Nakagami
See Wikipedia: Nakagami distribution
See Commons JavaDoc: NakagamiDistribution
- int -> Nakagami(double: mu, double: omega, String… mods) -> double
- long -> Nakagami(double: mu, double: omega, String… mods) -> double
Normal
See Wikipedia: Normal distribution
See Commons JavaDoc: NormalDistribution
- int -> Normal(double: mean, double: sd, String… mods) -> double
- long -> Normal(double: mean, double: sd, String… mods) -> double
Pareto
See Wikipedia: Pareto distribution
See Commons JavaDoc: ParetoDistribution
- int -> Pareto(double: scale, double: shape, String… mods) -> double
- long -> Pareto(double: scale, double: shape, String… mods) -> double
Pascal
See Wikipedia: Negative binomial distribution
See Commons JavaDoc: PascalDistribution
- int -> Pascal(int: r, double: p, String… modslist) -> int
- int -> Pascal(int: r, double: p, String… modslist) -> long
- long -> Pascal(int: r, double: p, String… modslist) -> int
- long -> Pascal(int: r, double: p, String… modslist) -> long
Poisson
See Wikipedia: Poisson distribution
See Commons JavaDoc: PoissonDistribution
- int -> Poisson(double: p, String… modslist) -> int
- int -> Poisson(double: p, String… modslist) -> long
- long -> Poisson(double: p, String… modslist) -> int
- long -> Poisson(double: p, String… modslist) -> long
T
See Wikipedia: Student’s t-distribution
See Commons JavaDoc: TDistribution
- int -> T(double: degreesOfFreedom, String… mods) -> double
- long -> T(double: degreesOfFreedom, String… mods) -> double
Triangular
See Wikipedia: Triangular distribution
See Commons JavaDoc: TriangularDistribution
- int -> Triangular(double: a, double: c, double: b, String… mods) -> double
- long -> Triangular(double: a, double: c, double: b, String… mods) -> double
Uniform
See Wikipedia: Uniform distribution (continuous)
See Commons JavaDoc: UniformContinuousDistribution
- int -> Uniform(double: lower, double: upper, String… mods) -> double
- long -> Uniform(double: lower, double: upper, String… mods) -> double
- int -> Uniform(int: lower, int: upper, String… modslist) -> int
- int -> Uniform(int: lower, int: upper, String… modslist) -> long
- long -> Uniform(int: lower, int: upper, String… modslist) -> int
- long -> Uniform(int: lower, int: upper, String… modslist) -> long
Weibull
See Wikipedia: Weibull distribution
See Wolfram Mathworld: Weibull Distribution
See Commons JavaDoc: WeibullDistribution
- int -> Weibull(double: alpha, double: beta, String… mods) -> double
- long -> Weibull(double: alpha, double: beta, String… mods) -> double
Zipf
See Commons JavaDoc: ZipfDistribution
- int -> Zipf(int: numberOfElements, double: exponent, String… modslist) -> int
- int -> Zipf(int: numberOfElements, double: exponent, String… modslist) -> long
- long -> Zipf(int: numberOfElements, double: exponent, String… modslist) -> int
- long -> Zipf(int: numberOfElements, double: exponent, String… modslist) -> long