public final class HLLUtil extends Object
Constructor and Description |
---|
HLLUtil() |
Modifier and Type | Method and Description |
---|---|
static double | alphaMSquared(int m) - Computes the 'alpha-m-squared' constant used by the HyperLogLog algorithm. |
static double | largeEstimator(int log2m, int registerSizeInBits, double estimator) - The "large range correction" formula from the HyperLogLog algorithm, adapted for 64 bit hashes. |
static double | largeEstimatorCutoff(int log2m, int registerSizeInBits) - The cutoff for using the "large range correction" formula, from the HyperLogLog algorithm, adapted for 64 bit hashes. |
static long | pwMaxMask(int registerSizeInBits) - Computes a mask that prevents overflow of HyperLogLog registers. |
static int | registerBitSize(long expectedUniqueElements) - Computes the bit-width of HLL registers necessary to estimate a set of the specified cardinality. |
static double | smallEstimator(int m, int numberOfZeroes) - The "small range correction" formula from the HyperLogLog algorithm. |
static double | smallEstimatorCutoff(int m) - The cutoff for using the "small range correction" formula, in the HyperLogLog algorithm. |
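
The cutoffs and corrections listed in the summary above are normally applied in sequence to a raw estimate. The following is a minimal sketch of that selection logic, not the HLLUtil source. It follows the formulas from the HyperLogLog paper, with the paper's 2^32 replaced by a caller-supplied hash-space size. The class RangeCorrectionSketch, its method names, and the hashSpace parameter are hypothetical; how HLLUtil derives the hash-space size from log2m and registerSizeInBits is not covered here.

```java
// Hypothetical sketch: range correction selection as described in the HyperLogLog paper.
// None of these names belong to HLLUtil; hashSpace stands in for 2^L (2^32 in the paper).
final class RangeCorrectionSketch {

    /** Small range cutoff: (5/2) * m, where m is the number of registers. */
    static double smallCutoff(int m) {
        return 5.0 * m / 2.0;
    }

    /** Small range correction: m * ln(m / V), where V is the number of zero registers. */
    static double smallCorrection(int m, int numberOfZeroes) {
        return m * Math.log((double) m / numberOfZeroes);
    }

    /** Large range cutoff: one thirtieth of the hash space (assumed form, after the paper). */
    static double largeCutoff(double hashSpace) {
        return hashSpace / 30.0;
    }

    /** Large range correction: -2^L * ln(1 - E / 2^L). Only meaningful for E < hashSpace. */
    static double largeCorrection(double hashSpace, double estimator) {
        return -hashSpace * Math.log(1.0 - estimator / hashSpace);
    }

    /** Picks the correction that applies to the raw estimate, or returns it unchanged. */
    static double correct(int m, int numberOfZeroes, double estimator, double hashSpace) {
        if (estimator < smallCutoff(m) && numberOfZeroes > 0) {
            return smallCorrection(m, numberOfZeroes);
        }
        if (estimator <= largeCutoff(hashSpace)) {
            return estimator; // mid-range estimates need no correction
        }
        return largeCorrection(hashSpace, estimator);
    }

    public static void main(String[] args) {
        double hashSpace = Math.pow(2, 32); // assumption: the paper's 32 bit hash space
        System.out.println(correct(16, 8, 20.0, hashSpace));   // small range branch
        System.out.println(correct(16, 0, 1000.0, hashSpace)); // uncorrected branch
        System.out.println(correct(16, 0, 1500.0, 3000.0));    // large range branch
    }
}
```

The small range branch requires both conditions: an estimate below the cutoff and at least one register still at zero, since the logarithm is undefined when no zero registers remain.
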

public static int registerBitSize(long expectedUniqueElements)
Parameters:
expectedUniqueElements - an upper bound on the number of unique elements that are expected. This must be greater than zero.
Returns:
the register size in bits (i.e. log2(log2(n)))

public static double alphaMSquared(int m)
Parameters:
m - this must be a power of two, cannot be less than 16 (2^4), and cannot be greater than 65536 (2^16).
Returns:
gamma times registerCount squared, where gamma is based on the value of registerCount (that is, m).
Throws:
IllegalArgumentException - if registerCount is less than 16.

public static long pwMaxMask(int registerSizeInBits)
Parameters:
registerSizeInBits - the size of the HLL registers, in bits.
Returns:
a long mask to prevent overflow of the registers
See Also:
registerBitSize(long)
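
As an illustration of the register-width and alpha-m-squared computations documented above, the sketch below re-derives both from their stated contracts. It is not the HLLUtil source. The class HllConstantsSketch is hypothetical, the gamma constants are the ones given in the HyperLogLog paper, and rounding log2(log2(n)) up with a minimum width of 1 bit is an assumption.

```java
// Hypothetical sketch: re-derives two HLLUtil computations from their documented contracts.
final class HllConstantsSketch {

    /**
     * gamma * m^2, with gamma taken from the HyperLogLog paper.
     * The power-of-two and upper-bound requirements on m are not checked here.
     */
    static double alphaMSquared(int m) {
        if (m < 16) {
            throw new IllegalArgumentException("m cannot be less than 16, got " + m);
        }
        final double gamma;
        if (m == 16) {
            gamma = 0.673;
        } else if (m == 32) {
            gamma = 0.697;
        } else if (m == 64) {
            gamma = 0.709;
        } else {
            gamma = 0.7213 / (1.0 + 1.079 / m);
        }
        return gamma * m * m;
    }

    /**
     * Register width of roughly log2(log2(n)) bits.
     * Assumptions: the result is rounded up and never smaller than 1 bit,
     * and non-positive inputs are rejected.
     */
    static int registerBitSize(long expectedUniqueElements) {
        if (expectedUniqueElements <= 0) {
            throw new IllegalArgumentException("expectedUniqueElements must be greater than zero");
        }
        double log2n = Math.log(expectedUniqueElements) / Math.log(2.0);
        double bits = Math.ceil(Math.log(log2n) / Math.log(2.0));
        return (int) Math.max(1.0, bits);
    }

    public static void main(String[] args) {
        System.out.println(alphaMSquared(2048));
        System.out.println(registerBitSize(1_000_000_000L));
    }
}
```

The double logarithm is why a few bits per register are enough even for very large sets: a register only has to count up to about log2(n), and storing that count takes about log2(log2(n)) bits.
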

public static double smallEstimatorCutoff(int m)
Parameters:
m - the number of registers in the HLL. m in the paper.
See Also:
smallEstimator(int, int)

public static double smallEstimator(int m, int numberOfZeroes)
Only appropriate if the estimator is smaller than (5/2) * m and there are still registers that have the zero value.
Parameters:
m - the number of registers in the HLL. m in the paper.
numberOfZeroes - the number of registers with value zero. V in the paper.

public static double largeEstimatorCutoff(int log2m, int registerSizeInBits)
Parameters:
log2m - log-base-2 of the number of registers in the HLL. b in the paper.
registerSizeInBits - the size of the HLL registers, in bits.
See Also:
largeEstimator(int, int, double), Blog post with section on 64 bit hashes and "large range correction" cutoff

public static double largeEstimator(int log2m, int registerSizeInBits, double estimator)
Only appropriate for estimators whose value exceeds the value returned by largeEstimatorCutoff(int, int).
Parameters:
log2m - log-base-2 of the number of registers in the HLL. b in the paper.
registerSizeInBits - the size of the HLL registers, in bits.
estimator - the original estimator ("E" in the paper).

Copyright © 2016. All rights reserved.