private static class BM25Similarity.BM25Scorer extends Similarity.SimScorer
Modifier and Type | Field and Description |
---|---|
private float | avgdl: The average document length. |
private float | b: b value for length normalization impact |
private float | boost: query boost |
private float[] | cache: precomputed norm[256] with k1 * ((1 - b) + b * dl / avgdl) |
private Explanation | idf: BM25's idf |
private float | k1: k1 value for scale factor |
private float | weight: weight (idf * boost) |
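The cache field description above gives the formula k1 * ((1 - b) + b * dl / avgdl) for each of the 256 possible norm values. The following is a minimal sketch of how such a table could be filled, not the class's actual code. The class name NormCacheSketch, the method buildCache, and especially decodeLength are hypothetical: how an encoded norm maps back to a document length dl is not shown on this page, so an identity mapping stands in for it.

```java
public class NormCacheSketch {
    // Hypothetical stand-in for turning a one-byte norm into a document length.
    // The real decoding is not described on this page; identity is used for illustration only.
    static float decodeLength(int encoded) {
        return encoded;
    }

    // Fills a 256-entry table with k1 * ((1 - b) + b * dl / avgdl),
    // following the formula quoted in the field description.
    static float[] buildCache(float k1, float b, float avgdl) {
        float[] cache = new float[256];
        for (int i = 0; i < cache.length; i++) {
            float dl = decodeLength(i);
            cache[i] = k1 * ((1 - b) + b * dl / avgdl);
        }
        return cache;
    }

    public static void main(String[] args) {
        // Parameter values are made up for the example.
        float[] cache = buildCache(1.2f, 0.75f, 100f);
        System.out.println("entry for an average-length document: " + cache[100]);
        System.out.println("entry for the longest encodable document: " + cache[255]);
    }
}
```

Precomputing the table means the per-document work at scoring time reduces to one array lookup on the low byte of the norm, which is presumably why the constructor takes the array as an argument.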
Constructor and Description |
---|
BM25Scorer(float boost, float k1, float b, Explanation idf, float avgdl, float[] cache) |
Modifier and Type | Method and Description |
---|---|
Explanation | explain(Explanation freq, long encodedNorm): Explain the score for a single document |
private java.util.List<Explanation> | explainConstantFactors() |
private Explanation | explainTF(Explanation freq, long norm) |
float | score(float freq, long encodedNorm): Score a single document. |
private final float boost
private final float k1
private final float b
private final Explanation idf
private final float avgdl
private final float[] cache
private final float weight
BM25Scorer(float boost, float k1, float b, Explanation idf, float avgdl, float[] cache)
public float score(float freq, long encodedNorm)
Description copied from class: Similarity.SimScorer
Score a single document. freq is the document-term sloppy frequency and must be finite and positive. norm is the encoded normalization factor as computed by Similarity.computeNorm(FieldInvertState) at index time, or 1 if norms are disabled. norm is never 0.
Score must not decrease when freq increases, i.e. if freq1 > freq2, then score(freq1, norm) >= score(freq2, norm) for any value of norm that may be produced by Similarity.computeNorm(FieldInvertState).
Score must not increase when the unsigned norm increases, i.e. if Long.compareUnsigned(norm1, norm2) > 0 then score(freq, norm1) <= score(freq, norm2) for any legal freq.
As a consequence, the maximum score that this scorer can produce is bounded by score(Float.MAX_VALUE, 1).
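The monotonicity requirements above can be illustrated with a small sketch of a BM25-style term score. This is an assumption-laden illustration, not the body of BM25Scorer.score: the class Bm25ScoreSketch and its score method are hypothetical, the length normalization value is passed in directly instead of being looked up from an encoded norm, and the formula is the textbook saturation weight * freq / (freq + norm), rewritten as weight - weight / (1 + freq / norm) so that every floating-point step moves in one direction only.

```java
public class Bm25ScoreSketch {
    // Textbook-style saturation, algebraically equal to weight * freq / (freq + norm).
    // norm here is a length normalization such as k1 * ((1 - b) + b * dl / avgdl),
    // not the encoded long that the real method receives.
    static float score(float weight, float freq, float norm) {
        return weight - weight / (1f + freq / norm);
    }

    public static void main(String[] args) {
        float weight = 2.0f * 1.5f; // idf * boost, both values made up
        float k1 = 1.2f, b = 0.75f, avgdl = 100f;
        float shortDoc = k1 * ((1 - b) + b * 50f / avgdl);  // document of length 50
        float longDoc = k1 * ((1 - b) + b * 200f / avgdl);  // document of length 200

        // A higher frequency never lowers the score.
        System.out.println(score(weight, 1f, shortDoc) <= score(weight, 5f, shortDoc));
        // A longer document never raises the score.
        System.out.println(score(weight, 3f, longDoc) <= score(weight, 3f, shortDoc));
        // Even at the largest finite frequency the score stays within the weight.
        System.out.println(score(weight, Float.MAX_VALUE, shortDoc) <= weight);
    }
}
```

In this form the score approaches weight as freq grows, which matches the statement that the largest-frequency, smallest-norm input bounds everything the scorer can return.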
Specified by: score in class Similarity.SimScorer
Parameters:
freq - sloppy term frequency, must be finite and positive
encodedNorm - encoded normalization factor or 1 if norms are disabled
public Explanation explain(Explanation freq, long encodedNorm)
Description copied from class: Similarity.SimScorer
Explain the score for a single document
Overrides: explain in class Similarity.SimScorer
Parameters:
freq - Explanation of how the sloppy term frequency was computed
encodedNorm - encoded normalization factor, as returned by Similarity.computeNorm(org.apache.lucene.index.FieldInvertState), or 1 if norms are disabled
private Explanation explainTF(Explanation freq, long norm)
private java.util.List<Explanation> explainConstantFactors()