This module contains classes for scoring (and sorting) search results.
Abstract base class for scoring models. A WeightingModel object provides a method, scorer, which returns an instance of whoosh.scoring.Scorer.
Basically, WeightingModel objects store the configuration information for the model (for example, the values of B and K1 in the BM25F model), and then create a scorer instance based on additional run-time information (the searcher, the fieldname, and term text) to do the actual scoring.
Returns a final score for each document. You can use this method in subclasses to apply document-level adjustments to the score, for example using the value of a stored field to influence the score (although that would be slow).
WeightingModel sub-classes that use final() should have the attribute use_final set to True.
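A minimal sketch of a document-level adjustment, written with plain Python stand-ins rather than Whoosh classes. `BoostedModel` and its `boosts` mapping are hypothetical, and the sketch assumes the hook receives the searcher, the document number, and the accumulated score.

```python
class BoostedModel:
    """Hypothetical stand-in for a WeightingModel subclass."""

    # Signals that final() should be called for each matched document.
    use_final = True

    def __init__(self, boosts):
        # boosts: hypothetical mapping of document number -> multiplier
        self.boosts = boosts

    def final(self, searcher, docnum, score):
        # Apply a document-level adjustment to the accumulated score.
        return score * self.boosts.get(docnum, 1.0)


model = BoostedModel({7: 2.0})
adjusted = model.final(None, 7, 1.5)    # document 7 has a boost
unchanged = model.final(None, 3, 1.5)   # document 3 has none
```

Keeping the lookup in an in-memory mapping avoids reading stored fields per document, which is the slow path the description warns about.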
Return type: float
Returns the inverse document frequency of the given term.
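As an illustration of what an inverse document frequency measures, here is one common smoothed formulation in plain Python. The function name and the add-one smoothing are assumptions made for this sketch; the exact formula the library uses is not stated here.

```python
import math


def smoothed_idf(doc_count, doc_freq):
    """Inverse document frequency with add-one smoothing.

    doc_count: total number of documents in the collection
    doc_freq: number of documents that contain the term
    """
    return math.log(doc_count / (doc_freq + 1)) + 1


rare = smoothed_idf(1000, 1)      # term found in a single document
common = smoothed_idf(1000, 499)  # term found in about half the documents
```

Rare terms get a larger value than common ones, so they contribute more to a document's score.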
Returns an instance of whoosh.scoring.Scorer configured for the given searcher, fieldname, and term text.
Base class for “scorer” implementations. A scorer provides a method for scoring a document, and sometimes methods for rating the “quality” of a document and a matcher’s current “block”, to implement quality-based optimizations.
Scorer objects are created by WeightingModel objects. Basically, WeightingModel objects store the configuration information for the model (for example, the values of B and K1 in the BM25F model), and then create a scorer instance.
Returns the maximum possible score the matcher can give in its current “block” (whatever concept of “block” the backend might use). If this score is less than the minimum score required to make the “top N” results, then we can tell the matcher to skip ahead to another block with better “quality”.
Returns a score for the current document of the matcher.
Returns True if this class supports quality optimizations.
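The block-skipping idea behind the quality methods can be sketched in plain Python. This is not how the library implements it: the `blocks` structure and the `top_n` function are hypothetical stand-ins, where each block carries an upper bound on the scores it can produce.

```python
import heapq


def top_n(blocks, n):
    """Return the n best (score, docnum) pairs and the number of skipped blocks.

    blocks: list of (max_quality, postings), where postings is a list of
    (docnum, score) and max_quality is an upper bound on scores in the block.
    """
    heap = []      # min-heap holding the best (score, docnum) pairs so far
    skipped = 0
    for max_quality, postings in blocks:
        # Once the heap is full, its smallest score is the bar to beat.
        # A block that cannot exceed that bar is skipped without scoring.
        if len(heap) == n and max_quality <= heap[0][0]:
            skipped += 1
            continue
        for docnum, score in postings:
            if len(heap) < n:
                heapq.heappush(heap, (score, docnum))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, docnum))
    return sorted(heap, reverse=True), skipped


blocks = [
    (5.0, [(1, 5.0), (2, 3.0)]),
    (2.0, [(3, 2.0), (4, 1.0)]),   # cannot beat the current top 2
    (9.0, [(5, 9.0), (6, 0.5)]),
]
best, skipped = top_n(blocks, 2)
```

The second block is never scored because its bound of 2.0 is below the lowest score already in the top two.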
A scorer that simply returns the weight as the score. This is useful for more complex weighting models to return when they are asked for a scorer for fields that aren’t scorable (don’t store field lengths).
Base class for scorers where the only per-document variables are term weight and field length.
Subclasses should override the _score(weight, length) method to return the score for a document with the given weight and length, and call the setup() method at the end of the initializer to set up common attributes.
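The subclassing pattern can be sketched with a stand-in base class. `WeightLengthBase`, `DensityScorer`, and the arguments to `setup()` are hypothetical and chosen for illustration; the real base class takes its run-time information from the searcher, which this sketch does not model.

```python
class WeightLengthBase:
    """Hypothetical stand-in for the base class (not the library's code)."""

    def setup(self, max_weight, min_length):
        # Assumption: the best possible score comes from the highest
        # weight in the shortest field, so use it as a quality bound.
        self.max_quality = self._score(max_weight, min_length)

    def _score(self, weight, length):
        raise NotImplementedError


class DensityScorer(WeightLengthBase):
    """Scores a document by term weight per unit of field length."""

    def __init__(self, max_weight, min_length):
        # Call setup() last, once everything _score() needs is in place.
        self.setup(max_weight, min_length)

    def _score(self, weight, length):
        return weight / length


scorer = DensityScorer(max_weight=4.0, min_length=2)
```

Calling `setup()` at the end of the initializer matters because it invokes `_score()`, which may rely on attributes the initializer sets first.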
Implements the BM25F scoring algorithm.
>>> from whoosh import scoring
>>> # Set a custom B value for the "content" field
>>> w = scoring.BM25F(B=0.75, content_B=1.0, K1=1.5)
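To show what B and K1 control, here is the textbook BM25 term score in plain Python. It is a sketch of the general formula, not the library's code; the function name is hypothetical, and the default values simply mirror the example above rather than any documented library defaults.

```python
def bm25_term_score(idf, weight, length, avg_length, B=0.75, K1=1.5):
    """Textbook BM25 score for one term in one field of one document."""
    # B controls how strongly the score is normalized by field length.
    norm = (1 - B) + B * (length / avg_length)
    # K1 controls how quickly repeated occurrences stop adding score.
    return idf * (weight * (K1 + 1)) / (weight + K1 * norm)


short_doc = bm25_term_score(idf=2.0, weight=3.0, length=50, avg_length=100)
long_doc = bm25_term_score(idf=2.0, weight=3.0, length=200, avg_length=100)
# With B=0.0 the field length no longer affects the score.
no_norm = bm25_term_score(idf=2.0, weight=3.0, length=200, avg_length=100, B=0.0)
```

With the same term weight, the shorter field scores higher unless B is 0, which is why a per-field B value can be useful for fields whose length carries no meaning.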
Uses a supplied function to do the scoring. For simple scoring functions and experiments this may be simpler to use than writing a full weighting model class and scorer class.
The function should accept the arguments searcher, fieldname, text, matcher.
For example, the following function will score documents based on the earliest position of the query term in the document:
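from whoosh import scoring

def pos_score_fn(searcher, fieldname, text, matcher):
    # Positions at which the term occurs in the current document
    poses = matcher.value_as("positions")
    # An earlier first occurrence gives a higher score
    return 1.0 / (poses[0] + 1)

pos_weighting = scoring.FunctionWeighting(pos_score_fn)
# myindex and q are an existing index and a parsed query
with myindex.searcher(weighting=pos_weighting) as s:
    results = s.search(q)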
Note that the searcher passed to the function may be a per-segment searcher for performance reasons. If you want to get global statistics inside the function, you should use searcher.get_parent() to get the top-level searcher. (However, if you are using global statistics, you should probably write a real model/scorer combo so you can cache them on the object.)
Chooses from multiple scoring algorithms based on the field.
The only non-keyword argument specifies the default Weighting instance to use. Keyword arguments specify Weighting instances for specific fields.
For example, to use BM25F for most fields, but Frequency for the id field and TF_IDF for the keys field:
mw = MultiWeighting(BM25F(), id=Frequency(), keys=TF_IDF())
Parameters: default – the Weighting instance to use for fields not specified in the keyword arguments.
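The default-plus-overrides lookup can be sketched in a few lines of plain Python. `PerFieldChooser` is a hypothetical stand-in, and strings take the place of weighting instances to keep the sketch self-contained.

```python
class PerFieldChooser:
    """Hypothetical stand-in showing default-plus-overrides dispatch."""

    def __init__(self, default, **per_field):
        self.default = default
        self.per_field = per_field

    def model_for(self, fieldname):
        # Fall back to the default when the field has no override.
        return self.per_field.get(fieldname, self.default)


chooser = PerFieldChooser("bm25f", id="frequency", keys="tf_idf")
id_model = chooser.model_for("id")
body_model = chooser.model_for("body")
```

Because overrides are passed as keyword arguments, this style only works for field names that are valid Python identifiers.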
Wraps a weighting object and subtracts the wrapped model’s scores from 0, essentially reversing the weighting model.
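The effect of subtracting scores from 0 can be seen with a small ranking in plain Python. The `scores` mapping and `reversed_score` function are hypothetical and exist only for this sketch.

```python
def reversed_score(score):
    # Subtracting from 0 flips the ordering: the best score becomes the lowest.
    return 0 - score


scores = {"a": 3.0, "b": 1.0, "c": 2.0}
# Highest score first, as a search result list would be ordered.
normal = sorted(scores, key=scores.get, reverse=True)
flipped = sorted(scores, key=lambda d: reversed_score(scores[d]), reverse=True)
```

The documents that ranked last under the wrapped model rank first after the reversal.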