final class IndexedDISI extends DocIdSetIterator
A DocIdSetIterator which can return the index of the current document, i.e. the ordinal of the current document among the list of documents that this iterator can return. This is useful to implement sparse doc values by only having to encode values for documents that actually have a value.
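To illustrate why the ordinal is useful, here is a minimal, self-contained sketch that does not use any Lucene classes. The class `SparseValuesSketch` and its members are hypothetical names chosen for this example; they only model the idea that values are stored densely, one per document that has a value, and are addressed by ordinal rather than by doc ID.

```java
import java.util.Arrays;

/** Hypothetical model, not Lucene code: values exist only for documents that have one. */
final class SparseValuesSketch {
    private final int[] docsWithValue; // sorted, distinct doc IDs that have a value
    private final long[] values;       // values[i] belongs to docsWithValue[i]

    SparseValuesSketch(int[] docsWithValue, long[] values) {
        if (docsWithValue.length != values.length) {
            throw new IllegalArgumentException("one value per document is required");
        }
        this.docsWithValue = docsWithValue;
        this.values = values;
    }

    /** Ordinal of doc among the documents that have a value, or -1 if it has none. */
    int indexOf(int doc) {
        int i = Arrays.binarySearch(docsWithValue, doc);
        return i >= 0 ? i : -1;
    }

    /** Value for doc, or defaultValue when the document has no value. */
    long valueOrDefault(int doc, long defaultValue) {
        int index = indexOf(doc);
        return index >= 0 ? values[index] : defaultValue;
    }
}
```

In this model no storage is spent on documents without a value; the ordinal is the only thing needed to find a value in the dense array.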
Implementation-wise, this DocIdSetIterator is inspired by roaring bitmaps. It encodes ranges of 65536 documents independently and picks between 3 encodings depending on the density of the range:
- ALL if the range contains 65536 documents exactly,
- DENSE if the range contains 4096 documents or more; in that case documents are stored in a bit set,
- SPARSE otherwise, and the lower 16 bits of the doc IDs are stored in a short.
Only ranges that contain at least one value are encoded.
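The density rule above can be sketched as a small selection function. This is an illustration under stated assumptions, not the Lucene implementation: `BlockEncodingSketch`, its nested `Encoding` enum, and the helper names are hypothetical, and only the thresholds (65536 and 4096) come from the description.

```java
/** Hypothetical sketch of the density rule described above; not the Lucene implementation. */
final class BlockEncodingSketch {
    enum Encoding { ALL, DENSE, SPARSE }

    static final int BLOCK_SIZE = 65536;     // documents per range
    static final int DENSE_THRESHOLD = 4096; // ranges with at least this many documents use a bit set

    /** Picks an encoding from the number of documents present in one 65536-document range. */
    static Encoding choose(int cardinality) {
        if (cardinality < 1 || cardinality > BLOCK_SIZE) {
            // Empty ranges are never encoded, so there is nothing to choose for them.
            throw new IllegalArgumentException("cardinality must be in [1, 65536]: " + cardinality);
        }
        if (cardinality == BLOCK_SIZE) {
            return Encoding.ALL;
        }
        if (cardinality >= DENSE_THRESHOLD) {
            return Encoding.DENSE;
        }
        return Encoding.SPARSE;
    }

    /** Range number of a doc ID: the upper 16 bits. */
    static int blockOf(int doc) {
        return doc >>> 16;
    }

    /** Position of a doc ID inside its range: the lower 16 bits. */
    static int inBlock(int doc) {
        return doc & 0xFFFF;
    }
}
```

For example, doc ID 70000 falls in range 1 at position 4464, since 70000 - 65536 = 4464.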
This implementation uses 6 bytes per document in the worst case, which happens when all ranges contain exactly one document.
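The 6-byte figure can be checked with a little arithmetic. The sketch below assumes that each encoded range starts with a 4-byte header (a 2-byte range number and a 2-byte cardinality); that header layout is an assumption made for this example and is not stated in the text above. `BlockSizeSketch` and its members are hypothetical names.

```java
/** Hypothetical size model for one encoded range; the 4-byte header is an assumption. */
final class BlockSizeSketch {
    static final int HEADER_BYTES = 4; // assumed: 2-byte range number + 2-byte cardinality

    /** Estimated bytes used by one range that contains the given number of documents. */
    static long encodedBytes(int cardinality) {
        if (cardinality < 1 || cardinality > 65536) {
            throw new IllegalArgumentException("cardinality must be in [1, 65536]: " + cardinality);
        }
        if (cardinality == 65536) {
            return HEADER_BYTES;               // ALL: nothing beyond the header
        }
        if (cardinality >= 4096) {
            return HEADER_BYTES + 65536 / 8;   // DENSE: one bit per possible document
        }
        return HEADER_BYTES + 2L * cardinality; // SPARSE: one 16-bit value per document
    }

    /** Average bytes per document for one range. */
    static double bytesPerDocument(int cardinality) {
        return (double) encodedBytes(cardinality) / cardinality;
    }
}
```

Under these assumptions a range holding a single document costs 4 + 2 = 6 bytes, which matches the stated worst case, and the per-document cost falls as the range fills up.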
Modifier and Type | Class and Description |
---|---|
(package private) static class | IndexedDISI.Method |
Modifier and Type | Field and Description |
---|---|
private int | block |
private long | blockEnd |
private long | cost |
private int | doc |
(package private) boolean | exists |
private int | gap |
private int | index |
(package private) static int | MAX_ARRAY_LENGTH |
(package private) IndexedDISI.Method | method |
private int | nextBlockIndex |
private int | numberOfOnes |
private IndexInput | slice The slice that stores the DocIdSetIterator. |
private long | word |
private int | wordIndex |
Fields inherited from class DocIdSetIterator: NO_MORE_DOCS
Constructor and Description |
---|
IndexedDISI(IndexInput slice, long cost) |
IndexedDISI(IndexInput in, long offset, long length, long cost) |
Modifier and Type | Method and Description |
---|---|
int | advance(int target) Advances to the first document beyond the current one whose document number is greater than or equal to target, and returns the document number itself. |
private void | advanceBlock(int targetBlock) |
boolean | advanceExact(int target) |
long | cost() Returns the estimated cost of this DocIdSetIterator. |
int | docID() Returns the following: -1 if DocIdSetIterator.nextDoc() or DocIdSetIterator.advance(int) have not been called yet. |
private static void | flush(int block, FixedBitSet buffer, int cardinality, IndexOutput out) |
int | index() |
int | nextDoc() Advances to the next document in the set and returns the doc it is currently on, or DocIdSetIterator.NO_MORE_DOCS if there are no more docs in the set. NOTE: after the iterator is exhausted you should not call this method, as it may result in unpredictable behavior. |
private void | readBlockHeader() |
(package private) static void | writeBitSet(DocIdSetIterator it, IndexOutput out) |
Methods inherited from class DocIdSetIterator: all, empty, range, slowAdvance
static final int MAX_ARRAY_LENGTH
private final IndexInput slice
The slice that stores the DocIdSetIterator.
private final long cost
private int block
private long blockEnd
private int nextBlockIndex
IndexedDISI.Method method
private int doc
private int index
boolean exists
private long word
private int wordIndex
private int numberOfOnes
private int gap
IndexedDISI(IndexInput in, long offset, long length, long cost) throws java.io.IOException
Throws: java.io.IOException
IndexedDISI(IndexInput slice, long cost) throws java.io.IOException
Throws: java.io.IOException
private static void flush(int block, FixedBitSet buffer, int cardinality, IndexOutput out) throws java.io.IOException
Throws: java.io.IOException
static void writeBitSet(DocIdSetIterator it, IndexOutput out) throws java.io.IOException
Throws: java.io.IOException
public int docID()
Description copied from class: DocIdSetIterator
Returns the following:
- -1 if DocIdSetIterator.nextDoc() or DocIdSetIterator.advance(int) have not been called yet.
- DocIdSetIterator.NO_MORE_DOCS if the iterator is exhausted.
Specified by: docID in class DocIdSetIterator
public int advance(int target) throws java.io.IOException
Description copied from class: DocIdSetIterator
Advances to the first document beyond the current one whose document number is greater than or equal to target, and returns the document number itself. Returns DocIdSetIterator.NO_MORE_DOCS if target is greater than the highest document number in the set.
The behavior of this method is undefined when called with target ≤ current, or after the iterator is exhausted. Both cases may result in unpredictable behavior.
When target > current it behaves as if written:

    int advance(int target) {
      int doc;
      // Step one document at a time until the target is reached or passed.
      while ((doc = nextDoc()) < target) {
      }
      return doc;
    }

Some implementations are considerably more efficient than that.
NOTE: this method may be called with DocIdSetIterator.NO_MORE_DOCS for efficiency by some Scorers. If your implementation cannot efficiently determine that it should exhaust, it is recommended that you check for that value in each call to this method.
Specified by: advance in class DocIdSetIterator
Throws: java.io.IOException
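The advance(int) contract can be exercised with a small in-memory iterator. This is a sketch under stated assumptions rather than Lucene code: `SortedArrayIteratorSketch` is a hypothetical class that iterates over a sorted array, and its `NO_MORE_DOCS` constant mirrors the value of DocIdSetIterator.NO_MORE_DOCS (Integer.MAX_VALUE).

```java
/** Hypothetical in-memory iterator that follows the contract described above; not Lucene code. */
final class SortedArrayIteratorSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE; // same value as DocIdSetIterator.NO_MORE_DOCS

    private final int[] docs; // sorted, distinct doc IDs
    private int pos = -1;     // position of the current document in docs
    private int doc = -1;     // -1 until nextDoc() or advance(int) is called

    SortedArrayIteratorSketch(int[] sortedDocs) {
        this.docs = sortedDocs;
    }

    int docID() {
        return doc;
    }

    /** Moves to the next document, or to NO_MORE_DOCS when the array is used up. */
    int nextDoc() {
        pos++;
        doc = pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        return doc;
    }

    /** Reference behavior: step with nextDoc() until the target is reached or passed. */
    int advance(int target) {
        int d;
        while ((d = nextDoc()) < target) {
            // keep stepping; the loop ends because NO_MORE_DOCS is never below target
        }
        return d;
    }
}
```

The loop always terminates because the sentinel is the largest int value, so advancing past the last document simply lands on NO_MORE_DOCS. A real implementation would usually skip ahead instead of stepping through every document.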
public boolean advanceExact(int target) throws java.io.IOException
Throws: java.io.IOException
private void advanceBlock(int targetBlock) throws java.io.IOException
Throws: java.io.IOException
private void readBlockHeader() throws java.io.IOException
Throws: java.io.IOException
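public int nextDoc() throws java.io.IOException
Description copied from class: DocIdSetIterator
Advances to the next document in the set and returns the doc it is currently on, or DocIdSetIterator.NO_MORE_DOCS if there are no more docs in the set.
Specified by: nextDoc in class DocIdSetIterator
Throws: java.io.IOException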
public int index()
public long cost()
Description copied from class: DocIdSetIterator
Returns the estimated cost of this DocIdSetIterator.
This is generally an upper bound of the number of documents this iterator might match, but may be a rough heuristic, hardcoded value, or otherwise completely inaccurate.
Specified by: cost in class DocIdSetIterator