org.apache.lucene.index
public abstract class IndexReader extends Object
Concrete subclasses of IndexReader are usually constructed with a call to one of the static open() methods, e.g. {@link #open(String)}.
For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral--they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.
An IndexReader can be opened on a directory for which an IndexWriter is already open, but in that case it cannot be used to delete documents from the index.
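The point about ephemeral document numbers can be illustrated with a small in-memory model. This is a sketch only, not Lucene code: the class `DocNumberSketch` and its `compact()` step are hypothetical stand-ins for whatever reorganization an index performs, and the application-level ID strings are an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Lucene code): document numbers are slots, not stable identifiers. */
class DocNumberSketch {
    // Slot index = document number; value = application-level ID (hypothetical).
    private final List<String> ids = new ArrayList<>();
    private final List<Boolean> deleted = new ArrayList<>();

    /** Adds a document and returns the number it currently has. */
    int add(String id) {
        ids.add(id);
        deleted.add(false);
        return ids.size() - 1;
    }

    /** Marks a document as deleted; its slot stays occupied. */
    void delete(int docNum) {
        deleted.set(docNum, true);
    }

    /** One greater than the largest possible document number. */
    int maxDoc() {
        return ids.size();
    }

    /** Number of documents that are not marked deleted. */
    int numDocs() {
        int n = 0;
        for (boolean d : deleted) {
            if (!d) n++;
        }
        return n;
    }

    /** Hypothetical reorganization: drops deleted slots, so later documents are renumbered. */
    void compact() {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < ids.size(); i++) {
            if (!deleted.get(i)) live.add(ids.get(i));
        }
        ids.clear();
        ids.addAll(live);
        deleted.clear();
        for (int i = 0; i < ids.size(); i++) deleted.add(false);
    }

    /** Current number of the live document with the given ID, or -1 if absent. */
    int docNumOf(String id) {
        for (int i = 0; i < ids.size(); i++) {
            if (!deleted.get(i) && ids.get(i).equals(id)) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        DocNumberSketch index = new DocNumberSketch();
        index.add("a");
        index.add("b");
        index.add("c");
        index.delete(0);
        System.out.println(index.docNumOf("c")); // 2
        index.compact();
        System.out.println(index.docNumOf("c")); // 1
    }
}
```

In this model the document "c" changes from number 2 to number 1 once the deleted slot is reclaimed, which is why a client that needs a durable handle would keep its own ID rather than a document number.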
Version: $Id: IndexReader.java 358685 2005-12-23 02:38:23Z yonik $
Nested Class Summary | |
---|---|
static class | IndexReader.FieldOption |
Constructor Summary | |
---|---|
protected | IndexReader(Directory directory) Constructor used if IndexReader is not owner of its directory. |
Method Summary | |
---|---|
void | close() Closes files associated with this index. |
protected void | commit() Commit changes resulting from delete, undeleteAll, or setNorm operations. |
void | delete(int docNum) Deletes the document numbered docNum. |
int | delete(Term term) Deletes all documents containing term. |
void | deleteDocument(int docNum) Deletes the document numbered docNum. |
int | deleteDocuments(Term term) Deletes all documents containing term. |
Directory | directory() Returns the directory this index resides in. |
abstract int | docFreq(Term t) Returns the number of documents containing the term t. |
abstract Document | document(int n) Returns the stored fields of the nth Document in this index. |
protected abstract void | doClose() Implements close. |
protected abstract void | doCommit() Implements commit. |
protected abstract void | doDelete(int docNum) Implements deletion of the document numbered docNum. |
protected abstract void | doSetNorm(int doc, String field, byte value) Implements setNorm in subclass. |
protected abstract void | doUndeleteAll() Implements actual undeleteAll() in subclass. |
protected void | finalize() Release the write lock, if needed. |
static long | getCurrentVersion(String directory) Reads version number from segments files. |
static long | getCurrentVersion(File directory) Reads version number from segments files. |
static long | getCurrentVersion(Directory directory) Reads version number from segments files. |
abstract Collection | getFieldNames() Returns a list of all unique field names that exist in the index pointed to by this IndexReader. |
abstract Collection | getFieldNames(boolean indexed) Returns a list of all unique field names that exist in the index pointed to by this IndexReader. |
abstract Collection | getFieldNames(IndexReader.FieldOption fldOption) Get a list of unique field names that exist in this index and have the specified field option information. |
Collection | getIndexedFieldNames(boolean storedTermVector) |
abstract Collection | getIndexedFieldNames(Field.TermVector tvSpec) Get a list of unique field names that exist in this index, are indexed, and have the specified term vector information. |
abstract TermFreqVector | getTermFreqVector(int docNumber, String field) Return a term frequency vector for the specified document and field. |
abstract TermFreqVector[] | getTermFreqVectors(int docNumber) Return an array of term frequency vectors for the specified document. |
long | getVersion() Version number when this IndexReader was opened. |
abstract boolean | hasDeletions() Returns true if any documents have been deleted. |
boolean | hasNorms(String field) Returns true if there are norms stored for this field. |
static boolean | indexExists(String directory) Returns true if an index exists at the specified directory. |
static boolean | indexExists(File directory) Returns true if an index exists at the specified directory. |
static boolean | indexExists(Directory directory) Returns true if an index exists at the specified directory. |
boolean | isCurrent() Check whether this IndexReader still works on a current version of the index. |
abstract boolean | isDeleted(int n) Returns true if document n has been deleted. |
static boolean | isLocked(Directory directory) Returns true iff the index in the named directory is currently locked. |
static boolean | isLocked(String directory) Returns true iff the index in the named directory is currently locked. |
static long | lastModified(String directory) Returns the time the index in the named directory was last modified. |
static long | lastModified(File directory) Returns the time the index in the named directory was last modified. |
static long | lastModified(Directory directory) Returns the time the index in the named directory was last modified. |
static void | main(String[] args) Prints the filename and size of each file within a given compound file. |
abstract int | maxDoc() Returns one greater than the largest possible document number. |
abstract byte[] | norms(String field) Returns the byte-encoded normalization factor for the named field of every document. |
abstract void | norms(String field, byte[] bytes, int offset) Reads the byte-encoded normalization factor for the named field of every document. |
abstract int | numDocs() Returns the number of documents in this index. |
static IndexReader | open(String path) Returns an IndexReader reading the index in an FSDirectory in the named path. |
static IndexReader | open(File path) Returns an IndexReader reading the index in an FSDirectory in the named path. |
static IndexReader | open(Directory directory) Returns an IndexReader reading the index in the given Directory. |
void | setNorm(int doc, String field, byte value) Expert: Resets the normalization factor for the named field of the named document. |
void | setNorm(int doc, String field, float value) Expert: Resets the normalization factor for the named field of the named document. |
TermDocs | termDocs(Term term) Returns an enumeration of all the documents which contain term. |
abstract TermDocs | termDocs() Returns an unpositioned {@link TermDocs} enumerator. |
TermPositions | termPositions(Term term) Returns an enumeration of all the documents which contain term. |
abstract TermPositions | termPositions() Returns an unpositioned {@link TermPositions} enumerator. |
abstract TermEnum | terms() Returns an enumeration of all the terms in the index. |
abstract TermEnum | terms(Term t) Returns an enumeration of all terms after a given term. |
void | undeleteAll() Undeletes all documents currently marked as deleted in this index. |
static void | unlock(Directory directory) Forcibly unlocks the index in the named directory. |
Parameters: directory Directory where IndexReader files reside.
Throws: IOException
Deprecated: Use {@link #deleteDocument(int docNum)} instead.
Deletes the document numbered docNum. Once a document is
deleted it will not appear in TermDocs or TermPositions enumerations.
Attempts to read its field with the {@link #document}
method will result in an error. The presence of this document may still be
reflected in the {@link #docFreq} statistic, though
this will be corrected eventually as the index is further modified.
Deprecated: Use {@link #deleteDocuments(Term term)} instead.
Deletes all documents containing term.
This is useful if one uses a document field to hold a unique ID string for
the document. Then to delete such a document, one merely constructs a
term with the appropriate field and the unique ID string as its text and
passes it to this method.
See {@link #delete(int)} for information about when this deletion will
become effective.
Returns: the number of documents deleted
Deletes the document numbered docNum. Once a document is
deleted it will not appear in TermDocs or TermPositions enumerations.
Attempts to read its field with the {@link #document}
method will result in an error. The presence of this document may still be
reflected in the {@link #docFreq} statistic, though
this will be corrected eventually as the index is further modified.
Deletes all documents containing term.
This is useful if one uses a document field to hold a unique ID string for
the document. Then to delete such a document, one merely constructs a
term with the appropriate field and the unique ID string as its text and
passes it to this method.
See {@link #delete(int)} for information about when this deletion will
become effective.
Returns: the number of documents deleted
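The deletion behaviour described above (delete by a unique-ID term, deleted documents vanishing from enumerations while docFreq lags behind) can be sketched with a toy in-memory model. This is not Lucene's implementation: `DeleteByIdSketch`, its string-keyed postings map, and the field name "id" are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model (not Lucene code) of deleting documents by a unique-ID term. */
class DeleteByIdSketch {
    // Key is "field:text"; value is the list of document numbers containing that term.
    private final Map<String, List<Integer>> postings = new HashMap<>();
    private final BitSet deleted = new BitSet();
    private int maxDoc = 0;

    private List<Integer> postingsFor(String field, String text) {
        List<Integer> p = postings.get(field + ":" + text);
        return p == null ? new ArrayList<Integer>() : p;
    }

    /** Adds a document whose hypothetical "id" field holds idValue; returns its number. */
    int addDocument(String idValue) {
        int docNum = maxDoc++;
        String key = "id:" + idValue;
        List<Integer> p = postings.get(key);
        if (p == null) {
            p = new ArrayList<Integer>();
            postings.put(key, p);
        }
        p.add(docNum);
        return docNum;
    }

    /** Marks every live document containing the term as deleted; returns how many were marked. */
    int deleteDocuments(String field, String text) {
        int n = 0;
        for (int docNum : postingsFor(field, text)) {
            if (!deleted.get(docNum)) {
                deleted.set(docNum);
                n++;
            }
        }
        return n;
    }

    /** Enumerates live documents containing the term; deleted ones no longer appear. */
    List<Integer> termDocs(String field, String text) {
        List<Integer> live = new ArrayList<>();
        for (int docNum : postingsFor(field, text)) {
            if (!deleted.get(docNum)) live.add(docNum);
        }
        return live;
    }

    /** In this model the statistic still counts deleted documents, as the text warns. */
    int docFreq(String field, String text) {
        return postingsFor(field, text).size();
    }

    public static void main(String[] args) {
        DeleteByIdSketch index = new DeleteByIdSketch();
        index.addDocument("doc-1");
        index.addDocument("doc-2");
        index.addDocument("doc-1"); // the same ID indexed a second time
        System.out.println(index.deleteDocuments("id", "doc-1")); // 2
        System.out.println(index.termDocs("id", "doc-1"));        // []
        System.out.println(index.docFreq("id", "doc-1"));         // 2
    }
}
```

A typical use of this pattern is replacing a document: delete by its ID term, then add the new version.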
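Returns the number of documents containing the term t.
Returns the stored fields of the nth Document in this index.
Implements deletion of the document numbered docNum.
Applications should call {@link #delete(int)} or {@link #delete(Term)}.
Parameters: directory where the index resides.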
Returns: version number.
Throws: IOException if segments file cannot be read
Parameters: directory where the index resides.
Returns: version number.
Throws: IOException if segments file cannot be read
Parameters: directory where the index resides.
Returns: version number.
Throws: IOException if segments file cannot be read.
Deprecated: Replaced by {@link #getFieldNames(IndexReader.FieldOption)}
Returns a list of all unique field names that exist in the index pointed to by this IndexReader.
Returns: Collection of Strings indicating the names of the fields
Throws: IOException if there is a problem with accessing the index
Deprecated: Replaced by {@link #getFieldNames(IndexReader.FieldOption)}
Returns a list of all unique field names that exist in the index pointed to by this IndexReader. The boolean argument specifies whether the fields returned are indexed or not.
Parameters: indexed true if only indexed fields should be returned; false if only unindexed fields should be returned.
Returns: Collection of Strings indicating the names of the fields
Throws: IOException if there is a problem with accessing the index
Parameters: fldOption specifies which field option should be available for the returned fields
Returns: Collection of Strings indicating the names of the fields.
See Also: FieldOption
Deprecated: Replaced by {@link #getFieldNames(IndexReader.FieldOption)}
Parameters: storedTermVector if true, returns only Indexed fields that have term vector info, else only indexed fields without term vector info
Returns: Collection of Strings indicating the names of the fields
Deprecated: Replaced by {@link #getFieldNames(IndexReader.FieldOption)}
Get a list of unique field names that exist in this index, are indexed, and have the specified term vector information.
Parameters: tvSpec specifies which term vector information should be available for the fields
Returns: Collection of Strings indicating the names of the fields
Parameters: docNumber document for which the term frequency vector is returned; field field for which the term frequency vector is returned.
Returns: term frequency vector. May be null if field does not exist in the specified document or term vector was not stored.
Throws: IOException if index cannot be accessed
See Also: TermVector
Parameters: docNumber document for which term frequency vectors are returned
Returns: array of term frequency vectors. May be null if no term vectors have been stored for the specified document.
Throws: IOException if index cannot be accessed
See Also: TermVector
Returns true if an index exists at the specified directory. If the directory does not exist or if there is no index in it, false is returned.
Parameters: directory the directory to check for an index
Returns: true if an index exists; false otherwise
Returns true if an index exists at the specified directory. If the directory does not exist or if there is no index in it, false is returned.
Parameters: directory the directory to check for an index
Returns: true if an index exists; false otherwise
Returns true if an index exists at the specified directory. If the directory does not exist or if there is no index in it, false is returned.
Parameters: directory the directory to check for an index
Returns: true if an index exists; false otherwise
Throws: IOException if there is a problem with accessing the index
Throws: IOException
Returns true iff the index in the named directory is currently locked.
Parameters: directory the directory to check for a lock
Throws: IOException if there is a problem with accessing the index
Returns true iff the index in the named directory is currently locked.
Parameters: directory the directory to check for a lock
Throws: IOException if there is a problem with accessing the index
Parameters: args Usage: org.apache.lucene.index.IndexReader [-extract] <cfsfile>
See Also: Field
See Also: Field
See Also: norms Similarity
See Also: norms Similarity
Returns an enumeration of all the documents which contain term. For each document, the document number and the frequency of
the term in that document are provided, for use in search scoring.
Thus, this method implements the mapping:
Term => <docNum, freq>*
The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
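The mapping from a term to a sequence of document number and frequency pairs can be pictured with a minimal sketch. This is not how Lucene stores or reads postings: the class `TermDocsSketch`, the whitespace tokenization, and the brute-force scan over document text are all assumptions chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Lucene code) of the mapping from a term to (docNum, freq) pairs. */
class TermDocsSketch {
    /**
     * docs[i] is the whitespace-separated text of document number i.
     * Returns one {docNum, freq} pair per document containing the term,
     * in increasing document-number order.
     */
    static List<int[]> termDocs(String[] docs, String term) {
        List<int[]> result = new ArrayList<>();
        for (int docNum = 0; docNum < docs.length; docNum++) {
            int freq = 0;
            for (String token : docs[docNum].split("\\s+")) {
                if (token.equals(term)) freq++;
            }
            if (freq > 0) {
                result.add(new int[] {docNum, freq});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] docs = {"to be or not to be", "let it be", "no match here"};
        for (int[] pair : termDocs(docs, "be")) {
            System.out.println("doc " + pair[0] + " freq " + pair[1]);
        }
        // prints "doc 0 freq 2" then "doc 1 freq 1"
    }
}
```

Because the loop visits documents in ascending order, each pair's document number is greater than all that precede it, matching the ordering guarantee stated above.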
Returns an enumeration of all the documents which contain term. For each document, in addition to the document number
and frequency of the term in that document, a list of all of the ordinal
positions of the term in the document is available. Thus, this method
implements the mapping:
Term => <docNum, freq, <pos1, pos2, ... posfreq-1>>*
This positional information facilitates phrase and proximity searching.
The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
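To show why ordinal positions help with phrase searching, here is a small sketch under simplifying assumptions. It is not Lucene code: `TermPositionsSketch`, its whitespace tokenization, and the two-word phrase check are hypothetical and only illustrate the idea of comparing position lists.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Lucene code) of using term positions for a two-word phrase match. */
class TermPositionsSketch {
    /** Ordinal positions (0-based token offsets) at which term occurs in doc. */
    static List<Integer> positions(String doc, String term) {
        List<Integer> result = new ArrayList<>();
        String[] tokens = doc.split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) result.add(i);
        }
        return result;
    }

    /** True if some occurrence of second directly follows an occurrence of first. */
    static boolean containsPhrase(String doc, String first, String second) {
        List<Integer> a = positions(doc, first);
        List<Integer> b = positions(doc, second);
        for (int p : a) {
            if (b.contains(p + 1)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        String doc = "to be or not to be";
        System.out.println(positions(doc, "be"));             // [1, 5]
        System.out.println(containsPhrase(doc, "not", "to")); // true
        System.out.println(containsPhrase(doc, "be", "not")); // false
    }
}
```

A proximity check would follow the same shape, testing whether two positions differ by at most some distance rather than by exactly one.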
Caution: this should only be used by failure recovery code, when it is known that no other process nor thread is in fact currently accessing this index.