public final class SortingMergePolicy extends MergePolicy
A MergePolicy that reorders documents according to a Sort before merging them. As a consequence, all segments resulting from a merge will be sorted, while segments resulting from a flush will be in the order in which documents were added.
NOTE: Never use this policy if you rely on IndexWriter.addDocuments to assign sequential doc IDs; this policy will scatter doc IDs.
NOTE: This policy should only be used with idempotent Sorts so that the order of segments is predictable. For example, using Sort.INDEXORDER in reverse (which is not idempotent) will make the order of documents in a segment depend on the number of times the segment has been merged.
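The idempotency requirement can be illustrated with a toy model that does not use Lucene at all. In this sketch a "segment" is just a list of integer sort keys in doc ID order, and the class and method names (`IdempotentSortSketch`, `sortByKey`, `reverseIndexOrder`, `isSortedByKey`) are hypothetical stand-ins, not part of the Lucene API. Sorting by key yields the same order no matter how often it is applied, whereas reversing the index order flips the result on every pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: a "segment" is a list of sort keys in doc ID order.
final class IdempotentSortSketch {

    // Idempotent reordering: sort by key (stand-in for a field-based Sort).
    static List<Integer> sortByKey(List<Integer> segment) {
        List<Integer> copy = new ArrayList<>(segment);
        Collections.sort(copy);
        return copy;
    }

    // Non-idempotent reordering: reverse the current doc order
    // (stand-in for Sort.INDEXORDER in reverse).
    static List<Integer> reverseIndexOrder(List<Integer> segment) {
        List<Integer> copy = new ArrayList<>(segment);
        Collections.reverse(copy);
        return copy;
    }

    // True if the keys are in non-decreasing order.
    static boolean isSortedByKey(List<Integer> segment) {
        for (int i = 1; i < segment.size(); i++) {
            if (segment.get(i - 1) > segment.get(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Keys in the order the documents were added (flush order).
        List<Integer> flushed = Arrays.asList(30, 10, 20);

        List<Integer> sortedOnce = sortByKey(flushed);
        List<Integer> sortedTwice = sortByKey(sortedOnce);
        // Same order after one or two "merges".
        System.out.println(sortedOnce.equals(sortedTwice));

        List<Integer> reversedOnce = reverseIndexOrder(flushed);
        List<Integer> reversedTwice = reverseIndexOrder(reversedOnce);
        // Order depends on how many times the segment was "merged".
        System.out.println(reversedOnce.equals(reversedTwice));
    }
}
```

The same model also shows why doc IDs get scattered: after `sortByKey`, the document that was added first (key 30) no longer occupies the first position.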
Nested classes inherited from class MergePolicy: MergePolicy.DocMap, MergePolicy.MergeAbortedException, MergePolicy.MergeException, MergePolicy.MergeSpecification, MergePolicy.OneMerge
Modifier and Type | Field and Description |
---|---|
static String | SORTER_ID_PROP: Put in the diagnostics to denote that this segment is sorted. |
Fields inherited from class MergePolicy: DEFAULT_MAX_CFS_SEGMENT_SIZE, DEFAULT_NO_CFS_RATIO, maxCFSSegmentSize, noCFSRatio
Constructor and Description |
---|
SortingMergePolicy(MergePolicy in, Sort sort): Create a new MergePolicy that sorts documents with the given sort. |
Modifier and Type | Method and Description |
---|---|
MergePolicy.MergeSpecification | findForcedDeletesMerges(SegmentInfos segmentInfos, IndexWriter writer): Determine what set of merge operations is necessary in order to expunge all deletes from the index. |
MergePolicy.MergeSpecification | findForcedMerges(SegmentInfos segmentInfos, int maxSegmentCount, Map<SegmentCommitInfo,Boolean> segmentsToMerge, IndexWriter writer): Determine what set of merge operations is necessary in order to merge to at most the specified segment count. |
MergePolicy.MergeSpecification | findMerges(MergeTrigger mergeTrigger, SegmentInfos segmentInfos, IndexWriter writer): Determine what set of merge operations is now necessary on the index. |
Sort | getSort(): Return the Sort order that is used to sort segments when merging. |
static boolean | isSorted(LeafReader reader, Sort sort): Returns true if the given reader is sorted by the given sort. |
protected long | size(SegmentCommitInfo info, IndexWriter writer): Return the byte size of the provided SegmentCommitInfo, pro-rated by the percentage of non-deleted documents. |
String | toString() |
boolean | useCompoundFile(SegmentInfos segments, SegmentCommitInfo newSegment, IndexWriter writer): Returns true if a new segment (regardless of its origin) should use the compound file format. |
Methods inherited from class MergePolicy: getMaxCFSSegmentSizeMB, getNoCFSRatio, isMerged, setMaxCFSSegmentSizeMB, setNoCFSRatio
public static final String SORTER_ID_PROP
Put in the diagnostics to denote that this segment is sorted.
public SortingMergePolicy(MergePolicy in, Sort sort)
Create a new MergePolicy that sorts documents with the given sort.
public static boolean isSorted(LeafReader reader, Sort sort)
Returns true if the given reader is sorted by the given sort. Typically the given sort would be the getSort() order of a SortingMergePolicy.
public MergePolicy.MergeSpecification findMerges(MergeTrigger mergeTrigger, SegmentInfos segmentInfos, IndexWriter writer) throws IOException
Description copied from class: MergePolicy
Determine what set of merge operations is now necessary on the index. IndexWriter calls this whenever there is a change to the segments. This call is always synchronized on the IndexWriter instance, so only one thread at a time will call this method.
Specified by: findMerges in class MergePolicy
Parameters:
mergeTrigger - the event that triggered the merge
segmentInfos - the total set of segments in the index
writer - the IndexWriter to find the merges on
Throws: IOException
public MergePolicy.MergeSpecification findForcedMerges(SegmentInfos segmentInfos, int maxSegmentCount, Map<SegmentCommitInfo,Boolean> segmentsToMerge, IndexWriter writer) throws IOException
Description copied from class: MergePolicy
Determine what set of merge operations is necessary in order to merge to at most the specified segment count. IndexWriter calls this when its IndexWriter.forceMerge(int) method is called. This call is always synchronized on the IndexWriter instance, so only one thread at a time will call this method.
Specified by: findForcedMerges in class MergePolicy
Parameters:
segmentInfos - the total set of segments in the index
maxSegmentCount - requested maximum number of segments in the index (currently this is always 1)
segmentsToMerge - contains the specific SegmentInfo instances that must be merged away. This may be a subset of all SegmentInfos. If the value is True for a given SegmentInfo, that segment was an original segment present in the to-be-merged index; otherwise, it was a segment produced by a cascaded merge.
writer - the IndexWriter to find the merges on
Throws: IOException
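public MergePolicy.MergeSpecification findForcedDeletesMerges(SegmentInfos segmentInfos, IndexWriter writer) throws IOException
Description copied from class: MergePolicy
Determine what set of merge operations is necessary in order to expunge all deletes from the index.
Specified by: findForcedDeletesMerges in class MergePolicy
Parameters:
segmentInfos - the total set of segments in the index
writer - the IndexWriter to find the merges on
Throws: IOException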
public boolean useCompoundFile(SegmentInfos segments, SegmentCommitInfo newSegment, IndexWriter writer) throws IOException
Description copied from class: MergePolicy
Returns true if a new segment (regardless of its origin) should use the compound file format. The default implementation returns true iff the size of the given mergedInfo is less than or equal to MergePolicy.getMaxCFSSegmentSizeMB() and the size is less than or equal to TotalIndexSize * MergePolicy.getNoCFSRatio(); otherwise false.
Overrides: useCompoundFile in class MergePolicy
Throws: IOException
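The compound-file rule described above is a conjunction of two size checks. The following is a minimal sketch of that rule in plain Java, under the assumption that all sizes are already known as numbers; `CompoundFileRuleSketch` and its parameter names are hypothetical, and the real method derives these values from the SegmentInfos and IndexWriter it is given.

```java
// Sketch of the documented rule only; not the Lucene implementation.
final class CompoundFileRuleSketch {

    private static final double BYTES_PER_MB = 1024.0 * 1024.0;

    /**
     * @param mergedSizeBytes     size of the new segment, in bytes
     * @param totalIndexSizeBytes combined size of all segments, in bytes
     * @param maxCfsSegmentSizeMB upper bound, as returned by getMaxCFSSegmentSizeMB()
     * @param noCfsRatio          fraction of the index size, as returned by getNoCFSRatio()
     */
    static boolean useCompoundFile(long mergedSizeBytes,
                                   long totalIndexSizeBytes,
                                   double maxCfsSegmentSizeMB,
                                   double noCfsRatio) {
        // Check 1: the segment must not exceed the absolute size cap.
        boolean withinMaxSize = mergedSizeBytes <= maxCfsSegmentSizeMB * BYTES_PER_MB;
        // Check 2: the segment must not exceed the given fraction of the index.
        boolean withinRatio = mergedSizeBytes <= totalIndexSizeBytes * noCfsRatio;
        return withinMaxSize && withinRatio;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 5 MB segment, 100 MB index, 10 MB cap, ratio 0.1: both checks pass.
        System.out.println(useCompoundFile(5 * mb, 100 * mb, 10.0, 0.1));
        // 20 MB segment exceeds the 10 MB cap.
        System.out.println(useCompoundFile(20 * mb, 100 * mb, 10.0, 0.1));
    }
}
```

Both conditions must hold, so a small segment in a small index can still be written without the compound format when it accounts for too large a share of the index.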
protected long size(SegmentCommitInfo info, IndexWriter writer) throws IOException
Description copied from class: MergePolicy
Return the byte size of the provided SegmentCommitInfo, pro-rated by the percentage of non-deleted documents.
Overrides: size in class MergePolicy
Throws: IOException
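Pro-rating here means scaling the on-disk size by the fraction of documents that are still live. The arithmetic can be sketched as follows, assuming the byte size, total document count, and deleted document count are available as plain numbers; `ProRatedSizeSketch` and `proRatedSize` are hypothetical names, and the handling of an empty segment is an assumption of this sketch rather than documented behavior.

```java
// Sketch of the pro-rating arithmetic only; not the Lucene implementation.
final class ProRatedSizeSketch {

    /**
     * @param byteSize on-disk size of the segment, in bytes
     * @param maxDoc   total number of documents in the segment, including deleted ones
     * @param delCount number of deleted documents in the segment
     */
    static long proRatedSize(long byteSize, int maxDoc, int delCount) {
        if (maxDoc <= 0) {
            // Assumption: with no documents there is nothing to pro-rate.
            return byteSize;
        }
        double liveRatio = 1.0 - ((double) delCount / maxDoc);
        return (long) (byteSize * liveRatio);
    }

    public static void main(String[] args) {
        // 1000 bytes with 25 of 100 documents deleted: 75% of the size counts.
        System.out.println(proRatedSize(1000L, 100, 25));
    }
}
```

A segment with many deletions therefore looks smaller to the merge policy than its file size suggests, which reflects how much data a merge would actually carry forward.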
Copyright © 2000–2015 The Apache Software Foundation. All rights reserved.