public class LRUQueryCache extends java.lang.Object implements QueryCache, Accountable
A QueryCache that evicts queries using an LRU (least-recently-used) eviction policy in order to remain under a given maximum size and number of bytes used.
This class is thread-safe.
Note that query eviction runs in linear time with the total number of segments that have cache entries, so this cache works best with caching policies that only cache on "large" segments, and it is advised not to share this cache across too many indices.
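The LRU behavior described above can be sketched with the JDK's access-ordered LinkedHashMap. This is a simplified illustration only (hypothetical class and method names): the real cache additionally bounds RAM usage and tracks per-segment DocIdSets, not just an entry count.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruSketch {
    // Minimal LRU sketch: an access-ordered LinkedHashMap evicts the eldest
    // (least recently used) entry once the map grows past maxSize.
    static <K, V> Map<K, V> lruMap(int maxSize) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = lruMap(2);
        cache.put("q1", 1);
        cache.put("q2", 2);
        cache.get("q1");    // touch q1, so q2 becomes least recently used
        cache.put("q3", 3); // evicts q2
        System.out.println(cache.keySet()); // [q1, q3]
    }
}
```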
A default query cache and policy instance is used in IndexSearcher. If you want to replace those defaults
it is typically done like this:
    final int maxNumberOfCachedQueries = 256;
    final long maxRamBytesUsed = 50 * 1024L * 1024L; // 50MB
    // these cache and policy instances can be shared across several queries and readers
    // it is fine to eg. store them into static variables
    final QueryCache queryCache = new LRUQueryCache(maxNumberOfCachedQueries, maxRamBytesUsed);
    final QueryCachingPolicy defaultCachingPolicy = new UsageTrackingQueryCachingPolicy();
    indexSearcher.setQueryCache(queryCache);
    indexSearcher.setQueryCachingPolicy(defaultCachingPolicy);

This cache exposes some global statistics (hit count, miss count, number of cache entries, total number of DocIdSets that have ever been cached, number of evicted entries). In case you would like to have more fine-grained statistics, such as per-index or per-query-class statistics, it is possible to override various callbacks: onHit(java.lang.Object, org.apache.lucene.search.Query), onMiss(java.lang.Object, org.apache.lucene.search.Query), onQueryCache(org.apache.lucene.search.Query, long), onQueryEviction(org.apache.lucene.search.Query, long), onDocIdSetCache(java.lang.Object, long), onDocIdSetEviction(java.lang.Object, int, long) and onClear(). It is better not to perform heavy computations in these methods, though, since they are called synchronously and under a lock.

See also: QueryCachingPolicy
Modifier and Type | Class and Description
---|---
private class | LRUQueryCache.CachingWrapperWeight
private class | LRUQueryCache.LeafCache
(package private) static class | LRUQueryCache.MinSegmentSizePredicate
Modifier and Type | Field and Description
---|---
private java.util.Map<IndexReader.CacheKey,LRUQueryCache.LeafCache> | cache
private long | cacheCount
private long | cacheSize
(package private) static long | HASHTABLE_RAM_BYTES_PER_ENTRY
private long | hitCount
private java.util.function.Predicate<LeafReaderContext> | leavesToCache
(package private) static long | LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY
private java.util.concurrent.locks.ReentrantLock | lock
private long | maxRamBytesUsed
private int | maxSize
private long | missCount
private java.util.Set<Query> | mostRecentlyUsedQueries
(package private) static long | QUERY_DEFAULT_RAM_BYTES_USED
private long | ramBytesUsed
private java.util.Map<Query,Query> | uniqueQueries
Constructor and Description
---
LRUQueryCache(int maxSize, long maxRamBytesUsed) | Create a new instance that will cache at most maxSize queries with at most maxRamBytesUsed bytes of memory.
LRUQueryCache(int maxSize, long maxRamBytesUsed, java.util.function.Predicate<LeafReaderContext> leavesToCache) | Expert: Create a new instance that will cache at most maxSize queries with at most maxRamBytesUsed bytes of memory, only on leaves that satisfy leavesToCache.
Modifier and Type | Method and Description
---|---
(package private) void | assertConsistent()
(package private) java.util.List<Query> | cachedQueries()
protected DocIdSet | cacheImpl(BulkScorer scorer, int maxDoc) Default cache implementation: uses RoaringDocIdSet for sets that have a density < 1% and a BitDocIdSet over a FixedBitSet otherwise.
private static DocIdSet | cacheIntoBitSet(BulkScorer scorer, int maxDoc)
private static DocIdSet | cacheIntoRoaringDocIdSet(BulkScorer scorer, int maxDoc)
void | clear() Clear the content of this cache.
void | clearCoreCacheKey(java.lang.Object coreKey) Remove all cache entries for the given core cache key.
void | clearQuery(Query query) Remove all cache entries for the given query.
Weight | doCache(Weight weight, QueryCachingPolicy policy) Return a wrapper around the provided weight that will cache matching docs per-segment according to the given policy.
(package private) void | evictIfNecessary()
(package private) DocIdSet | get(Query key, LeafReaderContext context, IndexReader.CacheHelper cacheHelper)
long | getCacheCount() Return the total number of cache entries that have been generated and put in the cache.
long | getCacheSize() Return the total number of DocIdSets which are currently stored in the cache.
java.util.Collection<Accountable> | getChildResources() Returns nested resources of this class.
long | getEvictionCount() Return the number of cache entries that have been removed from the cache, either in order to stay under the maximum configured size/ram usage or because a segment has been closed.
long | getHitCount() Over the total number of times that a query has been looked up, return how many times a cached DocIdSet has been found and returned.
long | getMissCount() Over the total number of times that a query has been looked up, return how many times this query was not contained in the cache.
long | getTotalCount() Return the total number of times that a Query has been looked up in this QueryCache.
protected void | onClear() Expert: callback when the cache is completely cleared.
protected void | onDocIdSetCache(java.lang.Object readerCoreKey, long ramBytesUsed) Expert: callback when a DocIdSet is added to this cache.
protected void | onDocIdSetEviction(java.lang.Object readerCoreKey, int numEntries, long sumRamBytesUsed) Expert: callback when one or more DocIdSets are removed from this cache.
private void | onEviction(Query singleton)
protected void | onHit(java.lang.Object readerCoreKey, Query query) Expert: callback when there is a cache hit on a given query.
protected void | onMiss(java.lang.Object readerCoreKey, Query query) Expert: callback when there is a cache miss on a given query.
protected void | onQueryCache(Query query, long ramBytesUsed) Expert: callback when a query is added to this cache.
protected void | onQueryEviction(Query query, long ramBytesUsed) Expert: callback when a query is evicted from this cache.
(package private) void | putIfAbsent(Query query, LeafReaderContext context, DocIdSet set, IndexReader.CacheHelper cacheHelper)
long | ramBytesUsed() Return the memory usage of this object in bytes.
(package private) boolean | requiresEviction() Whether evictions are required.
static final long QUERY_DEFAULT_RAM_BYTES_USED
static final long HASHTABLE_RAM_BYTES_PER_ENTRY
static final long LINKED_HASHTABLE_RAM_BYTES_PER_ENTRY
private final int maxSize
private final long maxRamBytesUsed
private final java.util.function.Predicate<LeafReaderContext> leavesToCache
private final java.util.Set<Query> mostRecentlyUsedQueries
private final java.util.Map<IndexReader.CacheKey,LRUQueryCache.LeafCache> cache
private final java.util.concurrent.locks.ReentrantLock lock
private volatile long ramBytesUsed
private volatile long hitCount
private volatile long missCount
private volatile long cacheCount
private volatile long cacheSize
public LRUQueryCache(int maxSize, long maxRamBytesUsed, java.util.function.Predicate<LeafReaderContext> leavesToCache)
Expert: Create a new instance that will cache at most maxSize queries with at most maxRamBytesUsed bytes of memory, only on leaves that satisfy leavesToCache.

public LRUQueryCache(int maxSize, long maxRamBytesUsed)
Create a new instance that will cache at most maxSize queries with at most maxRamBytesUsed bytes of memory. Queries will only be cached on leaves that have more than 10k documents and have more than 3% of the total number of documents in the index. This should guarantee that all leaves from the upper tier will be cached while ensuring that at most 33 leaves can make it to the cache (very likely fewer than 10 in practice), which is useful for this implementation since some operations perform in linear time with the number of cached leaves. Only clauses whose cost is at most 100x the cost of the top-level query will be cached in order to not hurt latency too much because of caching.

protected void onHit(java.lang.Object readerCoreKey, Query query)
Expert: callback when there is a cache hit on a given query.
protected void onMiss(java.lang.Object readerCoreKey, Query query)
Expert: callback when there is a cache miss on a given query.

protected void onQueryCache(Query query, long ramBytesUsed)
Expert: callback when a query is added to this cache.

protected void onQueryEviction(Query query, long ramBytesUsed)
Expert: callback when a query is evicted from this cache.
protected void onDocIdSetCache(java.lang.Object readerCoreKey, long ramBytesUsed)
Expert: callback when a DocIdSet is added to this cache. Implementing this method is typically useful in order to compute more fine-grained statistics about the query cache.

protected void onDocIdSetEviction(java.lang.Object readerCoreKey, int numEntries, long sumRamBytesUsed)
Expert: callback when one or more DocIdSets are removed from this cache.
See also: onDocIdSetCache(java.lang.Object, long)
protected void onClear()
boolean requiresEviction()
DocIdSet get(Query key, LeafReaderContext context, IndexReader.CacheHelper cacheHelper)
void putIfAbsent(Query query, LeafReaderContext context, DocIdSet set, IndexReader.CacheHelper cacheHelper)
void evictIfNecessary()
public void clearCoreCacheKey(java.lang.Object coreKey)
public void clearQuery(Query query)
private void onEviction(Query singleton)
public void clear()
void assertConsistent()
java.util.List<Query> cachedQueries()
public Weight doCache(Weight weight, QueryCachingPolicy policy)
Return a wrapper around the provided weight that will cache matching docs per-segment according to the given policy. NOTE: The returned weight will only be equivalent if scores are not needed.
Specified by: doCache in interface QueryCache
See also: Collector.scoreMode()
public long ramBytesUsed()
Return the memory usage of this object in bytes.
Specified by: ramBytesUsed in interface Accountable
public java.util.Collection<Accountable> getChildResources()
Returns nested resources of this class.
Specified by: getChildResources in interface Accountable
See also: Accountables
protected DocIdSet cacheImpl(BulkScorer scorer, int maxDoc) throws java.io.IOException
Default cache implementation: uses RoaringDocIdSet for sets that have a density < 1% and a BitDocIdSet over a FixedBitSet otherwise.
Throws: java.io.IOException

private static DocIdSet cacheIntoBitSet(BulkScorer scorer, int maxDoc) throws java.io.IOException
Throws: java.io.IOException

private static DocIdSet cacheIntoRoaringDocIdSet(BulkScorer scorer, int maxDoc) throws java.io.IOException
Throws: java.io.IOException
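The 1%-density rule in cacheImpl can be illustrated with a standalone sketch. The class and method names below are hypothetical, and java.util.TreeSet and java.util.BitSet merely stand in for Lucene's RoaringDocIdSet and FixedBitSet:

```java
import java.util.BitSet;
import java.util.TreeSet;

public class DocIdSetChoiceSketch {
    // Hypothetical sketch of the density rule: sparse match sets go into a
    // compact sorted structure, dense match sets into a plain bit set.
    static Object chooseRepresentation(int[] matchingDocs, int maxDoc) {
        double density = (double) matchingDocs.length / maxDoc;
        if (density < 0.01) {
            TreeSet<Integer> sparse = new TreeSet<>(); // stands in for RoaringDocIdSet
            for (int doc : matchingDocs) sparse.add(doc);
            return sparse;
        }
        BitSet dense = new BitSet(maxDoc); // stands in for FixedBitSet
        for (int doc : matchingDocs) dense.set(doc);
        return dense;
    }

    public static void main(String[] args) {
        // 5 matches out of 1,000,000 docs: density far below 1% -> sparse
        System.out.println(chooseRepresentation(new int[]{1, 2, 3, 4, 5}, 1_000_000).getClass().getSimpleName());
        // 5 matches out of 100 docs: density 5% -> dense
        System.out.println(chooseRepresentation(new int[]{1, 2, 3, 4, 5}, 100).getClass().getSimpleName());
    }
}
```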
public final long getTotalCount()
Return the total number of times that a Query has been looked up in this QueryCache. Note that this number is incremented once per segment, so running a cached query only once will increment this counter by the number of segments that are wrapped by the searcher. Note that by definition, getTotalCount() is the sum of getHitCount() and getMissCount().
See also: getHitCount(), getMissCount()
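The relationship between the three counters can be sketched directly. The helper below is hypothetical; the real counters are the cache's volatile hitCount and missCount fields:

```java
public class CacheStatsSketch {
    // By definition, getTotalCount() == getHitCount() + getMissCount();
    // a derived hit rate is hits divided by total lookups.
    static double hitRate(long hitCount, long missCount) {
        long totalCount = hitCount + missCount;
        return totalCount == 0 ? 0.0 : (double) hitCount / totalCount;
    }

    public static void main(String[] args) {
        System.out.println(hitRate(90, 10)); // 0.9
    }
}
```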
public final long getHitCount()
Over the total number of times that a query has been looked up, return how many times a cached DocIdSet has been found and returned.
See also: getTotalCount(), getMissCount()
public final long getMissCount()
Over the total number of times that a query has been looked up, return how many times this query was not contained in the cache.
See also: getTotalCount(), getHitCount()
public final long getCacheSize()
Return the total number of DocIdSets which are currently stored in the cache.
See also: getCacheCount(), getEvictionCount()
public final long getCacheCount()
Return the total number of cache entries that have been generated and put in the cache. It is highly desirable to have a hit count that is much higher than the cache count, as the opposite would indicate that the query cache makes efforts to cache queries but they then do not get reused.
See also: getCacheSize(), getEvictionCount()
public final long getEvictionCount()
Return the number of cache entries that have been removed from the cache, either in order to stay under the maximum configured size/ram usage or because a segment has been closed. High numbers of evictions might mean that queries are not reused, or that the caching policy caches too aggressively on NRT segments which get merged early.
See also: getCacheCount(), getCacheSize()
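The default constructor's leaf-selection rule (more than 10k documents and more than 3% of the index) can be sketched as a standalone predicate. The names below are hypothetical; the real implementation is the package-private LRUQueryCache.MinSegmentSizePredicate, which operates on LeafReaderContext:

```java
public class LeafCachePredicateSketch {
    // Hypothetical sketch of the default rule: cache only on leaves holding
    // more than 10k documents AND more than 3% of the index's documents.
    // Since each qualifying leaf exceeds 3% of the index, at most
    // floor(1 / 0.03) = 33 leaves can qualify at once.
    static boolean shouldCache(int leafMaxDoc, int indexMaxDoc) {
        return leafMaxDoc > 10_000 && (double) leafMaxDoc / indexMaxDoc > 0.03;
    }

    public static void main(String[] args) {
        System.out.println(shouldCache(50_000, 1_000_000)); // true: 5% of a large index
        System.out.println(shouldCache(20_000, 1_000_000)); // false: only 2% of the index
        System.out.println(shouldCache(5_000, 100_000));    // false: fewer than 10k docs
    }
}
```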