Class TopKSelector<T>


  • @GwtCompatible
    final class TopKSelector<T>
    extends java.lang.Object
    An accumulator that selects the "top" k elements added to it, relative to a provided comparator. "Top" can mean the greatest or the lowest elements, specified in the factory used to create the TopKSelector instance.

    If your input data is available as a Stream, prefer passing Comparators#least(int) to Stream.collect(java.util.stream.Collector). If it is available as an Iterable or Iterator, prefer Ordering.leastOf(Iterable, int).

    This uses the same efficient implementation as Ordering.leastOf(Iterable, int), offering expected O(n + k log k) performance (worst case O(n log k)) for n calls to offer(T) and a call to topK(), with O(k) memory. In comparison, quickselect has the same asymptotics but requires O(n) memory, and a PriorityQueue implementation takes O(n log k). In benchmarks, this implementation performs at least as well as either implementation, and degrades more gracefully for worst-case input.

    The implementation does not necessarily use a stable sorting algorithm; when multiple equivalent elements are added to it, it is undefined which will come first in the output.

    • Field Summary

      Fields 
      Modifier and Type Field Description
      private T[] buffer  
      private int bufferSize  
      private java.util.Comparator<? super T> comparator  
      private int k  
      private T threshold
      The largest of the lowest k elements we've seen so far relative to this comparator.
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      private TopKSelector​(java.util.Comparator<? super T> comparator, int k)  
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      (package private) TopKSelector<T> combine​(TopKSelector<T> other)  
      static <T extends java.lang.Comparable<? super T>>
      TopKSelector<T>
      greatest​(int k)
      Returns a TopKSelector that collects the greatest k elements added to it, relative to the natural ordering of the elements, and returns them via topK() in descending order.
      static <T> TopKSelector<T> greatest​(int k, java.util.Comparator<? super T> comparator)
      Returns a TopKSelector that collects the greatest k elements added to it, relative to the specified comparator, and returns them via topK() in descending order.
      static <T extends java.lang.Comparable<? super T>>
      TopKSelector<T>
      least​(int k)
      Returns a TopKSelector that collects the lowest k elements added to it, relative to the natural ordering of the elements, and returns them via topK() in ascending order.
      static <T> TopKSelector<T> least​(int k, java.util.Comparator<? super T> comparator)
      Returns a TopKSelector that collects the lowest k elements added to it, relative to the specified comparator, and returns them via topK() in ascending order.
      void offer​(T elem)
      Adds elem as a candidate for the top k elements.
      void offerAll​(java.lang.Iterable<? extends T> elements)
      Adds each member of elements as a candidate for the top k elements.
      void offerAll​(java.util.Iterator<? extends T> elements)
      Adds each member of elements as a candidate for the top k elements.
      private int partition​(int left, int right, int pivotIndex)
      Partitions the contents of buffer in the range [left, right] around the pivot element previously stored in buffer[pivotValue].
      private void swap​(int i, int j)  
      java.util.List<T> topK()
      Returns the top k elements offered to this TopKSelector, or all elements if fewer than k have been offered, in the order specified by the factory used to create this TopKSelector.
      private void trim()
      Quickselects the top k elements from the 2k elements in the buffer.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • k

        private final int k
      • comparator

        private final java.util.Comparator<? super T> comparator
      • buffer

        private final T[] buffer
      • bufferSize

        private int bufferSize
      • threshold

        private T threshold
        The largest of the lowest k elements we've seen so far relative to this comparator. If bufferSize ≥ k, then we can ignore any elements greater than this value.
    • Constructor Detail

      • TopKSelector

        private TopKSelector​(java.util.Comparator<? super T> comparator,
                             int k)
    • Method Detail

      • least

        public static <T extends java.lang.Comparable<? super T>> TopKSelector<T> least​(int k)
        Returns a TopKSelector that collects the lowest k elements added to it, relative to the natural ordering of the elements, and returns them via topK() in ascending order.
        Throws:
        java.lang.IllegalArgumentException - if k < 0 or k > Integer.MAX_VALUE / 2
      • least

        public static <T> TopKSelector<T> least​(int k,
                                                java.util.Comparator<? super T> comparator)
        Returns a TopKSelector that collects the lowest k elements added to it, relative to the specified comparator, and returns them via topK() in ascending order.
        Throws:
        java.lang.IllegalArgumentException - if k < 0 or k > Integer.MAX_VALUE / 2
      • greatest

        public static <T extends java.lang.Comparable<? super T>> TopKSelector<T> greatest​(int k)
        Returns a TopKSelector that collects the greatest k elements added to it, relative to the natural ordering of the elements, and returns them via topK() in descending order.
        Throws:
        java.lang.IllegalArgumentException - if k < 0 or k > Integer.MAX_VALUE / 2
      • greatest

        public static <T> TopKSelector<T> greatest​(int k,
                                                   java.util.Comparator<? super T> comparator)
        Returns a TopKSelector that collects the greatest k elements added to it, relative to the specified comparator, and returns them via topK() in descending order.
        Throws:
        java.lang.IllegalArgumentException - if k < 0 or k > Integer.MAX_VALUE / 2
      • offer

        public void offer​(T elem)
        Adds elem as a candidate for the top k elements. This operation takes amortized O(1) time.
      • trim

        private void trim()
        Quickselects the top k elements from the 2k elements in the buffer. O(k) expected time, O(k log k) worst case.
      • partition

        private int partition​(int left,
                              int right,
                              int pivotIndex)
        Partitions the contents of buffer in the range [left, right] around the pivot element previously stored in buffer[pivotValue]. Returns the new index of the pivot element, pivotNewIndex, so that everything in [left, pivotNewIndex] is ≤ pivotValue and everything in (pivotNewIndex, right] is greater than pivotValue.
      • swap

        private void swap​(int i,
                          int j)
      • offerAll

        public void offerAll​(java.lang.Iterable<? extends T> elements)
        Adds each member of elements as a candidate for the top k elements. This operation takes amortized linear time in the length of elements.

        If all input data to this TopKSelector is in a single Iterable, prefer Ordering.leastOf(Iterable, int), which provides a simpler API for that use case.

      • offerAll

        public void offerAll​(java.util.Iterator<? extends T> elements)
        Adds each member of elements as a candidate for the top k elements. This operation takes amortized linear time in the length of elements. The iterator is consumed after this operation completes.

        If all input data to this TopKSelector is in a single Iterator, prefer Ordering.leastOf(Iterator, int), which provides a simpler API for that use case.

      • topK

        public java.util.List<T> topK()
        Returns the top k elements offered to this TopKSelector, or all elements if fewer than k have been offered, in the order specified by the factory used to create this TopKSelector.

        The returned list is an unmodifiable copy and will not be affected by further changes to this TopKSelector. This method returns in O(k log k) time.