Difference between revisions of "User:Duyn/kd-tree Tutorial"

From Robowiki
Jump to navigation Jump to search
(→‎See also: Added some source links)
(→‎Searching: Update method call for searchTree to include max distance, get rid of (unnecessary) spaghetti code)
 
(49 intermediate revisions by 2 users not shown)
Line 1: Line 1:
( Walkthrough: An Optimised kd-tree. This is the beginning of a walk through on writing a k-d tree )
+
[Work-in-Progress]
  
This page will walk you through the implementation of an optimised kd tree. We will end up with a bucket kd-tree which splits on the mean of the dimension with largest variance. Writing a kd-tree is non-trivial; having some details explained can make the process easier. There are already other k-d trees on this wiki, see some of the others for more ideas.
+
This page will walk you through the implementation of a ''k''d-tree. We start with a basic tree and progressively add levels of optimisation. Writing a fully optimised ''k''d-tree is an advanced task, but starting with the basics helps. There are already other ''k''d-trees on this wiki, see some of the others for ideas.
  
 
==Theory==
 
==Theory==
 
A k-d tree is a binary tree which successively splits a ''k''-dimensional space in half. This lets us speed up a nearest neighbour search by not examining points in partitions which are too far away. A ''k''-nearest neighbour search can be implemented from a nearest neighbour algorithm by not shrinking the search radius until ''k'' items have been found. This page won't make a distinction between the two.
 
A k-d tree is a binary tree which successively splits a ''k''-dimensional space in half. This lets us speed up a nearest neighbour search by not examining points in partitions which are too far away. A ''k''-nearest neighbour search can be implemented from a nearest neighbour algorithm by not shrinking the search radius until ''k'' items have been found. This page won't make a distinction between the two.
 
Some of the non-obvious optimisations used in this tree are:
 
* '''Path ordering'''. Our search will descend the tree with the edge closest to the query point first. This is heuristically more likely to contain closer neighbours, which will lead to contracting our search radius sooner. It also ensures we start searching from the leaf which would hold the query point if it were added. Combined with a good bucket size, this lets us quickly load up some decent candidates before we have even examined any other branches.
 
* '''Bounds-overlap-ball testing'''. Each tree stores the bounds of the smallest hyperrect which will contain all the data in that tree. This is smaller—maybe significantly so—than the bounds which that sub-tree occupy.
 
* '''Splitting on mean'''. While most descriptions of kd-trees describe splitting on the median, we will split on the mean. This is faster for us because we will have already calculated the mean in finding the dimension with largest variance.
 
  
 
==An Exemplar class==
 
==An Exemplar class==
We will call each item in our k-d tree an '''exemplar'''. Each exemplar has a '''domain'''—its spatial co-ordinates in the k-d tree's space. Each exemplar could also have an arbitrary payload, but our tree does not need to know about that. It will only handle storing exemplars based on their domain and returning them in a nearest neighbour search.
+
We will call each item in our k-d tree an '''exemplar'''. Each exemplar has a '''domain'''—its co-ordinates in the k-d tree's space. Each exemplar could also have an arbitrary payload, but our tree does not need to know about that. It will only handle storing exemplars based on their domain and returning them in a nearest neighbour search.
  
You might already have a class somewhere called Point which handles 2D co-ordinates. This terminology avoids conflict with that.
+
We might already have a class elsewere called Point which handles 2D co-ordinates. This terminology avoids conflict with that.
  
<pre>
+
public class Exemplar {
public class Exemplar {
+
  public final double[] domain;
  public final double[] domain;
+
 
+
  public Exemplar(double[] domain) {
  public Exemplar(final double[] coords) {
+
    this.domain = domain;
    this.domain = coords;
+
  }
  }
+
 
+
  public final boolean
  // Short hand. Shorter than calling Arrays.equals() each time.
+
  collocated(final Exemplar other) {
  public boolean domainEquals(final Exemplar other) {
+
    return Arrays.equals(domain, other.domain);
    return Arrays.equals(domain, other.domain);
+
  }
  }
+
}
}
 
</pre>
 
  
While this class is fully usable as is, rarely will you be interested in just the domain of nearest neighbours in a search. It is expected that specific data (eg. guess factors) will be loaded by sub-classing this Exemplar class. Our k-d tree will be parameterised based on this expectation.
+
While this class is fully usable as is, rarely will the domain of nearest neighbours be of any interest. Often, useful data (such as [[GuessFactor]]s) will be loaded by sub-classing this Exemplar class. Our k-d tree will be parameterised based on this expectation.
  
 
==Basic Tree==
 
==Basic Tree==
Here is a basic tree structure:
+
First, we will build a basic ''k''d-tree as described by Wikipedia's [http://en.wikipedia.org/w/index.php?title=Kd-tree&oldid=346156055 kd-tree] page. We start off with a standard binary tree. Each tree is either a '''node''' with both left and right sub-trees defined, or a '''leaf''' carrying an exemplar. For simplicity, we won't allow nodes to contain any exemplars.
  
<pre>
+
public class BasicKdTree<X extends Exemplar> {
public class BucketKdTree<T extends Exemplar> {
+
  // Basic tree structure
  private List<T> exemplars = new LinkedList<T>();
+
  X data = null;
  private BucketKdTree<T> left, right;
+
  BasicKdTree<X> left = null, right = null;
  private int bucketSize;
+
  private final int dimensions;
+
  // Only need to test one branch since we always populate both
 +
  // branches at once
 +
  private boolean
 +
  isTree() { return left != null; }
 +
 +
  ...
 +
}
  
  public BucketKdTree(int bucketSize, int dimensions) {
+
===Adding===
    this.bucketSize = bucketSize;
+
Each of the public API functions defers to a private static method. This is to avoid accidentally referring to instance variables while we walk the tree. This is a common pattern we will use for much of the actual behaviour code for this tree.
    this.dimensions = dimensions;
 
  }
 
  
  private boolean
+
public void
  isTree() { return left != null; }
+
add(X ex) {
}
+
  BasicKdTree.addToTree(this, ex);
</pre>
+
}
  
Each tree is either a tree with both left and right sub-trees defined, or a leaf with exemplars filled. Because of our splitting algorithm, it is pointless to allow a tree to be both since the mean might not correspond with any actual exemplars.
+
To add an exemplar, we traverse the tree from top down until we find a leaf. If the leaf doesn't already have an exemplar, we can just stow it there. Otherwise, we split the leaf and put the free exemplars in each sub-tree.
  
Bucket size and dimensions must be passed into the constructor. We could infer dimension from the dimension of the first point added, but this is simpler. Bucket size is not final because theoretically it could be varied, though our implementation will not.
+
To split a leaf, we cycle through split dimensions in order as in most descriptions of ''k''d-trees. Because we only allow each leaf to hold a single exemplar, we can use a simple splitting strategy: smallest on the left, largest on the right. The split value is half way between the two points along the split dimension.
 +
 +
int splitDim = 0;
 +
double split = Double.NaN;
 +
 +
private static <X extends Exemplar> void
 +
addToTree(BasicKdTree<X> tree, X ex) {
 +
  while(tree != null) {
 +
    if (tree.isTree()) {
 +
      // Traverse in search of a leaf
 +
      tree = ex.domain[tree.splitDim] <= tree.split
 +
        ? tree.left : tree.right;
 +
    } else {
 +
      if (tree.data == null) {
 +
        tree.data = ex;
 +
      } else {
 +
        // Split tree and add
 +
        // Find smallest exemplar to be our split point
 +
        final int d = tree.splitDim;
 +
        X leftX = ex, rightX = tree.data;
 +
        if (rightX.domain[d] < leftX.domain[d]) {
 +
          leftX = tree.data;
 +
          rightX = ex;
 +
        }
 +
        tree.split = 0.5*(leftX.domain[d] + rightX.domain[d]);
 +
       
 +
        final int nextSplitDim =
 +
          (tree.splitDim + 1)%tree.dimensions();
 +
       
 +
        tree.left = new BasicKdTree<X>();
 +
        tree.left.splitDim = nextSplitDim;
 +
        tree.left.data = leftX;
 +
       
 +
        tree.right = new BasicKdTree<X>();
 +
        tree.right.splitDim = nextSplitDim;
 +
        tree.right.data = rightX;
 +
      }
 +
 +
      // Done.
 +
      tree = null;
 +
    }
 +
  }
 +
}
 +
 +
private int
 +
dimensions() { return data.domain.length; }
  
==Adding==
+
===Searching===
 +
Before we start coding a search method, we need a helper class to store search results along with their distance from the query point. This is called <code>PrioNode</code> because we will eventually re-use it to implement a custom priority queue.
  
Each of the public API functions defers the actual addition to another private static function. This is to avoid accidentally referring to instance variables while we walk the tree. This is a common pattern we will use for much of the actual behaviour code for this tree.
+
public final class PrioNode<T> {
 +
  public final double priority;
 +
  public final T data;
 +
 +
  public PrioNode(double priority, T data) {
 +
    this.priority = priority;
 +
    this.data = data;
 +
  }
 +
}
  
We decide whether to split a leaf only after the add has been completed.
+
Like with our <code>add()</code> method, our <code>search()</code> method delegates to a static method to avoid introducing bugs by accidentally referring to member variables while we descend the tree.
  
<pre>
+
public Iterable<? extends PrioNode<X>>
// One at a time
+
search(double[] query, int nResults) {
public void
+
  return BasicKdTree.search(this, query, nResults);
add(T ex) {
+
}
  BucketKdTree<T> tree = addNoSplit(this, ex);
 
  if (shouldSplit(tree)) {
 
    split(tree);
 
  }
 
}
 
  
// Bulk add gives us more data to choose a better split point
+
To do a nearest neighbours search, we walk the tree, preferring to search sub-trees which are on the same side of the split as the query point first. Once we have found our initial candidates, we can contract our search sphere. We only search the other sub-tree if our search sphere might spill over onto the other side of the split.
public void
 
addAll(Collection<T> exs) {
 
  // Some spurious function calls. Optimised for readability over
 
  // efficiency.
 
  final Set<BucketKdTree<T>> modTrees =
 
    new HashSet<BucketKdTree<T>>();
 
  for(T ex : exs) {
 
    modTrees.add(addNoSplit(this, ex));
 
  }
 
  
  for(BucketKdTree<T> tree : modTrees) {
+
Results are collected in a <code>java.util.PriorityQueue</code> so we can easily remove the farthest exemplars as we find closer ones.
    if (shouldSplit(tree)) {
 
      split(tree);
 
    }
 
  }
 
}
 
</pre>
 
  
To add an exemplar, we traverse the tree from top down until we find a leaf. Then we add the exemplar to the list at that leaf.
+
private static <X extends Exemplar> Iterable<? extends PrioNode<X>>
 +
search(BasicKdTree<X> tree, double[] query, int nResults) {
 +
  final Queue<PrioNode<X>> results =
 +
    new PriorityQueue<PrioNode<X>>(nResults,
 +
      new Comparator<PrioNode<X>>() {
 +
 +
        // min-heap
 +
        public int
 +
        compare(PrioNode<X> o1, PrioNode<X> o2) {
 +
          return o1.priority == o2.priority ? 0
 +
            : o1.priority > o2.priority ? -1
 +
            : 1;
 +
        }
 +
 +
      }
 +
    );
 +
  final Deque<BasicKdTree<X>> stack
 +
    = new LinkedList<BasicKdTree<X>>();
 +
  stack.addLast(tree);
 +
  while (!stack.isEmpty()) {
 +
    tree = stack.removeLast();
 +
   
 +
    if (tree.isTree()) {
 +
      // Guess nearest tree to query point
 +
      BasicKdTree<X> nearTree = tree.left, farTree = tree.right;
 +
      if (query[tree.splitDim] > tree.split) {
 +
        nearTree = tree.right;
 +
        farTree = tree.left;
 +
      }
 +
     
 +
      // Only search far tree if our search sphere might
 +
      // overlap with splitting plane
 +
      if (results.size() < nResults
 +
        || sq(query[tree.splitDim] - tree.split)
 +
          <= results.peek().priority)
 +
      {
 +
        stack.addLast(farTree);
 +
      }
 +
 +
      // Always search the nearest branch
 +
      stack.addLast(nearTree);
 +
    } else {
 +
      final double dSq = distanceSqFrom(query, tree.data.domain);
 +
      if (results.size() < nResults
 +
        || dSq < results.peek().priority)
 +
      {
 +
        while (results.size() >= nResults) {
 +
          results.poll();
 +
        }
 +
 +
        results.offer(new PrioNode<X>(dSq, tree.data));
 +
      }
 +
    }
 +
  }
 +
  return results;
 +
}
 +
 
 +
private static double
 +
sq(double n) { return n*n; }
  
To decide which sub-tree to traverse, each tree stores two values&mdash;a splitting dimension and a splitting value. If our new exemplar's domain along the splitting dimension is greater than the tree's splitting value, we put it in the right sub-tree. Otherwise, it goes in the left one.
+
Our distance calculation is optimised because it will be called often. We use squared distances to avoid an unnecessary <code>sqrt()</code> operation. We also don't use <code>Math.pow()</code> because the JRE's default implementation must do some extra work before it can return with a simple multiplication (see the source in [http://www.netlib.org/fdlibm/e_pow.c fdlibm]).
  
Since adding takes little time compared to searching, we take this opportunity to make some optimisations:
+
private static double
 +
distanceSqFrom(double[] p1, double[] p2) {
 +
  // Note: profiling shows this is called lots of times, so it pays
 +
  // to be well optimised
 +
  double dSq = 0;
 +
  for(int d = 0; d < p1.length; d++) {
 +
    final double dst = p1[d] - p2[d];
 +
    if (dst != 0)
 +
      dSq += dst*dst;
 +
  }
 +
  return dSq;
 +
}
  
* We keep track of the actual hyperrect the points in this tree occupy. This lets us rule out a tree, even though its space may intersect with our search sphere, if it doesn't actually contain any points within the hyperrect bounding our search sphere. This hyperrect is defined by <code>maxBounds</code> and <code>minBounds</code>.
+
And that's it. Barely 200 lines of code and we have ourselves a working ''k''d-tree.
  
* To save us having to do a full iteration when we come to split a leaf, we compute the running mean and variance for each dimension using [http://www.johndcook.com/standard_deviation.html Welford's method]:
+
===Evaluation===
    <math>
+
As a guide to performance, we will use the [[k-NN algorithm benchmark]] with the [http://homepages.ucalgary.ca/~agschult/robocode/gun-data-Diamond-vs-jk.mini.CunobelinDC%200.3.csv.gz Diamond vs CunobelinDC gun data]. For comparison, the benchmark included an optimised linear search to represent the lower bound of performance, and Rednaxela's tree&mdash;credited to be the fastest tree on the wiki. Full results are near the bottom of the page. Here are the relevant extracts for our tree:
    M_1 = x_1,\qquad S_1 = 0</math>
 
    <math>
 
    M_k = M_{k-1} + {x_k - M_{k-1} \over k} \qquad(exMean)</math>
 
    <math>
 
    S_k = S_{k-1} + (x_k - M_{k-1})\times(x_k - M_k) \qquad(exSumSqDev)</math>
 
  
* We keep track of whether all exemplars in this leaf have the same domain. If they do, we know that comparisons on one exemplar apply to all exemplars in that leaf.
+
RESULT << k-nearest neighbours search with Voidious' Linear search >>
 +
: Average searching time      = 0.5740 miliseconds
 +
: Average worst searching time = 2.3482 miliseconds
 +
: Average adding time          = 0.4528 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
 +
: Average searching time      = 0.0378 miliseconds
 +
: Average worst searching time = 0.2625 miliseconds
 +
: Average adding time          = 1.5687 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
 +
: Average searching time      = 0.3939 miliseconds
 +
: Average worst searching time = 11.5167 miliseconds
 +
: Average adding time          = 0.5165 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
BEST RESULT:
 +
  - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
 +
[...]
 +
  - #8 duyn's basic kd-tree [0.3939]
 +
  - #9 Voidious' Linear search [0.5740]
  
The final adding code:
+
This test run showed that average add time was close to a linear search while searching performance improved by (0.3939/0.5740 - 1) ~= 31%. It's still an order of magnitude slower than Rednaxela's included tree, which shows there is a lot to gain by selecting appropriate optimisations.
<pre>
 
private int splitDim;
 
private double split;
 
  
// These aren't initialised until add() is called.
+
The following code was used in the benchmark:
private double[] exMean;
 
private double[] exSumSqDev;
 
  
// Optimisation when tree contains large number of duplicates
+
* KNNBenchmark.java
private boolean exemplarsAreUniform = true;
 
  
// Optimisation for searches. This lets us skip a node if its
+
public KNNBenchmark(final int dimension, final int numNeighbours, final SampleData[] samples, final int numReps) {
// scope intersects with a search hypersphere but it doesn't contain
+
  final Class<?>[] searchAlgorithms = new Class<?>[] {
// any points that actually intersect.
+
    FlatKNNSearch.class,
private double[] maxBounds;
+
    SimontonTreeKNNSearch.class,
private double[] minBounds;
+
    NatTreeKNNSearch.class,
 +
    VoidiousTreeKNNSearch.class,
 +
+  DuynBasicKNNSearch.class,
 +
    RednaxelaTreeKNNSearch.class,
 +
  };
 +
  ...
 +
}
 +
 
 +
* DuynBasicKNNSearch.java
  
// Adds an exemplar without splitting overflowing leaves.
+
public class DuynBasicKNNSearch extends KNNImplementation {
// Returns leaf to which exemplar was added.
+
  final BasicKdTree<StringExemplar> tree;
private static <T extends Exemplar> BucketKdTree<T>
+
  public DuynBasicKNNSearch(final int dimension) {
addNoSplit(BucketKdTree<T> tree, T ex) {
+
    super(dimension);
  // Some spurious function calls. Optimised for readability over
+
    tree = new BasicKdTree<StringExemplar>();
  // efficiency.
+
  }
  BucketKdTree<T> cursor = tree;
+
  while (cursor != null) {
+
  @Override public void
    updateBounds(cursor, ex);
+
  addPoint(final double[] location, final String value) {
    if (cursor.isTree()) {
+
    tree.add(new StringExemplar(location, value));
      // Sub-tree
+
  }
      cursor = ex.domain[cursor.splitDim] <= cursor.split
+
        ? cursor.left : cursor.right;
+
  @Override public String
    } else {
+
  getName() {
      // Leaf
+
    return "duyn's basic kd-tree";
      cursor.exemplars.add(ex);
+
  }
      final int nExs = cursor.exemplars.size();
+
      if (nExs == 1) {
+
  @Override public KNNPoint[]
        cursor.exMean =
+
  getNearestNeighbors(final double[] location, final int size) {
          Arrays.copyOf(ex.domain, cursor.dimensions);
+
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
        cursor.exSumSqDev = new double[cursor.dimensions];
+
    for(PrioNode<StringExemplar> sr : tree.search(location, size)) {
      } else {
+
      final double distance = sr.priority;
        for(int d = 0; d < cursor.dimensions; d++) {
+
      final StringExemplar pt = sr.data;
          final double coord = ex.domain[d];
+
      justPoints.add(new KNNPoint(pt.value, distance));
 +
    }
 +
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
 +
    return justPoints.toArray(retVal);
 +
  }
 +
 +
  class StringExemplar extends Exemplar {
 +
    public final String value;
 +
    public StringExemplar(final double[] coords, final String value) {
 +
      super(coords);
 +
      this.value = value;
 +
    }
 +
  }
 +
}
  
          final double oldExMean = cursor.exMean[d];
+
==Optimisations to the Tree==
          final double newMean = cursor.exMean[d] =
+
In this section, we will introduce the following optimisations to our tree:
            oldExMean + (coord - oldExMean)/nExs;
+
* '''Bucket leaves.''' Leaves will store multiple exemplars, only being split when they overflow. This makes our split points less dependent on the order in which we receive points and gives us more opportunity to choose a better split point.
 +
* '''Bounds-overlap-ball testing.''' Each tree stores the bounds of the smallest hyperrect which will contain all the data in that tree. This is smaller&mdash;maybe significantly so&mdash;than the bounds which that sub-tree occupy, giving us more opportunities to skip sub-trees. We will check that the minimum distance from each tree's content bounds overlaps with our search hypersphere. This lets us avoid walking all the way to a leaf if a tree's content bounds overlap with our search hypersphere but none of its children's content bounds do.
 +
* '''Splitting choice.''' We will split on the mean of the dimension with the largest variance. Splitting on the mean does not guarantee that our tree will be balanced, but is easier for us since we would have already calculated the mean in finding the dimension with largest variance. This will add more overhead to our <code>add</code> times since we will compute running means and variances each time an exemplar is added.
 +
* '''Singularity tracking.''' We will actively check whether all the exemplars in a leaf have the same domain. This is necessary to avoid getting ourselves caught in an infinite loop trying to repeatedly split an unsplittable leaf. It also lets us avoid repeated distance calculations when searching for nearest neighbours. Practically, unless all your dimensions are limited, discrete values, this optimisation is unlikely to make much impact.
  
          final double oldSumSqDev = cursor.exSumSqDev[d];
+
===Basic Tree===
          cursor.exSumSqDev[d] = oldSumSqDev
+
Once again, we start off with a basic binary tree. Each tree must store its own bucket size since it will not have a reference to its parent Bucket size is not static because we might want to build multiple trees with different bucket sizes.
            + (coord - oldExMean)*(coord - newMean);
 
        }
 
      }
 
      if (cursor.exemplarsAreUniform) {
 
        final List<T> cExs = cursor.exemplars;
 
        if (cExs.size() > 0 && !ex.domainEquals(cExs.get(0)))
 
          cursor.exemplarsAreUniform = false;
 
      }
 
      return cursor;
 
    }
 
  }
 
  throw new RuntimeException("Walked tree without adding anything");
 
}
 
  
private static <T extends Exemplar> void
+
public final class OptKdTree<X extends Exemplar> {
updateBounds(BucketKdTree<T> tree, Exemplar ex) {
+
  final Queue<X> data;
  final int dims = tree.dimensions;
+
  OptKdTree<X> left = null, right = null;
  if (tree.maxBounds == null) {
+
    tree.maxBounds = Arrays.copyOf(ex.domain, dims);
+
  // Number of exemplars to hold in a leaf before splitting
    tree.minBounds = Arrays.copyOf(ex.domain, dims);
+
  private final int bucketSize;
  } else {
+
  private static final int DEFAULT_BUCKET_SIZE = 10;
    for(int d = 0; d < dims; d++) {
+
      final double dimVal = ex.domain[d];
+
  public OptKdTree() {
      if (dimVal > tree.maxBounds[d])
+
    this(DEFAULT_BUCKET_SIZE);
        tree.maxBounds[d] = dimVal;
+
  }
      else if (dimVal < tree.minBounds[d])
+
        tree.minBounds[d] = dimVal;
+
  public OptKdTree(int bucketSize) {
    }
+
    this.bucketSize = bucketSize;
  }
+
    this.data = new LinkedList<X>();
}
+
  }
</pre>
+
 +
  private final boolean
 +
  isTree() { return left != null; }
 +
 +
  [...]
 +
}
  
==Splitting==
+
===Adding===
We only split when a leaf's exemplars exceed its bucket size. It is only worth splitting if the exemplars don't all have the same domain.
+
Now that our code is a little more sophisticated, it becomes profitable to have a dedicated method to bulk-load our tree. This also makes it easier to add dynamic rebalance support later if we need it. To take advantage the additional information we get from bulk loading, we decide whether to split a leaf only after all exemplars have been added.
  
<pre>
+
public void
private static <T extends Exemplar> boolean
+
add(X ex) {
shouldSplit(BucketKdTree<T> tree) {
+
  OptKdTree<X> tree = addNoSplit(this, ex);
  return tree.exemplars.size() > tree.bucketSize
+
  if (shouldSplit(tree)) {
    && !tree.exemplarsAreUniform;
+
    split(tree);
}
+
  }
</pre>
+
}
 +
 +
public void
 +
addAll(Collection<X> exs) {
 +
  final Set<OptKdTree<X>> modTrees =
 +
    new HashSet<OptKdTree<X>>();
 +
  for(X ex : exs) {
 +
    modTrees.add(addNoSplit(this, ex));
 +
  }
 +
 +
  for(OptKdTree<X> tree : modTrees) {
 +
    if (shouldSplit(tree)) {
 +
      split(tree);
 +
    }
 +
  }
 +
}
  
Thanks to our pre-computation, splitting is straight-forward&mdash;most of the time. We iterate through each dimension to find the one with the largest variance (skip the unnecessary division), then we can directly look up the mean of that dimension.
+
To add an exemplar, we traverse the tree from top down until we find a leaf. Then we add the exemplar to the list at that leaf. For now, there is no special ordering to the list of exemplars in a leaf. To decide which sub-tree to traverse, each tree stores two values&mdash;a splitting dimension and a splitting value. If our new exemplar's domain along the splitting dimension is greater than the tree's splitting value, we put it in the right sub-tree. Otherwise, it goes in the left one.
  
To make sure the point actually does divide our data, we separate our data into two lists destined for each sub-tree. If all the exemplars end up in only one of the lists, then our split point has failed to actually separate our exemplars. This is most likely due to rounding error when our exemplars are really close together. At a loss for what to do, we simply pick a random point and a random dimension until we find something that parts our exemplars. We know we must find one eventually because our exemplars are not uniform&mdash;at least one of them is smaller in at least one dimension than all the others.
+
Since adding takes little time compared to searching, we take this opportunity to make some optimisations:
  
Finally, we bulk load our sub-trees, store information about our split point and let go of the exemplars stored in the tree.
+
* We keep track of the actual hyperrect the points in this tree occupy. This lets us rule out a tree, even though its space may intersect with our search sphere, if it has no chance of containing any points within our search sphere. This hyperrect is fully defined by just storing its two extreme corners.
  
<pre>
+
* To save us having to do a full iteration when we come to split a leaf, we compute the running mean and variance for each dimension using [http://www.johndcook.com/standard_deviation.html Welford's method]:
@SuppressWarnings("unchecked") private static <T extends Exemplar> void
+
    <math>
split(BucketKdTree<T> tree) {
+
    M_1 = x_1,\qquad S_1 = 0</math>
  assert !tree.exemplarsAreUniform;
+
    <math>
  // Find dimension with largest variance to split on
+
    M_k = M_{k-1} + {x_k - M_{k-1} \over k} \qquad(exMean)</math>
  double largestVar = -1;
+
     <math>
  int splitDim = 0;
+
    S_k = S_{k-1} + (x_k - M_{k-1})\times(x_k - M_k) \qquad(exSumSqDev)</math>
  for(int d = 0; d < tree.dimensions; d++) {
 
    final double var =
 
      tree.exSumSqDev[d]/(tree.exemplars.size() - 1);
 
     if (var > largestVar) {
 
      largestVar = var;
 
      splitDim = d;
 
    }
 
  }
 
  
  // Find mean as position for our split
+
* We keep track of whether all exemplars in this leaf have the same domain. If they do, we know that comparisons on one exemplar apply to all exemplars in that leaf.
  double splitValue = tree.exMean[splitDim];
 
  
  // Check that our split actually splits our data. This also lets
+
// These aren't initialised until add() is called.
  // us bulk load exemplars into sub-trees, which is more likely
+
private double[] exMean = null, exSumSqDev = null;
  // to keep optimal balance.
+
  final List<T> leftExs = new LinkedList<T>();
+
// Optimisation when sub-tree contains only duplicates
  final List<T> rightExs = new LinkedList<T>();
+
private boolean singularity = true;
  for(T s : tree.exemplars) {
+
    if (s.domain[splitDim] <= splitValue)
+
// Optimisation for searches. This lets us skip a node if its
      leftExs.add(s);
+
// scope intersects with a search hypersphere but it doesn't contain
    else
+
// any points that actually intersect.
      rightExs.add(s);
+
private double[] contentMax = null, contentMin = null;
  }
+
  int leftSize = leftExs.size();
+
private int
  final int treeSize = tree.exemplars.size();
+
dimensions() { return contentMax.length; }
  if (leftSize == treeSize || leftSize == 0) {
+
    System.err.println(
+
// Addition
      "WARNING: Randomly splitting non-uniform tree");
+
    // We know the exemplars aren't all the same, so try picking
+
// Adds an exemplar without splitting overflowing leaves.
    // an exemplar and a dimension at random for our split point
+
// Returns leaf to which exemplar was added.
 +
private static <X extends Exemplar> OptKdTree<X>
 +
addNoSplit(OptKdTree<X> tree, X ex) {
 +
  // Some spurious function calls. Optimised for readability over
 +
  // efficiency.
 +
  OptKdTree<X> cursor = tree;
 +
  while (cursor != null) {
 +
    updateBounds(cursor, ex);
 +
    if (cursor.isTree()) {
 +
      // Sub-tree
 +
      cursor = ex.domain[cursor.splitDim] <= cursor.split
 +
        ? cursor.left : cursor.right;
 +
    } else {
 +
      // Leaf
 +
 +
      // Add exemplar to leaf
 +
      cursor.data.add(ex);
 +
 +
      // Calculate running mean and sum of squared deviations
 +
      final int nExs = cursor.data.size();
 +
      final int dims = cursor.dimensions();
 +
      if (nExs == 1) {
 +
        cursor.exMean = Arrays.copyOf(ex.domain, dims);
 +
        cursor.exSumSqDev = new double[dims];
 +
      } else {
 +
        for(int d = 0; d < dims; d++) {
 +
          final double coord = ex.domain[d];
 +
          final double oldMean = cursor.exMean[d], newMean;
 +
          cursor.exMean[d] = newMean =
 +
            oldMean + (coord - oldMean)/nExs;
 +
          cursor.exSumSqDev[d] = cursor.exSumSqDev[d]
 +
            + (coord - oldMean)*(coord - newMean);
 +
        }
 +
      }
 +
 +
      // Check that data are still uniform
 +
      if (cursor.singularity) {
 +
        final Queue<X> cExs = cursor.data;
 +
        if (cExs.size() > 0 && !ex.collocated(cExs.peek()))
 +
          cursor.singularity = false;
 +
      }
 +
 +
      // Finished walking
 +
      return cursor;
 +
    }
 +
  }
 +
  throw new RuntimeException("Walked tree without adding anything");
 +
}
  
    // This might take several tries, so we copy our exemplars to
+
To update the bounding hyperrect for the contents of a tree, we iterate through each dimension, extending (if necessary) the bounds in that dimension to contain the new exemplar.
    // an array to speed up process of picking a random point
 
    Object[] exs = tree.exemplars.toArray();
 
    while (leftSize == treeSize || leftSize == 0) {
 
      leftExs.clear();
 
      rightExs.clear();
 
  
      splitDim = (int)
+
private static <T extends Exemplar> void
        Math.floor(Math.random()*tree.dimensions);
+
updateBounds(OptKdTree<T> tree, Exemplar ex) {
      final int splitPtIdx = (int)
+
  final int dims = ex.domain.length;
        Math.floor(Math.random()*exs.length);
+
  if (tree.contentMax == null) {
      // Cast is inevitable consequence of java's inability to
+
    tree.contentMax = Arrays.copyOf(ex.domain, dims);
      // create a generic array
+
    tree.contentMin = Arrays.copyOf(ex.domain, dims);
      splitValue = ((T)exs[splitPtIdx]).domain[splitDim];
+
  } else {
      for(T s : tree.exemplars) {
+
    for(int d = 0; d < dims; d++) {
        if (s.domain[splitDim] <= splitValue)
+
      final double dimVal = ex.domain[d];
          leftExs.add(s);
+
      if (dimVal > tree.contentMax[d])
        else
+
        tree.contentMax[d] = dimVal;
          rightExs.add(s);
+
      else if (dimVal < tree.contentMin[d])
      }
+
        tree.contentMin[d] = dimVal;
      leftSize = leftExs.size();
+
    }
    }
+
  }
  }
+
}
  
  // We have found a valid split. Start building our sub-trees
+
===Splitting===
  final BucketKdTree<T> left =
+
We only split when a leaf's exemplars exceed its bucket size. It is only worth splitting if the exemplars don't all have the same domain.
    new BucketKdTree<T>(tree.bucketSize, tree.dimensions);
 
  final BucketKdTree<T> right =
 
    new BucketKdTree<T>(tree.bucketSize, tree.dimensions);
 
  left.addAll(leftExs);
 
  right.addAll(rightExs);
 
  
  // Finally, commit the split
+
private static <T extends Exemplar> boolean
  tree.splitDim = splitDim;
+
shouldSplit(OptKdTree<T> tree) {
  tree.split = splitValue;
+
  return tree.data.size() > tree.bucketSize
  tree.left = left;
+
    && !tree.singularity;
  tree.right = right;
+
}
  
  // Let go of exemplars (and their running stats) held in this leaf
+
Thanks to our pre-computation, splitting is usually straight-forward. We iterate through each dimension to find the one with the largest variance (skip the unnecessary division), then we can directly look up the mean of that dimension. This strategy might not successfully split the tree if exemplars are so close together that rounding errors in computing the mean result in the mean not lying strictly between the exemplars in a leaf. When this happens, we resort to repeatedly trying a dimension and an exemplar at random for our splitting point. We only call <code>split</code> if the leaf is not a singularity, so eventually we will find a point that does divide our exemplars.
  tree.exemplars = null;
 
  tree.exMean = tree.exSumSqDev = null;
 
}
 
</pre>
 
  
==Searching==
+
To make sure the point actually does divide our data, we separate our data into two lists destined for each sub-tree. Once we have found a successful split point, we bulk load our sub-trees, store information about our split point and let go of the exemplars stored in the tree.
Before we can start searching, we need to define our distance metric. We will be using euclidean distance. Optimisations may have to be revised if you are using a different distance metric.
 
  
The code here is optimised because it will be called a lot during searching. The most significant optimisation was not using <code>Math.pow</code>, which [http://www.netlib.org/fdlibm/e_pow.c must do some extra work before it can return with a multiplication].
+
// Split properties. Not initialised until split occurs.
<pre>
+
private int splitDim = 0;
// Gets distance from target of nearest point on hyper-rect defined
+
private double split = Double.NaN;
// by supplied min and max bounds
+
private static double
+
@SuppressWarnings("unchecked") private static <T extends Exemplar> void
minDistanceSqFrom(double[] target, double[] min, double[] max) {
+
split(OptKdTree<T> tree) {
  // Note: profiling shows this is called lots of times, so it pays
+
  assert !tree.singularity;
  // to be well optimised
+
  // Find dimension with largest variance to split on
  double distanceSq = 0;
+
  double largestVar = -1;
  for(int d = 0; d < target.length; d++) {
+
  int splitDim = 0;
    final double coord = target[d];
+
  for(int d = 0; d < tree.dimensions(); d++) {
    double nearCoord;
+
    // Don't need to divide by number of data to find largest
    if (((nearCoord = min[d]) > coord)
+
    // variance
      || ((nearCoord = max[d]) < coord))
+
    final double var = tree.exSumSqDev[d];
    {
+
    if (var > largestVar) {
      final double dst = nearCoord - coord;
+
      largestVar = var;
      distanceSq += dst*dst;
+
      splitDim = d;
    }
+
    }
  }
+
  }
  return distanceSq;
+
}
+
  // Find mean as position for our split
 +
  double splitValue = tree.exMean[splitDim];
 +
 +
  // Check that our split actually splits our data. This also lets
 +
  // us bulk load data into sub-trees, which is more likely
 +
  // to keep optimal balance.
 +
  final Queue<T> leftExs = new LinkedList<T>();
 +
  final Queue<T> rightExs = new LinkedList<T>();
 +
  for(T s : tree.data) {
 +
    if (s.domain[splitDim] <= splitValue)
 +
      leftExs.add(s);
 +
    else
 +
      rightExs.add(s);
 +
  }
 +
  int leftSize = leftExs.size();
 +
  final int treeSize = tree.data.size();
 +
  if (leftSize == treeSize || leftSize == 0) {
 +
    System.err.println(
 +
      "WARNING: Randomly splitting non-uniform tree");
 +
    // We know the data aren't all the same, so try picking
 +
    // an exemplar and a dimension at random for our split point
 +
 +
    // This might take several tries, so we copy our data to
 +
    // an array to speed up process of picking a random point
 +
    Object[] exs = tree.data.toArray();
 +
    while (leftSize == treeSize || leftSize == 0) {
 +
      leftExs.clear();
 +
      rightExs.clear();
 +
 +
      splitDim = (int)
 +
        Math.floor(Math.random()*tree.dimensions());
 +
      final int splitPtIdx = (int)
 +
        Math.floor(Math.random()*exs.length);
 +
      // Cast is inevitable consequence of java's inability to
 +
      // create a generic array
 +
      splitValue = ((T)exs[splitPtIdx]).domain[splitDim];
 +
      for(T s : tree.data) {
 +
        if (s.domain[splitDim] <= splitValue)
 +
          leftExs.add(s);
 +
        else
 +
          rightExs.add(s);
 +
      }
 +
      leftSize = leftExs.size();
 +
    }
 +
  }
 +
 +
  // We have found a valid split. Start building our sub-trees
 +
  final OptKdTree<T> left = new OptKdTree<T>(tree.bucketSize);
 +
  final OptKdTree<T> right = new OptKdTree<T>(tree.bucketSize);
 +
  left.addAll(leftExs);
 +
  right.addAll(rightExs);
 +
 +
  // Finally, commit the split
 +
  tree.splitDim = splitDim;
 +
  tree.split = splitValue;
 +
  tree.left = left;
 +
  tree.right = right;
 +
 +
  // Let go of data (and their running stats) held in this leaf
 +
  tree.data.clear();
 +
  tree.exMean = tree.exSumSqDev = null;
 +
}
  
// Accessible to testing
+
===Distance calculations===
static double
+
Before we can start searching, we need to define our distance metric. The distance calculation has already been introduced before.
distanceSqFrom(double[] p1, double[] p2) {
 
  // Note: profiling shows this is called lots of times, so it pays
 
  // to be well optimised
 
  double dSq = 0;
 
  for(int d = 0; d < p1.length; d++) {
 
    final double dst = p1[d] - p2[d];
 
    if (dst != 0)
 
      dSq += dst*dst;
 
  }
 
  return dSq;
 
}
 
</pre>
 
  
Before we dive into searching, we will introduce two helper classes.
+
:Squared Euclidean distance:
* <code>SearchStackEntry</code> stores a tree along with its minimum distance from the query point. We put these on our tree-walking stack instead of just trees so we don't have to re-compute this minimum distance when we next pop the tree.
+
:<math>distance_{x\rightarrow y}^2 = \sum (x[d] - y[d])^2</math>
* <code>SearchState</code> holds some variables we want to be updated in after each nearest neighbour is added. Its sole purpose is to let us separate searching sub-trees and leaves into different methods without sacrificing performance.
 
  
<pre>
+
This will be called for every point examined, so we should optimise it well.
  
//
+
private static double
// class SearchStackEntry
+
distanceSqFrom(double[] p1, double[] p2) {
//
+
  double dSq = 0;
// Stores a precomputed distance so we don't have to do it again
+
  for(int d = 0; d < p1.length; d++) {
// when we pop the tree off the search stack.
+
    final double dst = p1[d] - p2[d];
 +
    if (dst != 0)
 +
      dSq += dst*dst;
 +
  }
 +
  return dSq;
 +
}
  
private static class SearchStackEntry<T extends Exemplar> {
+
We will also need to find the minimum distance from a hyperrect to a query point. This will let us rule out trees whose content hyperrects are outside our search sphere. To find the minimum distance between a hyperrect and a given point, we need first find the closest point along bounds of that hyperrect to the point. This is easy to do, even if not immediately obvious.
  public final double minDFromQ;
 
  public final BucketKdTree<T> tree;
 
  
  public SearchStackEntry(double minDFromQ, BucketKdTree<T> tree) {
+
:Along each dimension,
    this.minDFromQ = minDFromQ;
+
:<math>nearestPoint[d]=\begin{cases}
    this.tree = tree;
+
min[d] & target[d]\le min[d]\\
  }
+
target[d] & min<target[d]<max[d]\\
}
+
max[d] & target[d]\ge max[d]\end{cases}</math>
  
//
+
After that, we just calculate the distance from our query point to this nearest point. The code below has been optimised because it gets called a lot during a search.
// class SearchState
 
//
 
// Holds data about current state of the search. Used for live updating
 
// of pruning distance.
 
  
private static class SearchState {
+
// Gets distance from target of nearest point on hyper-rect defined
  int nResults = 0;
+
// by supplied min and max bounds
  double maxDistance = Double.POSITIVE_INFINITY;
+
private static double
  int nExsAtMaxD = 0;
+
minDistanceSqFrom(double[] target, double[] min, double[] max) {
}
+
  double distanceSq = 0;
</pre>
+
  for(int d = 0; d < target.length; d++) {
 +
    if (target[d] < min[d]) {
 +
      final double dst = min[d] - target[d];
 +
      distanceSq += dst*dst;
 +
    } else if (target[d] > max[d]) {
 +
      final double dst = max[d] - target[d];
 +
      distanceSq += dst*dst;
 +
    }
 +
  }
 +
  return distanceSq;
 +
}
  
Our search method continuously pops our search stack, decides whether it's looking at a sub-tree or a leaf and passes the call on as appropriate.
+
===Searching===
 +
Before we dive into searching, we will introduce two helper classes.
 +
* <code>SearchStackEntry</code> stores a tree along any data which will be useful to keep on the search stack. Currently, this additional information consists only of whether a bounds check should be done. This lets us skip a redundant bounds check on the root of the tree.
  
<pre>
+
//
public SortedMap<Double, List<T>>
+
// class SearchStackEntry
search(double[] query, int nMinResults) {
+
//
  // Forward to a static method to avoid accidental reference to
+
  // instance variables while descending the tree
+
private static class SearchStackEntry<T extends Exemplar> {
  return search(this, query, nMinResults);
+
  public final boolean needBoundsCheck;
}
+
  public final OptKdTree<T> tree;
 +
 +
  public SearchStackEntry(boolean needBoundsCheck, OptKdTree<T> tree) {
 +
    this.needBoundsCheck = needBoundsCheck;
 +
    this.tree = tree;
 +
  }
 +
}
  
// May return more results than requested if multiple exemplars have
+
* <code>SearchState</code> stores the current state of the search. Its sole purpose is to let us separate searching sub-trees and leaves into different methods. This breaks up a long method and makes profiling easier.
// same distance from target.
 
//
 
// Note: this function works with squared distances to avoid sqrt()
 
// operations
 
private static <T extends Exemplar> SortedMap<Double, List<T>>
 
search(BucketKdTree<T> tree, double[] query, int nMinResults) {
 
  // distance => list of points that distance away from query
 
  final NavigableMap<Double, List<T>> results =
 
    new TreeMap<Double, List<T>>();
 
  
  final SearchState state = new SearchState();
+
//
  final Deque<SearchStackEntry<T>> stack =
+
// class SearchState
    new LinkedList<SearchStackEntry<T>>();
+
//
  stack.addFirst(new SearchStackEntry<T>(state.maxDistance, tree));
+
// Holds data about current state of the search. Used for live updating
  while (!stack.isEmpty()) {
+
// of pruning distance.
    final SearchStackEntry<T> entry = stack.removeFirst();
+
    final BucketKdTree<T> cur = entry.tree;
+
private static class SearchState<X extends Exemplar> {
 +
  final int nResults;
 +
  final PriorityQueue<PrioNode<X>> results;
 +
 +
  public SearchState(int nResults) {
 +
    this.nResults = nResults;
 +
    results = new PriorityQueue<PrioNode<X>>(nResults,
 +
      new Comparator<PrioNode<X>>() {
 +
 +
        public int
 +
        compare(PrioNode<X> o1, PrioNode<X> o2) {
 +
          return o1.priority == o2.priority ? 0
 +
            : o1.priority > o2.priority ? -1 : 1;
 +
        }
 +
 +
      });
 +
  }
 +
}
  
    if (cur.isTree()) {
+
Our search method does a depth-first search using a stack. Each time we pop our stack, we do a bounds check if necessary, then defer to the appropriate method.
      searchTree(query, nMinResults, cur, state, stack);
 
    } else if (entry.minDFromQ <= state.maxDistance
 
      || state.nResults < nMinResults)
 
    {
 
      searchLeaf(query, nMinResults, cur, state, results);
 
    }
 
  }
 
  
  return results;
+
public Iterable<PrioNode<X>>
}
+
search(double[] query, int nResults) {
</pre>
+
  // Forward to a static method to avoid accidental reference to
 +
  // instance variables while descending the tree
 +
  return search(this, query, nResults);
 +
}
 +
 +
private static <X extends Exemplar> Iterable<PrioNode<X>>
 +
search(OptKdTree<X> tree, double[] query, int nResults) {
 +
  final SearchState<X> state = new SearchState<X>(nResults);
 +
  final Deque<SearchStackEntry<X>> stack =
 +
    new LinkedList<SearchStackEntry<X>>();
 +
  if (tree.contentMin != null)
 +
    stack.addLast(new SearchStackEntry<X>(false, tree));
  
===Searching a sub-tree===
+
  while (!stack.isEmpty()) {
Here is where we see two of our optimisations come into play. For each non-empty sub-tree, we compute the minimum distance from the smallest hyperrect bounding that sub-tree's exemplars to our query point. If this distance exceeds our maximum distance, we can rule out the entire sub-tree. Of course, if we haven't found our k-potential nearest neighbours yet, then both sub-trees may contain potential nearest neighbours so we will add them to the search stack anyway.
+
    final SearchStackEntry<X> entry = stack.removeLast();
 +
    final OptKdTree<X> cur = entry.tree;
 +
 +
    if (entry.needBoundsCheck && state.results.size() >= nResults) {
 +
      final double d = minDistanceSqFrom(query,
 +
        cur.contentMin, cur.contentMax);
 +
      if (d > state.results.lastKey())
 +
        continue;
 +
    }
 +
 +
    if (cur.isTree()) {
 +
      searchTree(query, cur, stack, state.results.peek().priority);
 +
    } else {
 +
      searchLeaf(query, cur, state);
 +
    }
 +
  }
 +
 +
  return state.results;
 +
}
  
As a heuristic, we want to search the sub-tree with an edge closest to the query point first. This is more likely to lead to earlier contraction of our search distance. This is why we add the farthest tree to our stack first.
+
====Searching a Tree====
 +
We basically want to add both sub-trees to the stack, if they are not empty. As a heuristic, we think the sub-tree on the same side of the split as our query point is more likely to contain closer points, so we search that sub-tree first by adding it to the stack last. You might try to avoid a distance calculation by not requiring a bounds check for the heuristically nearer sub-tree, but that does not seem to significantly improve performance.
  
<pre>
+
private static <X extends Exemplar> void
private static <T extends Exemplar> void
+
searchTree(double[] query, OptKdTree<X> tree,
searchTree(double[] query, int nMinResults, BucketKdTree<T> tree,
+
  Deque<SearchStackEntry<X>> stack, double maxPriority)
  SearchState searchState, Deque<SearchStackEntry<T>> stack)
+
{
{
+
  OptKdTree<X> nearTree = tree.left, farTree = tree.right;
  // Left is presumed near. This is verified further down.
+
  if (query[tree.splitDim] > tree.split) {
  BucketKdTree<T> nearTree = tree.left, farTree = tree.right;
+
    nearTree = tree.right;
  // These variables let us skip empty sub-trees
+
    farTree = tree.left;
  boolean nearEmpty = nearTree.minBounds == null;
+
  }
  boolean farEmpty = farTree.minBounds == null;
+
 +
  // These variables let us skip empty sub-trees
 +
  boolean nearEmpty = nearTree.contentMin == null;
 +
  boolean farEmpty = farTree.contentMin == null;
 +
 +
  // Add nearest sub-tree to stack later so we descend it
 +
  // first. This is likely to constrict our max distance
 +
  // sooner, resulting in less visited nodes
 +
  if (!farEmpty && minDistanceSqFrom(query,farTree.contentMin,farTree.contentMax) < maxPriority) {
 +
    stack.addLast(new SearchStackEntry<X>(true, farTree));
 +
  }
 +
 +
  if (!nearEmpty && minDistanceSqFrom(query,nearTree.contentMin,nearTree.contentMax) < maxPriority) {
 +
    stack.addLast(new SearchStackEntry<X>(true, nearTree));
 +
  }
 +
}
  
  // Find distance from nearest possible point in each
+
====Searching a Leaf====
  // sub-tree to query. If that is greater than max distance,
+
By the time we reach a leaf, there is little left to optimise away. We must go through each exemplar in turn to see if it is a nearest neighbour to our query point.
  // we can rule out that sub-tree.
 
  double nearD = nearEmpty ? Double.POSITIVE_INFINITY
 
    : minDistanceSqFrom(query,
 
      nearTree.minBounds, nearTree.maxBounds);
 
  double farD = farEmpty ? Double.POSITIVE_INFINITY
 
    : minDistanceSqFrom(query,
 
      farTree.minBounds, farTree.maxBounds);
 
  
  // Swap near and far if they're incorrect
+
private static <X extends Exemplar> void
  if (farD < nearD) {
+
searchLeaf(double[] query, OptKdTree<X> leaf, SearchState<X> state) {
    final double tmpD = nearD;
+
  double exD = Double.NaN;
    final BucketKdTree<T> tmpTree = nearTree;
+
  for(X ex : leaf.data) {
    final boolean tmpEmpty = nearEmpty;
+
    if (!leaf.singularity || Double.isNaN(exD)) {
 +
      exD = distanceSqFrom(query, ex.domain);
 +
    }
 +
 +
    if (examine(exD, state)) {
 +
      while (state.results.size() >= state.nResults) {
 +
        state.results.poll();
 +
      }
 +
 +
      state.results.offer(new PrioNode<X>(exD, ex));
 +
    }
 +
  }
 +
}
 +
 +
private static <X extends Exemplar> boolean
 +
examine(double distance, SearchState<X> state) {
 +
  return state.results.size() < state.nResults
 +
    || distance < state.results.peek().priority;
 +
}
  
    nearD = farD;
+
===Evaluation===
    nearTree = farTree;
+
From the same benchmark run, our optimised tree obtains the following results:
    nearEmpty = farEmpty;
 
  
    farD = tmpD;
+
RESULT << k-nearest neighbours search with Voidious' Linear search >>
    farTree = tmpTree;
+
: Average searching time      = 0.5740 miliseconds
    farEmpty = tmpEmpty;
+
: Average worst searching time = 2.3482 miliseconds
   }
+
: Average adding time          = 0.4528 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
 +
: Average searching time      = 0.0378 miliseconds
 +
: Average worst searching time = 0.2625 miliseconds
 +
: Average adding time          = 1.5687 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
 +
: Average searching time      = 0.3939 miliseconds
 +
: Average worst searching time = 11.5167 miliseconds
 +
: Average adding time          = 0.5165 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with duyn's optimised kd-tree >>
 +
: Average searching time      = 0.0536 miliseconds
 +
: Average worst searching time = 1.6742 miliseconds
 +
: Average adding time          = 2.9380 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
BEST RESULT:
 +
  - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
 +
[...]
 +
  - #4 duyn's optimised kd-tree [0.0536]
 +
[...]
 +
  - #8 duyn's basic kd-tree [0.3939]
 +
   - #9 Voidious' Linear search [0.5740]
 +
 +
This is promising. We are a whole order of magnitude faster and finally in the same league as other ''k''d-tree implementations on this wiki. We have not changed our API, so the following shim is only for completeness:
  
  // Add nearest sub-tree to stack later so we descend it
+
* DuynOptKNNSearch.java
  // first. This is likely to constrict our max distance
 
  // sooner, resulting in less visited nodes
 
  if (!farEmpty && (farD <= searchState.maxDistance
 
                    || searchState.nResults < nMinResults))
 
  {
 
    stack.addFirst(new SearchStackEntry<T>(farD, farTree));
 
  }
 
  
  if (!nearEmpty && (nearD <= searchState.maxDistance
+
public class DuynOptKNNSearch extends KNNImplementation {
                    || searchState.nResults < nMinResults))
+
  final OptKdTree<StringExemplar> tree;
  {
+
  public DuynOptKNNSearch(final int dimension) {
    stack.addFirst(new SearchStackEntry<T>(nearD, nearTree));
+
    super(dimension);
  }
+
    tree = new OptKdTree<StringExemplar>();
}
+
  }
</pre>
+
 +
  @Override public void
 +
  addPoint(final double[] location, final String value) {
 +
    tree.add(new StringExemplar(location, value));
 +
  }
 +
 +
  @Override public String
 +
  getName() {
 +
    return "duyn's optimised kd-tree";
 +
  }
 +
 +
  @Override
 +
  public KNNPoint[]
 +
  getNearestNeighbors(final double[] location, final int size) {
 +
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
 +
    for(PrioNode<StringExemplar> sr : tree.search(location, size)) {
 +
      final double distance = sr.priority;
 +
      final StringExemplar pt = sr.data;
 +
      justPoints.add(new KNNPoint(pt.value, distance));
 +
    }
 +
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
 +
    return justPoints.toArray(retVal);
 +
  }
 +
 +
  class StringExemplar extends Exemplar {
 +
    public final String value;
 +
    public StringExemplar(final double[] coords, final String value) {
 +
      super(coords);
 +
      this.value = value;
 +
    }
 +
  }
 +
}
  
===Searching a leaf===
+
==A Better Results Heap==
When we reach a leaf, we finally have a chance to contract our search distance. If we haven't yet found our k-potential nearest neighbours yet, or if the exemplar we're checking lies on the edge of our max search distance, we blindly add exemplars and update our max search distance.
+
A TreeMap does unnecessary processing, keeping all its keys in order when all we really care about is the farthest tentative result we have found. By implementing a custom heap, we can squeeze even more performance out of our tree.
  
Otherwise, we check the exemplar's distance from our query point. If it's closer than our furthest potential match, we drop our furthest result (unless doing so would result in too little k-potential neighbours) to make room for our new neighbour. Finally, we update our max search distance so we can skip any trees or points outside our new search distance.
+
In this section, we will implement an implicit binary heap. This structure does not have the best theoretical performance, but often does well in practice. More theoretically efficient heaps often rely on batching inserts, only fully processing them when a the top element is removed. Since we almost do one removal for every insert, this does not offer much benefit to us. For our usage pattern, the only heap faster than a binary heap is a buggy heap.
  
If all the exemplars in a leaf are uniform, we don't need to do this check for every exemplar. We just do it for the first one, and either add or reject them all.
+
===Implicit d-ary Heap===
 +
Implicit binary heaps can be very memory efficient since all elements are stored in an array. They don't grow as easily as other heaps, but for our purposes that is not a problem&mdash;we can set a max on the number of results we are interested in. A good description of implicit heaps is provided by John Bentley in [http://www.akira.ruc.dk/~keld/teaching/algoritmedesign_f04/Artikler/02/ Programming Pearls: Thanks, Heaps].
  
<pre>
+
Below, we implement a d-ary heap. The math is similar to a binary heap, and there are some suggestions that a 4-way heap may be more efficient than a binary heap. Our heap can be used as a min- or a max-heap. Using the enum <code>Direction</code> ensures the user doesn't get this parameter wrong.
private static <T extends Exemplar> void
 
searchLeaf(double[] query, int nMinResults,
 
  BucketKdTree<T> leaf, SearchState searchState,
 
  NavigableMap<Double, List<T>> results)
 
{
 
  // Keep track of elements at max distance so we know
 
  // whether we can just drop entire list of furthest
 
  // exemplars
 
  final int nMinResultsBeforeAddition = nMinResults - 1;
 
  for(T ex : leaf.exemplars) {
 
    final double exD = distanceSqFrom(query, ex.domain);
 
    if (searchState.nResults < nMinResults
 
      || exD == searchState.maxDistance)
 
    {
 
      // Blindly add this exemplar
 
      List<T> exsAtD = getOrElseInit(results, exD);
 
      if (leaf.exemplarsAreUniform) {
 
        // No need to go through every one if all
 
        // exemplars are the same
 
        exsAtD.addAll(leaf.exemplars);
 
        searchState.nResults += leaf.exemplars.size();
 
      } else {
 
        exsAtD.add(ex);
 
        searchState.nResults++;
 
      }
 
  
      // Update information about furthest exemplars
+
// Final for performance, may be removed if sub-classing necessary
      if (exD > searchState.maxDistance
+
public final class FastBinaryHeap<T> implements Iterable<PrioNode<T>> {
        || searchState.maxDistance == Double.POSITIVE_INFINITY)
+
  private final int branches;
      {
+
  // Direction = -1 or 1
        searchState.maxDistance = exD;
+
  // root * direction > element * direction
      }
+
  private final int direction;
 +
  public static enum Direction {
 +
    MIN(-1), MAX(1);
 +
 +
    public final int value;
 +
    private Direction(int value) {
 +
      this.value = value;
 +
    }
 +
  }
 +
  public static final Direction MAX = Direction.MAX, MIN = Direction.MIN;
 +
 +
  private final PrioNode<T>[] data;
 +
  private int size = 0;
 +
  private static final int DEFAULT_N_CHILDREN = 4;
 +
 +
  public FastBinaryHeap(int capacity, Direction direction) {
 +
    this(capacity, DEFAULT_N_CHILDREN, direction);
 +
  }
 +
 +
  @SuppressWarnings("unchecked")
 +
  public FastBinaryHeap(int capacity, int branches, Direction dir) {
 +
    this.branches = branches;
 +
    // This implicit conversion is safe because we only refer to
 +
    // this array through our data member.
 +
    this.data = new PrioNode[capacity];
 +
    this.direction = dir.value;
 +
  }
 +
}
  
      if (searchState.maxDistance == exD)
+
We implement an interface similar to the standard PriorityQueue:
        searchState.nExsAtMaxD = exsAtD.size();
 
  
    } else if (exD < searchState.maxDistance) {
+
public Iterator<PrioNode<T>>
      // Point closer than furthest neighbour
+
iterator() {
 +
  return new Iterator<PrioNode<T>>() {
 +
 +
    int idx = 0;
 +
 +
    public boolean
 +
    hasNext() { return idx < size; }
 +
 +
    public PrioNode<T>
 +
    next() { return data[idx++]; }
 +
 +
    public void
 +
    remove() {
 +
      throw new UnsupportedOperationException("Not supported.");
 +
    }
 +
 +
  };
 +
}
 +
 +
public int
 +
size() { return size; }
 +
 +
public PrioNode<T>
 +
peek() { return data[0]; }
 +
 +
Our offer() method takes a priority separately from the data we want to associate with it. We will store the priority separately so we don't fall into the same performance trap as PriorityQueue. We do not implement a poll() method because we don't need it. If we are about to overflow, we will just replace our top element with the new one.
  
      if (searchState.nResults - searchState.nExsAtMaxD
+
public boolean
        >= nMinResultsBeforeAddition)
+
offer(double priority, T result) {
      {
+
  if (size < data.length) {
        // Dropping furthest exemplars won't leave
+
    // Trivial case
        // us with too little to meet return
+
    siftUp(size, new PrioNode<T>(priority, result));
        // minimum
+
    size++;
        results.remove(searchState.maxDistance);
+
    return true;
        searchState.nResults -= searchState.nExsAtMaxD;
+
  }
      }
+
 +
  if (priority*direction > data[0].priority*direction) {
 +
    System.err.printf("Do not want %f when have %f\n",
 +
      priority, data[0].priority);
 +
    return false;
 +
  }
 +
 +
  siftDown(0, new PrioNode<T>(priority, result));
 +
  return true;
 +
}
  
      // Add new exemplar
+
Now to the meat of our heap.
      List<T> exsAtD = getOrElseInit(results, exD);
 
      if (leaf.exemplarsAreUniform) {
 
        // No need to go through every one if all
 
        // exemplars are the same
 
        exsAtD.addAll(leaf.exemplars);
 
        searchState.nResults += leaf.exemplars.size();
 
      } else {
 
        exsAtD.add(ex);
 
        searchState.nResults++;
 
      }
 
  
      // Update information about furthest exemplars
+
* siftUp treats the given index as empty and progressively movies parents into the hole left behind if inserting the given element there would violate heap order.
      Map.Entry<Double, List<T>> lastEnt = results.lastEntry();
+
: Parents are calculated as follows:
      searchState.maxDistance = lastEnt.getKey();
+
: <math>parent[i] = \left\lfloor {i - 1\over branches} \right\rfloor</math>
      searchState.nExsAtMaxD = lastEnt.getValue().size();
 
    } // exD < maxDistance
 
  
    // No need to keep going if all exemplars are
+
: Once we have that, we can compare the values at both indexes. Multiplying them by <code>direction</code> lets us use the same code for both a min- and a max-heap.
    // the same
+
    if (leaf.exemplarsAreUniform) break;
+
private void
  } // for(T ex : cur.exemplars)
+
siftUp(int hole, PrioNode<T> value) {
}
+
  if (size > 0) {
 +
    final double vPD = value.priority*direction;
 +
    while (hole > 0) {
 +
      final int parent = (hole - 1)/branches;
 +
      if (vPD > data[parent].priority*direction) {
 +
        data[hole] = data[parent];
 +
        hole = parent;
 +
      } else {
 +
        break;
 +
      }
 +
    }
 +
  }
 +
 +
  data[hole] = value;
 +
}
  
private static <T extends Exemplar> List<T>
+
* siftDown does the opposite, starting from the given index and moving elements up if inserting the new value into the hole would violate heap order. It swaps the element with its worst heap-offending child so that with a single swap, it can restore heap order to that level.
getOrElseInit(Map<Double, List<T>> results, double dst) {
+
: In a 0-based index, children are calculated as follows:
  List<T> lst = results.get(dst);
+
: <math>\begin{align}child_{1}[i] & =branches\times i+1\\
  if (lst == null) {
+
\ldots\\
    lst = new LinkedList<T>();
+
child_{branches}[i] & =branches\times i+branches\end{align}</math>
    results.put(dst, lst);
 
  }
 
  return lst;
 
}
 
</pre>
 
  
==Assessment==
+
private void
Initial performance benchmarks are promising. Randomised JUnit tests indicate (not very rigorously) that searching can be 2&ndash;7x faster than a linear search. Using the [[k-NN algorithm benchmark]] with the [http://homepages.ucalgary.ca/~agschult/robocode/gun-data-Diamond-vs-jk.mini.CunobelinDC%200.3.csv.gz Diamond vs CunobelinDC gun data] yields the following performance:
+
siftDown(int hole, PrioNode<T> value) {
 +
  if (size > 0) {
 +
    // Push down along min path
 +
    final double vPD = value.priority*direction;
 +
    int firstChild;
 +
    while ((firstChild = branches*hole + 1) < data.length) {
 +
 +
      // Find largest/smallest child
 +
      double swapPD = data[firstChild].priority*direction;
 +
      int swapChild = firstChild;
 +
      for(int c = firstChild + 1;
 +
        c < firstChild + branches && c < size; c++)
 +
      {
 +
        final double cPD = data[c].priority*direction;
 +
        if (cPD > swapPD) {
 +
          swapPD = cPD;
 +
          swapChild = c;
 +
        }
 +
      }
 +
 +
      if (swapPD > vPD) {
 +
        data[hole] = data[swapChild];
 +
        hole = swapChild;
 +
      } else {
 +
        break;
 +
      }
 +
    }
 +
  }
 +
 +
  data[hole] = value;
 +
}
  
<pre>
+
Both of these are slightly different from standard textbook presentations of the algorithm. The versions here do less swaps by shifting other elements out of the way until the new element's rightful place has been found.
K-NEAREST NEIGHBOURS ALGORITHMS BENCHMARK
 
-----------------------------------------
 
Reading data from gun-data-Diamond-vs-jk.mini.CunobelinDC 0.3.csv.gz
 
Read 25621 saves and 10300 searches.
 
  
Running 30 repetition(s) for k-nearest neighbours searching:
+
===Changes to our Tree===
:: 13 dimension(s); 40 neighbour(s)
+
Because the only change we have made is to the method of collecting results, we will skim over the changes necessary.
Warming up the JIT with 5 repetitions first...
 
  
Running tests... COMPLETED.
+
public final class FastKdTree<X extends Exemplar> {
 +
....
 +
}
  
RESULT << k-nearest neighbours search with Voidious' Linear search >>
+
* Our SearchState will have to be updated to use our new heap:
: Average searching time      = 0.57 miliseconds
 
: Average worst searching time = 14.992 miliseconds
 
: Average adding time          = 0.55 microseconds
 
: Accuracy                    = 100%
 
  
RESULT << k-nearest neighbours search with Simonton's Bucket PR k-d tree >>
+
//
: Average searching time      = 0.16 miliseconds
+
// class SearchState
: Average worst searching time = 20.109 miliseconds
+
//
: Average adding time          = 1.41 microseconds
+
// Holds data about current state of the search. Used for live updating
: Accuracy                    = 100%
+
// of pruning distance.
 +
 +
private static class SearchState<X extends Exemplar> {
 +
  final int nResults;
 +
  final FastBinaryHeap<X> results;
 +
 +
  public SearchState(int nResults) {
 +
    this.nResults = nResults;
 +
    results = new FastBinaryHeap<X>(
 +
      nResults, 4, FastBinaryHeap.MAX);
 +
  }
 +
}
  
RESULT << k-nearest neighbours search with Nat's Bucket PR k-d tree >>
+
* Our searchLeaf() method no longer needs to explicitly poll the top since our heap will just replace it if it has no space:
: Average searching time      = 0.361 miliseconds
 
: Average worst searching time = 304.663 miliseconds
 
: Average adding time          = 66.64 microseconds
 
: Accuracy                    = 31%
 
  
RESULT << k-nearest neighbours search with Voidious' Bucket PR k-d tree >>
+
private static <X extends Exemplar> void
: Average searching time      = 0.216 miliseconds
+
searchLeaf(double[] query, FastKdTree<X> leaf, SearchState<X> state) {
: Average worst searching time = 100.556 miliseconds
+
  double exD = Double.NaN;
: Average adding time          = 1.58 microseconds
+
  for(X ex : leaf.data) {
: Accuracy                    = 100%
+
    if (!leaf.singularity || Double.isNaN(exD)) {
 +
      exD = distanceSqFrom(query, ex.domain);
 +
    }
 +
 +
    if (examine(exD, state)) {
 +
      state.results.offer(exD, ex);
 +
    }
 +
  }
 +
}
  
RESULT << k-nearest neighbours search with duyn's Bucket kd-tree >>
+
* We have also replaced all instances of LinkedLists with ArrayDeques. This is because although theoretically LinkedLists are appropriate, the javadocs for ArrayDeque state that:
: Average searching time      = 0.079 miliseconds
+
: "This class is likely to be faster than Stack when used as a stack, and faster than LinkedList  when used as a queue."&mdash;[http://java.sun.com/javase/6/docs/api/java/util/ArrayDeque.html Java 6 API docs]
: Average worst searching time = 17.476 miliseconds
 
: Average adding time          = 2.45 microseconds
 
: Accuracy                    = 100%
 
  
RESULT << k-nearest neighbours search with Rednaxela's Bucket kd-tree >>
+
public FastKdTree(int bucketSize) {
: Average searching time      = 0.053 miliseconds
+
  this.bucketSize = bucketSize;
: Average worst searching time = 0.296 miliseconds
+
  this.data = new ArrayDeque<X>();
: Average adding time          = 1.69 microseconds
+
}
: Accuracy                    = 100%
+
 +
@SuppressWarnings("unchecked") private static <X extends Exemplar> void
 +
split(FastKdTree<X> tree) {
 +
  ...
 +
  final Queue<X> leftExs = new ArrayDeque<X>();
 +
  final Queue<X> rightExs = new ArrayDeque<X>();
 +
  ...
 +
}
 +
 +
private static <X extends Exemplar> Iterable<PrioNode<X>>
 +
search(FastKdTree<X> tree, double[] query, int nResults) {
 +
  final SearchState<X> state = new SearchState<X>(nResults);
 +
  final Deque<SearchStackEntry<X>> stack =
 +
    new ArrayDeque<SearchStackEntry<X>>();
 +
  ...
 +
}
  
 +
===Evaluation===
 +
Our improved heap squeezes a bit more performance (~10%) out of our tree:
  
BEST RESULT:  
+
RESULT << k-nearest neighbours search with Voidious' Linear search >>
  - #1 Rednaxela's Bucket kd-tree [0.0529]
+
: Average searching time      = 0.5740 miliseconds
  - #2 duyn's Bucket kd-tree [0.0787]
+
: Average worst searching time = 2.3482 miliseconds
  - #3 Simonton's Bucket PR k-d tree [0.1598]
+
: Average adding time          = 0.4528 microseconds
- #4 Voidious' Bucket PR k-d tree [0.2165]
+
: Accuracy                    = 100.0%
  - #5 Nat's Bucket PR k-d tree [0.3605]
+
- #6 Voidious' Linear search [0.5699]
+
[...]
 +
   
 +
RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
 +
: Average searching time      = 0.0378 miliseconds
 +
: Average worst searching time = 0.2625 miliseconds
 +
: Average adding time          = 1.5687 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
 +
: Average searching time      = 0.3939 miliseconds
 +
: Average worst searching time = 11.5167 miliseconds
 +
: Average adding time          = 0.5165 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with duyn's optimised kd-tree >>
 +
: Average searching time      = 0.0536 miliseconds
 +
: Average worst searching time = 1.6742 miliseconds
 +
: Average adding time          = 2.9380 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with duyn's fast kd-tree >>
 +
: Average searching time      = 0.0479 miliseconds
 +
: Average worst searching time = 25.0995 miliseconds
 +
: Average adding time          = 2.1288 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
   
 +
BEST RESULT:
 +
  - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
 +
  [...]
 +
  - #3 duyn's fast kd-tree [0.0479]
 +
  - #4 duyn's optimised kd-tree [0.0536]
 +
  [...]
 +
  - #8 duyn's basic kd-tree [0.3939]
 +
  - #9 Voidious' Linear search [0.5740]
  
Benchmark running time: 545.43 seconds
+
Our client code has barely changed:
</pre>
 
  
This is not an exhaustive test, but it shows we're off to a nice start. The following code was used to run the benchmark:
+
* DuynFastKNNSearch.java
 
+
* KNNBenchmark.java
+
public class DuynFastKNNSearch extends KNNImplementation {
  <pre>
+
  final FastKdTree<StringExemplar> tree;
public KNNBenchmark(final int dimension, final int numNeighbours, final SampleData[] samples, final int numReps) {
+
  public DuynFastKNNSearch(final int dimension) {
  final Class<?>[] searchAlgorithms = new Class<?>[] {
+
    super(dimension);
     FlatKNNSearch.class,
+
    tree = new FastKdTree<StringExemplar>();
    SimontonTreeKNNSearch.class,
+
  }
     NatTreeKNNSearch.class,
+
     VoidiousTreeKNNSearch.class,
+
  @Override public void
+   DuynTreeImpl.class,
+
  addPoint(final double[] location, final String value) {
    RednaxelaTreeKNNSearch.class,
+
    tree.add(new StringExemplar(location, value));
  };
+
  }
   ...
+
 +
  @Override public String
 +
  getName() {
 +
    return "duyn's fast kd-tree";
 +
  }
 +
 +
  @Override public KNNPoint[]
 +
  getNearestNeighbors(final double[] location, final int size) {
 +
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
 +
     for(PrioNode<StringExemplar> sr : tree.search(location, size)) {
 +
      final double distance = sr.priority;
 +
      final StringExemplar pt = sr.data;
 +
      justPoints.add(new KNNPoint(pt.value, distance));
 +
     }
 +
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
 +
     return justPoints.toArray(retVal);
 +
  }
 +
 +
   class StringExemplar extends Exemplar {
 +
    public final String value;
 +
    public StringExemplar(final double[] coords, final String value) {
 +
      super(coords);
 +
      this.value = value;
 +
    }
 +
   }
 
  }
 
  }
  </pre>
 
  
* DuynTreeImpl.java
+
==Full Benchmark Results==
  <pre>
+
Below are the full benchmark results quoted throughout this page. It puts the performance of our trees in context compared to other ''k''d-trees available on this wiki. The benchmark ([[k-NN algorithm benchmark]]) was run with the [http://homepages.ucalgary.ca/~agschult/robocode/gun-data-Diamond-vs-jk.mini.CunobelinDC%200.3.csv.gz Diamond vs CunobelinDC gun data].
package net.robowiki.knn.implementations;
 
import net.robowiki.knn.util.KNNPoint;
 
import java.util.*;
 
// Namespace I use for my kd-tree
 
import dn.j1.algorithm.nearestneighbours.*;
 
  
public class DuynTreeImpl extends KNNImplementation {
+
K-NEAREST NEIGHBOURS ALGORITHMS BENCHMARK
   final BucketKdTree<KNNPointAdapter> tree;
+
-----------------------------------------
  public DuynTreeImpl(final int dimension) {
+
Reading data from gun-data-Diamond-vs-jk.mini.CunobelinDC 0.3.csv.gz
    super(dimension);
+
Read 25621 saves and 10300 searches.
    tree = new BucketKdTree<KNNPointAdapter>(10, dimension);
+
   }
+
Running 30 repetition(s) for k-nearest neighbours searching:
 +
:: 13 dimension(s); 40 neighbour(s)
 +
Warming up the JIT with 5 repetitions first...
 +
 +
Running tests...test 1/30
 +
[...]
 +
   COMPLETED.
 +
 +
RESULT << k-nearest neighbours search with Voidious' Linear search >>
 +
: Average searching time      = 0.5740 miliseconds
 +
: Average worst searching time = 2.3482 miliseconds
 +
: Average adding time          = 0.4528 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with Simonton's Bucket PR k-d tree >>
 +
: Average searching time      = 0.1609 miliseconds
 +
: Average worst searching time = 1.4184 miliseconds
 +
: Average adding time          = 1.3322 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with Voidious' Bucket PR k-d tree >>
 +
: Average searching time      = 0.2081 miliseconds
 +
: Average worst searching time = 21.9191 miliseconds
 +
: Average adding time          = 1.0301 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
 +
: Average searching time      = 0.0378 miliseconds
 +
: Average worst searching time = 0.2625 miliseconds
 +
: Average adding time          = 1.5687 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with Rednaxela's kd-tree (3rd gen) >>
 +
: Average searching time      = 0.0407 miliseconds
 +
: Average worst searching time = 0.2259 miliseconds
 +
: Average adding time          = 1.6342 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
 +
: Average searching time      = 0.3939 miliseconds
 +
: Average worst searching time = 11.5167 miliseconds
 +
: Average adding time          = 0.5165 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with duyn's optimised kd-tree >>
 +
: Average searching time      = 0.0536 miliseconds
 +
: Average worst searching time = 1.6742 miliseconds
 +
: Average adding time          = 2.9380 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with duyn's fast kd-tree >>
 +
: Average searching time      = 0.0479 miliseconds
 +
: Average worst searching time = 25.0995 miliseconds
 +
: Average adding time          = 2.1288 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with Chase-san's kd-tree (KDTreeC) >>
 +
: Average searching time      = 0.1200 miliseconds
 +
: Average worst searching time = 19.5727 miliseconds
 +
: Average adding time          = 1.4222 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
 +
BEST RESULT:
 +
  - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
 +
  - #2 Rednaxela's kd-tree (3rd gen) [0.0407]
 +
  - #3 duyn's fast kd-tree [0.0479]
 +
  - #4 duyn's optimised kd-tree [0.0536]
 +
  - #5 Chase-san's kd-tree (KDTreeC) [0.1200]
 +
  - #6 Simonton's Bucket PR k-d tree [0.1609]
 +
  - #7 Voidious' Bucket PR k-d tree [0.2081]
 +
  - #8 duyn's basic kd-tree [0.3939]
 +
   - #9 Voidious' Linear search [0.5740]
 +
 +
Benchmark running time: 587.77 seconds
  
  @Override public void
+
==Full Source Code==
  addPoint(final double[] location, final String value) {
+
The full source code for all three trees presented in this tutorial can be downloaded together: [[File:duyn_tutorial_kd_trees.jar]]. The code is available under the [[RWPCL]].
    tree.add(new KNNPointAdapter(location, value));
 
  }
 
 
 
  @Override public String
 
  getName() {
 
    return "duyn's Bucket kd-tree";
 
  }
 
 
 
  @Override public KNNPoint[]
 
  getNearestNeighbors(final double[] location, final int size) {
 
    final SortedMap<Double, List<KNNPointAdapter>> results = tree.search(location, size);
 
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
 
    for(final Map.Entry<Double, List<KNNPointAdapter>> entry : results.entrySet()) {
 
      final double distance = entry.getKey();
 
      for(final KNNPointAdapter pt : entry.getValue()) {
 
        justPoints.add(new KNNPoint(pt.value, distance));
 
      }
 
    }
 
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
 
    return justPoints.toArray(retVal);
 
  }
 
 
 
  class KNNPointAdapter extends Exemplar {
 
    public final String value;
 
    public KNNPointAdapter(final double[] coords, final String value) {
 
      super(coords);
 
      this.value = value;
 
    }
 
  }
 
}
 
  </pre>
 
  
 
==Extensions==
 
==Extensions==
This tree does not include any support for deletion or re-balancing because it was written to be used with standardised data. This would require periodic re-normalisation and thus re-builds, making deletion/re-balancing unnecessary.
+
This tree does not include any support for deletion or re-balancing. You can get away with this if you plan to re-build the entire tree regularly.
  
 
'''Deletion''' support would be easy to add to this tree. Since all exemplars are stored in leaves at the bottom, we need only walk the tree and remove the exemplar from the leaf containing it. We would then merge sub-trees if multiple deletions has left one of them empty. Unless our bucket width is really small, we don't need to merge sub-trees after each deletion since the addition of a subsequent exemplar is unlikely to result in choosing a significantly different split point.
 
'''Deletion''' support would be easy to add to this tree. Since all exemplars are stored in leaves at the bottom, we need only walk the tree and remove the exemplar from the leaf containing it. We would then merge sub-trees if multiple deletions has left one of them empty. Unless our bucket width is really small, we don't need to merge sub-trees after each deletion since the addition of a subsequent exemplar is unlikely to result in choosing a significantly different split point.
Line 693: Line 1,226:
 
'''Re-balancing''' with this tree is trickier. It does not currently store depth, so we don't know when a re-balance is needed. Because exemplars are only stored in leaves, collecting all the exemplars under a sub-tree would require exhaustively walking that sub-tree.
 
'''Re-balancing''' with this tree is trickier. It does not currently store depth, so we don't know when a re-balance is needed. Because exemplars are only stored in leaves, collecting all the exemplars under a sub-tree would require exhaustively walking that sub-tree.
  
* If a depth is to be stored, each tree would need a pointer to its parent tree so we propogate depth updates when a leaf is split.
+
* If a depth is to be stored, each tree would need a pointer to its parent tree so we propogate depth updates when a leaf is split.  
  
 
* We can avoid an exhaustive walk by storing each exemplar in every tree on its path from root to leaf. We then can re-balance easily by dropping both sub-trees and splitting the given tree anew. This approach has the down side of making inserts and deletes more complex since we have to update the exemplar list of every tree along the path from root to leaf.
 
* We can avoid an exhaustive walk by storing each exemplar in every tree on its path from root to leaf. We then can re-balance easily by dropping both sub-trees and splitting the given tree anew. This approach has the down side of making inserts and deletes more complex since we have to update the exemplar list of every tree along the path from root to leaf.
 
==Full Source Code==
 
For the full source code to the tree built in this tutorial, see [[User:Duyn/BucketKdTree|duyn's Bucket kd-tree]].
 
  
 
==See also==
 
==See also==
* [[Kd-tree]]. For further information on kd-trees.
+
* [[Kd-tree]] for further information on kd-trees.
* [http://www.autonlab.org/autonweb/14665 Andrew Moore, 'An intoductory tutorial on kd']. A good introduction to kd-trees.
+
* Andrew Moore, ''[http://www.autonlab.org/autonweb/14665 An intoductory tutorial on kd]''. A good introduction to kd-trees.
* [http://ilpubs.stanford.edu:8090/723/ Neal Sample, Matthew Haines, Mark Arnold, Timothy Purcell, 'Optimizing Search Strategies in k-d Trees']. Details some of the optimisations used in this tree.
+
* Neal Sample, Matthew Haines, Mark Arnold, Timothy Purcell, ''[http://ilpubs.stanford.edu:8090/723/ Optimizing Search Strategies in k-d Trees]''. Details some of the optimisations used in this tree.

Latest revision as of 15:43, 27 January 2012

[Work-in-Progress]

This page will walk you through the implementation of a kd-tree. We start with a basic tree and progressively add levels of optimisation. Writing a fully optimised kd-tree is an advanced task, but starting with the basics helps. There are already other kd-trees on this wiki, see some of the others for ideas.

Theory

A k-d tree is a binary tree which successively splits a k-dimensional space in half. This lets us speed up a nearest neighbour search by not examining points in partitions which are too far away. A k-nearest neighbour search can be implemented from a nearest neighbour algorithm by not shrinking the search radius until k items have been found. This page won't make a distinction between the two.

An Exemplar class

We will call each item in our k-d tree an exemplar. Each exemplar has a domain—its co-ordinates in the k-d tree's space. Each exemplar could also have an arbitrary payload, but our tree does not need to know about that. It will only handle storing exemplars based on their domain and returning them in a nearest neighbour search.

We might already have a class elsewere called Point which handles 2D co-ordinates. This terminology avoids conflict with that.

public class Exemplar {
  public final double[] domain;

  public Exemplar(double[] domain) {
    this.domain = domain;
  }

  public final boolean
  collocated(final Exemplar other) {
    return Arrays.equals(domain, other.domain);
  }
}

While this class is fully usable as is, rarely will the domain of nearest neighbours be of any interest. Often, useful data (such as GuessFactors) will be loaded by sub-classing this Exemplar class. Our k-d tree will be parameterised based on this expectation.

Basic Tree

First, we will build a basic kd-tree as described by Wikipedia's kd-tree page. We start off with a standard binary tree. Each tree is either a node with both left and right sub-trees defined, or a leaf carrying an exemplar. For simplicity, we won't allow nodes to contain any exemplars.

public class BasicKdTree<X extends Exemplar> {
  // Basic tree structure
  X data = null;
  BasicKdTree<X> left = null, right = null;

  // Only need to test one branch since we always populate both
  // branches at once
  private boolean
  isTree() { return left != null; }

  ...
}

Adding

Each of the public API functions defers to a private static method. This is to avoid accidentally referring to instance variables while we walk the tree. This is a common pattern we will use for much of the actual behaviour code for this tree.

public void
add(X ex) {
  BasicKdTree.addToTree(this, ex);
}

To add an exemplar, we traverse the tree from top down until we find a leaf. If the leaf doesn't already have an exemplar, we can just stow it there. Otherwise, we split the leaf and put the free exemplars in each sub-tree.

To split a leaf, we cycle through split dimensions in order as in most descriptions of kd-trees. Because we only allow each leaf to hold a single exemplar, we can use a simple splitting strategy: smallest on the left, largest on the right. The split value is half way between the two points along the split dimension.

int splitDim = 0;
double split = Double.NaN;

private static <X extends Exemplar> void
addToTree(BasicKdTree<X> tree, X ex) {
  while(tree != null) {
    if (tree.isTree()) {
      // Traverse in search of a leaf
      tree = ex.domain[tree.splitDim] <= tree.split
        ? tree.left : tree.right;
    } else {
      if (tree.data == null) {
        tree.data = ex;
      } else {
        // Split tree and add
        // Find smallest exemplar to be our split point
        final int d = tree.splitDim;
        X leftX = ex, rightX = tree.data;
        if (rightX.domain[d] < leftX.domain[d]) {
          leftX = tree.data;
          rightX = ex;
        }
        tree.split = 0.5*(leftX.domain[d] + rightX.domain[d]);
        
        final int nextSplitDim =
          (tree.splitDim + 1)%tree.dimensions();
        
        tree.left = new BasicKdTree<X>();
        tree.left.splitDim = nextSplitDim;
        tree.left.data = leftX;
        
        tree.right = new BasicKdTree<X>();
        tree.right.splitDim = nextSplitDim;
        tree.right.data = rightX;
      }

      // Done.
      tree = null;
    }
  }
}

private int
dimensions() { return data.domain.length; }

Searching

Before we start coding a search method, we need a helper class to store search results along with their distance from the query point. This is called PrioNode because we will eventually re-use it to implement a custom priority queue.

public final class PrioNode<T> {
  public final double priority;
  public final T data;

  public PrioNode(double priority, T data) {
    this.priority = priority;
    this.data = data;
  }
}

Like with our add() method, our search() method delegates to a static method to avoid introducing bugs by accidentally referring to member variables while we descend the tree.

public Iterable<? extends PrioNode<X>>
search(double[] query, int nResults) {
  return BasicKdTree.search(this, query, nResults);
}

To do a nearest neighbours search, we walk the tree, preferring to search sub-trees which are on the same side of the split as the query point first. Once we have found our initial candidates, we can contract our search sphere. We only search the other sub-tree if our search sphere might spill over onto the other side of the split.

Results are collected in a java.util.PriorityQueue so we can easily remove the farthest exemplars as we find closer ones.

private static <X extends Exemplar> Iterable<? extends PrioNode<X>>
search(BasicKdTree<X> tree, double[] query, int nResults) {
  final Queue<PrioNode<X>> results =
    new PriorityQueue<PrioNode<X>>(nResults,
      new Comparator<PrioNode<X>>() {

        // min-heap
        public int
        compare(PrioNode<X> o1, PrioNode<X> o2) {
          return o1.priority == o2.priority ? 0
            : o1.priority > o2.priority ? -1
            : 1;
        }

      }
    );
  final Deque<BasicKdTree<X>> stack
    = new LinkedList<BasicKdTree<X>>();
  stack.addLast(tree);
  while (!stack.isEmpty()) {
    tree = stack.removeLast();
    
    if (tree.isTree()) {
      // Guess nearest tree to query point
      BasicKdTree<X> nearTree = tree.left, farTree = tree.right;
      if (query[tree.splitDim] > tree.split) {
        nearTree = tree.right;
        farTree = tree.left;
      }
      
      // Only search far tree if our search sphere might
      // overlap with splitting plane
      if (results.size() < nResults
        || sq(query[tree.splitDim] - tree.split)
          <= results.peek().priority)
      {
        stack.addLast(farTree);
      }

      // Always search the nearest branch
      stack.addLast(nearTree);
    } else {
      final double dSq = distanceSqFrom(query, tree.data.domain);
      if (results.size() < nResults
        || dSq < results.peek().priority)
      {
        while (results.size() >= nResults) {
          results.poll();
        }

        results.offer(new PrioNode<X>(dSq, tree.data));
      }
    }
  }
  return results;
}
 
private static double
sq(double n) { return n*n; }

Our distance calculation is optimised because it will be called often. We use squared distances to avoid an unnecessary sqrt() operation. We also don't use Math.pow() because the JRE's default implementation must do some extra work before it can return with a simple multiplication (see the source in fdlibm).

private static double
distanceSqFrom(double[] p1, double[] p2) {
  // Note: profiling shows this is called lots of times, so it pays
  // to be well optimised
  double dSq = 0;
  for(int d = 0; d < p1.length; d++) {
    final double dst = p1[d] - p2[d];
    if (dst != 0)
      dSq += dst*dst;
  }
  return dSq;
}

And that's it. Barely 200 lines of code and we have ourselves a working kd-tree.

Evaluation

As a guide to performance, we will use the k-NN algorithm benchmark with the Diamond vs CunobelinDC gun data. For comparison, the benchmark included an optimised linear search to represent the lower bound of performance, and Rednaxela's tree—credited to be the fastest tree on the wiki. Full results are near the bottom of the page. Here are the relevant extracts for our tree:

RESULT << k-nearest neighbours search with Voidious' Linear search >>
: Average searching time       = 0.5740 miliseconds
: Average worst searching time = 2.3482 miliseconds
: Average adding time          = 0.4528 microseconds
: Accuracy                     = 100.0%

[...]

RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
: Average searching time       = 0.0378 miliseconds
: Average worst searching time = 0.2625 miliseconds
: Average adding time          = 1.5687 microseconds
: Accuracy                     = 100.0%

[...]

RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
: Average searching time       = 0.3939 miliseconds
: Average worst searching time = 11.5167 miliseconds
: Average adding time          = 0.5165 microseconds
: Accuracy                     = 100.0%

[...]

BEST RESULT: 
 - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
[...]
 - #8 duyn's basic kd-tree [0.3939]
 - #9 Voidious' Linear search [0.5740]

This test run showed that average add time was close to a linear search while searching performance improved by (0.3939/0.5740 - 1) ~= 31%. It's still an order of magnitude slower than Rednaxela's included tree, which shows there is a lot to gain by selecting appropriate optimisations.

The following code was used in the benchmark:

  • KNNBenchmark.java
public KNNBenchmark(final int dimension, final int numNeighbours, final SampleData[] samples, final int numReps) {
  final Class<?>[] searchAlgorithms = new Class<?>[] {
    FlatKNNSearch.class,
    SimontonTreeKNNSearch.class,
    NatTreeKNNSearch.class,
    VoidiousTreeKNNSearch.class,
+   DuynBasicKNNSearch.class,
    RednaxelaTreeKNNSearch.class,
  };
  ...
}
 
  • DuynBasicKNNSearch.java
public class DuynBasicKNNSearch extends KNNImplementation {
  final BasicKdTree<StringExemplar> tree;
  public DuynBasicKNNSearch(final int dimension) {
    super(dimension);
    tree = new BasicKdTree<StringExemplar>();
  }

  @Override public void
  addPoint(final double[] location, final String value) {
    tree.add(new StringExemplar(location, value));
  }

  @Override public String
  getName() {
    return "duyn's basic kd-tree";
  }

  @Override public KNNPoint[]
  getNearestNeighbors(final double[] location, final int size) {
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
    for(PrioNode<StringExemplar> sr : tree.search(location, size)) {
      final double distance = sr.priority;
      final StringExemplar pt = sr.data;
      justPoints.add(new KNNPoint(pt.value, distance));
    }
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
    return justPoints.toArray(retVal);
  }

  class StringExemplar extends Exemplar {
    public final String value;
    public StringExemplar(final double[] coords, final String value) {
      super(coords);
      this.value = value;
    }
  }
}

Optimisations to the Tree

In this section, we will introduce the following optimisations to our tree:

  • Bucket leaves. Leaves will store multiple exemplars, only being split when they overflow. This makes our split points less dependent on the order in which we receive points and gives us more opportunity to choose a better split point.
  • Bounds-overlap-ball testing. Each tree stores the bounds of the smallest hyperrect which will contain all the data in that tree. This is smaller—maybe significantly so—than the bounds which that sub-tree occupy, giving us more opportunities to skip sub-trees. We will check that the minimum distance from each tree's content bounds overlaps with our search hypersphere. This lets us avoid walking all the way to a leaf if a tree's content bounds overlap with our search hypersphere but none of its children's content bounds do.
  • Splitting choice. We will split on the mean of the dimension with the largest variance. Splitting on the mean does not guarantee that our tree will be balanced, but is easier for us since we would have already calculated the mean in finding the dimension with largest variance. This will add more overhead to our add times since we will compute running means and variances each time an exemplar is added.
  • Singularity tracking. We will actively check whether all the exemplars in a leaf have the same domain. This is necessary to avoid getting ourselves caught in an infinite loop trying to repeatedly split an unsplittable leaf. It also lets us avoid repeated distance calculations when searching for nearest neighbours. Practically, unless all your dimensions are limited, discrete values, this optimisation is unlikely to make much impact.

Basic Tree

Once again, we start off with a basic binary tree. Each tree must store its own bucket size since it will not have a reference to its parent Bucket size is not static because we might want to build multiple trees with different bucket sizes.

public final class OptKdTree<X extends Exemplar> {
  final Queue<X> data;
  OptKdTree<X> left = null, right = null;

  // Number of exemplars to hold in a leaf before splitting
  private final int bucketSize;
  private static final int DEFAULT_BUCKET_SIZE = 10;

  public OptKdTree() {
    this(DEFAULT_BUCKET_SIZE);
  }

  public OptKdTree(int bucketSize) {
    this.bucketSize = bucketSize;
    this.data = new LinkedList<X>();
  }

  private final boolean
  isTree() { return left != null; }

  [...]
}

Adding

Now that our code is a little more sophisticated, it becomes profitable to have a dedicated method to bulk-load our tree. This also makes it easier to add dynamic rebalance support later if we need it. To take advantage the additional information we get from bulk loading, we decide whether to split a leaf only after all exemplars have been added.

public void
add(X ex) {
  OptKdTree<X> tree = addNoSplit(this, ex);
  if (shouldSplit(tree)) {
    split(tree);
  }
}

public void
addAll(Collection<X> exs) {
  final Set<OptKdTree<X>> modTrees =
    new HashSet<OptKdTree<X>>();
  for(X ex : exs) {
    modTrees.add(addNoSplit(this, ex));
  }

  for(OptKdTree<X> tree : modTrees) {
    if (shouldSplit(tree)) {
      split(tree);
    }
  }
}

To add an exemplar, we traverse the tree from top down until we find a leaf. Then we add the exemplar to the list at that leaf. For now, there is no special ordering to the list of exemplars in a leaf. To decide which sub-tree to traverse, each tree stores two values—a splitting dimension and a splitting value. If our new exemplar's domain along the splitting dimension is greater than the tree's splitting value, we put it in the right sub-tree. Otherwise, it goes in the left one.

Since adding takes little time compared to searching, we take this opportunity to make some optimisations:

  • We keep track of the actual hyperrect the points in this tree occupy. This lets us rule out a tree, even though its space may intersect with our search sphere, if it has no chance of containing any points within our search sphere. This hyperrect is fully defined by just storing its two extreme corners.
  • To save us having to do a full iteration when we come to split a leaf, we compute the running mean and variance for each dimension using Welford's method:
   <math>
   M_1 = x_1,\qquad S_1 = 0</math>
   <math>
   M_k = M_{k-1} + {x_k - M_{k-1} \over k} \qquad(exMean)</math>
   <math>
   S_k = S_{k-1} + (x_k - M_{k-1})\times(x_k - M_k) \qquad(exSumSqDev)</math>
  • We keep track of whether all exemplars in this leaf have the same domain. If they do, we know that comparisons on one exemplar apply to all exemplars in that leaf.
// These aren't initialised until add() is called.
private double[] exMean = null, exSumSqDev = null;

// Optimisation when sub-tree contains only duplicates
private boolean singularity = true;

// Optimisation for searches. This lets us skip a node if its
// scope intersects with a search hypersphere but it doesn't contain
// any points that actually intersect.
private double[] contentMax = null, contentMin = null;

private int
dimensions() { return contentMax.length; }

// Addition

// Adds an exemplar without splitting overflowing leaves.
// Returns leaf to which exemplar was added.
private static <X extends Exemplar> OptKdTree<X>
addNoSplit(OptKdTree<X> tree, X ex) {
  // Some spurious function calls. Optimised for readability over
  // efficiency.
  OptKdTree<X> cursor = tree;
  while (cursor != null) {
    updateBounds(cursor, ex);
    if (cursor.isTree()) {
      // Sub-tree
      cursor = ex.domain[cursor.splitDim] <= cursor.split
        ? cursor.left : cursor.right;
    } else {
      // Leaf

      // Add exemplar to leaf
      cursor.data.add(ex);

      // Calculate running mean and sum of squared deviations
      final int nExs = cursor.data.size();
      final int dims = cursor.dimensions();
      if (nExs == 1) {
        cursor.exMean = Arrays.copyOf(ex.domain, dims);
        cursor.exSumSqDev = new double[dims];
      } else {
        for(int d = 0; d < dims; d++) {
          final double coord = ex.domain[d];
          final double oldMean = cursor.exMean[d], newMean;
          cursor.exMean[d] = newMean =
            oldMean + (coord - oldMean)/nExs;
          cursor.exSumSqDev[d] = cursor.exSumSqDev[d]
            + (coord - oldMean)*(coord - newMean);
        }
      }

      // Check that data are still uniform
      if (cursor.singularity) {
        final Queue<X> cExs = cursor.data;
        if (cExs.size() > 0 && !ex.collocated(cExs.peek()))
          cursor.singularity = false;
      }

      // Finished walking
      return cursor;
    }
  }
  throw new RuntimeException("Walked tree without adding anything");
}

To update the bounding hyperrect for the contents of a tree, we iterate through each dimension, extending (if necessary) the bounds in that dimension to contain the new exemplar.

private static <T extends Exemplar> void
updateBounds(OptKdTree<T> tree, Exemplar ex) {
  final int dims = ex.domain.length;
  if (tree.contentMax == null) {
    tree.contentMax = Arrays.copyOf(ex.domain, dims);
    tree.contentMin = Arrays.copyOf(ex.domain, dims);
  } else {
    for(int d = 0; d < dims; d++) {
      final double dimVal = ex.domain[d];
      if (dimVal > tree.contentMax[d])
        tree.contentMax[d] = dimVal;
      else if (dimVal < tree.contentMin[d])
        tree.contentMin[d] = dimVal;
    }
  }
}

Splitting

We only split when a leaf's exemplars exceed its bucket size. It is only worth splitting if the exemplars don't all have the same domain.

private static <T extends Exemplar> boolean
shouldSplit(OptKdTree<T> tree) {
  return tree.data.size() > tree.bucketSize
    && !tree.singularity;
}

Thanks to our pre-computation, splitting is usually straight-forward. We iterate through each dimension to find the one with the largest variance (skip the unnecessary division), then we can directly look up the mean of that dimension. This strategy might not successfully split the tree if exemplars are so close together that rounding errors in computing the mean result in the mean not lying strictly between the exemplars in a leaf. When this happens, we resort to repeatedly trying a dimension and an exemplar at random for our splitting point. We only call split if the leaf is not a singularity, so eventually we will find a point that does divide our exemplars.

To make sure the point actually does divide our data, we separate our data into two lists destined for each sub-tree. Once we have found a successful split point, we bulk load our sub-trees, store information about our split point and let go of the exemplars stored in the tree.

// Split properties. Not initialised until split occurs.
private int splitDim = 0;
private double split = Double.NaN;

@SuppressWarnings("unchecked") private static <T extends Exemplar> void
split(OptKdTree<T> tree) {
  assert !tree.singularity;
  // Find dimension with largest variance to split on
  double largestVar = -1;
  int splitDim = 0;
  for(int d = 0; d < tree.dimensions(); d++) {
    // Don't need to divide by number of data to find largest
    // variance
    final double var = tree.exSumSqDev[d];
    if (var > largestVar) {
      largestVar = var;
      splitDim = d;
    }
  }

  // Find mean as position for our split
  double splitValue = tree.exMean[splitDim];

  // Check that our split actually splits our data. This also lets
  // us bulk load data into sub-trees, which is more likely
  // to keep optimal balance.
  final Queue<T> leftExs = new LinkedList<T>();
  final Queue<T> rightExs = new LinkedList<T>();
  for(T s : tree.data) {
    if (s.domain[splitDim] <= splitValue)
      leftExs.add(s);
    else
      rightExs.add(s);
  }
  int leftSize = leftExs.size();
  final int treeSize = tree.data.size();
  if (leftSize == treeSize || leftSize == 0) {
    System.err.println(
      "WARNING: Randomly splitting non-uniform tree");
    // We know the data aren't all the same, so try picking
    // an exemplar and a dimension at random for our split point

    // This might take several tries, so we copy our data to
    // an array to speed up process of picking a random point
    Object[] exs = tree.data.toArray();
    while (leftSize == treeSize || leftSize == 0) {
      leftExs.clear();
      rightExs.clear();

      splitDim = (int)
        Math.floor(Math.random()*tree.dimensions());
      final int splitPtIdx = (int)
        Math.floor(Math.random()*exs.length);
      // Cast is inevitable consequence of java's inability to
      // create a generic array
      splitValue = ((T)exs[splitPtIdx]).domain[splitDim];
      for(T s : tree.data) {
        if (s.domain[splitDim] <= splitValue)
          leftExs.add(s);
        else
          rightExs.add(s);
      }
      leftSize = leftExs.size();
    }
  }

  // We have found a valid split. Start building our sub-trees
  final OptKdTree<T> left = new OptKdTree<T>(tree.bucketSize);
  final OptKdTree<T> right = new OptKdTree<T>(tree.bucketSize);
  left.addAll(leftExs);
  right.addAll(rightExs);

  // Finally, commit the split
  tree.splitDim = splitDim;
  tree.split = splitValue;
  tree.left = left;
  tree.right = right;

  // Let go of data (and their running stats) held in this leaf
  tree.data.clear();
  tree.exMean = tree.exSumSqDev = null;
}

Distance calculations

Before we can start searching, we need to define our distance metric. The distance calculation has already been introduced before.

Squared Euclidean distance:
<math>distance_{x\rightarrow y}^2 = \sum (x[d] - y[d])^2</math>

This will be called for every point examined, so we should optimise it well.

private static double
distanceSqFrom(double[] p1, double[] p2) {
  double dSq = 0;
  for(int d = 0; d < p1.length; d++) {
    final double dst = p1[d] - p2[d];
    if (dst != 0)
      dSq += dst*dst;
  }
  return dSq;
}

We will also need to find the minimum distance from a hyperrect to a query point. This will let us rule out trees whose content hyperrects are outside our search sphere. To find the minimum distance between a hyperrect and a given point, we need first find the closest point along bounds of that hyperrect to the point. This is easy to do, even if not immediately obvious.

Along each dimension,
<math>nearestPoint[d]=\begin{cases}

min[d] & target[d]\le min[d]\\ target[d] & min<target[d]<max[d]\\ max[d] & target[d]\ge max[d]\end{cases}</math>

After that, we just calculate the distance from our query point to this nearest point. The code below has been optimised because it gets called a lot during a search.

// Gets distance from target of nearest point on hyper-rect defined
// by supplied min and max bounds
private static double
minDistanceSqFrom(double[] target, double[] min, double[] max) {
  double distanceSq = 0;
  for(int d = 0; d < target.length; d++) {
    if (target[d] < min[d]) {
      final double dst = min[d] - target[d];
      distanceSq += dst*dst;
    } else if (target[d] > max[d]) {
      final double dst = max[d] - target[d];
      distanceSq += dst*dst;
    }
  }
  return distanceSq;
}

Searching

Before we dive into searching, we will introduce two helper classes.

  • SearchStackEntry stores a tree along any data which will be useful to keep on the search stack. Currently, this additional information consists only of whether a bounds check should be done. This lets us skip a redundant bounds check on the root of the tree.
//
// class SearchStackEntry
//

private static class SearchStackEntry<T extends Exemplar> {
  public final boolean needBoundsCheck;
  public final OptKdTree<T> tree;

  public SearchStackEntry(boolean needBoundsCheck, OptKdTree<T> tree) {
    this.needBoundsCheck = needBoundsCheck;
    this.tree = tree;
  }
}
  • SearchState stores the current state of the search. Its sole purpose is to let us separate searching sub-trees and leaves into different methods. This breaks up a long method and makes profiling easier.
//
// class SearchState
//
// Holds data about current state of the search. Used for live updating
// of pruning distance.

private static class SearchState<X extends Exemplar> {
  final int nResults;
  final PriorityQueue<PrioNode<X>> results;

  public SearchState(int nResults) {
    this.nResults = nResults;
    results = new PriorityQueue<PrioNode<X>>(nResults,
      new Comparator<PrioNode<X>>() {

        public int
        compare(PrioNode<X> o1, PrioNode<X> o2) {
          return o1.priority == o2.priority ? 0
            : o1.priority > o2.priority ? -1 : 1;
        }

      });
  }
}

Our search method does a depth-first search using a stack. Each time we pop our stack, we do a bounds check if necessary, then defer to the appropriate method.

public Iterable<PrioNode<X>>
search(double[] query, int nResults) {
  // Forward to a static method to avoid accidental reference to
  // instance variables while descending the tree
  return search(this, query, nResults);
}

private static <X extends Exemplar> Iterable<PrioNode<X>>
search(OptKdTree<X> tree, double[] query, int nResults) {
  final SearchState<X> state = new SearchState<X>(nResults);
  final Deque<SearchStackEntry<X>> stack =
    new LinkedList<SearchStackEntry<X>>();
  if (tree.contentMin != null)
    stack.addLast(new SearchStackEntry<X>(false, tree));
  while (!stack.isEmpty()) {
    final SearchStackEntry<X> entry = stack.removeLast();
    final OptKdTree<X> cur = entry.tree;

    if (entry.needBoundsCheck && state.results.size() >= nResults) {
      final double d = minDistanceSqFrom(query,
        cur.contentMin, cur.contentMax);
      if (d > state.results.lastKey())
        continue;
    }

    if (cur.isTree()) {
      searchTree(query, cur, stack, state.results.peek().priority);
    } else {
      searchLeaf(query, cur, state);
    }
  }

  return state.results;
}

Searching a Tree

We basically want to add both sub-trees to the stack, if they are not empty. As a heuristic, we think the sub-tree on the same side of the split as our query point is more likely to contain closer points, so we search that sub-tree first by adding it to the stack last. You might try to avoid a distance calculation by not requiring a bounds check for the heuristically nearer sub-tree, but that does not seem to significantly improve performance.

private static <X extends Exemplar> void
searchTree(double[] query, OptKdTree<X> tree,
  Deque<SearchStackEntry<X>> stack, double maxPriority)
{
  OptKdTree<X> nearTree = tree.left, farTree = tree.right;
  if (query[tree.splitDim] > tree.split) {
    nearTree = tree.right;
    farTree = tree.left;
  }

  // These variables let us skip empty sub-trees
  boolean nearEmpty = nearTree.contentMin == null;
  boolean farEmpty = farTree.contentMin == null;

  // Add nearest sub-tree to stack later so we descend it
  // first. This is likely to constrict our max distance
  // sooner, resulting in less visited nodes
  if (!farEmpty && minDistanceSqFrom(query,farTree.contentMin,farTree.contentMax) < maxPriority) {
    stack.addLast(new SearchStackEntry<X>(true, farTree));
  }

  if (!nearEmpty && minDistanceSqFrom(query,nearTree.contentMin,nearTree.contentMax) < maxPriority) {
    stack.addLast(new SearchStackEntry<X>(true, nearTree));
  }
}

Searching a Leaf

By the time we reach a leaf, there is little left to optimise away. We must go through each exemplar in turn to see if it is a nearest neighbour to our query point.

private static <X extends Exemplar> void
searchLeaf(double[] query, OptKdTree<X> leaf, SearchState<X> state) {
  double exD = Double.NaN;
  for(X ex : leaf.data) {
    if (!leaf.singularity || Double.isNaN(exD)) {
      exD = distanceSqFrom(query, ex.domain);
    }

    if (examine(exD, state)) {
      while (state.results.size() >= state.nResults) {
        state.results.poll();
      }

      state.results.offer(new PrioNode<X>(exD, ex));
    }
  }
}

private static <X extends Exemplar> boolean
examine(double distance, SearchState<X> state) {
  return state.results.size() < state.nResults
    || distance < state.results.peek().priority;
}

Evaluation

From the same benchmark run, our optimised tree obtains the following results:

RESULT << k-nearest neighbours search with Voidious' Linear search >>
: Average searching time       = 0.5740 miliseconds
: Average worst searching time = 2.3482 miliseconds
: Average adding time          = 0.4528 microseconds
: Accuracy                     = 100.0%

[...]

RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
: Average searching time       = 0.0378 miliseconds
: Average worst searching time = 0.2625 miliseconds
: Average adding time          = 1.5687 microseconds
: Accuracy                     = 100.0%

[...]

RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
: Average searching time       = 0.3939 miliseconds
: Average worst searching time = 11.5167 miliseconds
: Average adding time          = 0.5165 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with duyn's optimised kd-tree >>
: Average searching time       = 0.0536 miliseconds
: Average worst searching time = 1.6742 miliseconds
: Average adding time          = 2.9380 microseconds
: Accuracy                     = 100.0%

[...]

BEST RESULT: 
 - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
[...]
 - #4 duyn's optimised kd-tree [0.0536]
[...]
 - #8 duyn's basic kd-tree [0.3939]
 - #9 Voidious' Linear search [0.5740]

This is promising. We are a whole order of magnitude faster and finally in the same league as other kd-tree implementations on this wiki. We have not changed our API, so the following shim is only for completeness:

  • DuynOptKNNSearch.java
public class DuynOptKNNSearch extends KNNImplementation {
  final OptKdTree<StringExemplar> tree;
  public DuynOptKNNSearch(final int dimension) {
    super(dimension);
    tree = new OptKdTree<StringExemplar>();
  }

  @Override public void
  addPoint(final double[] location, final String value) {
    tree.add(new StringExemplar(location, value));
  }

  @Override public String
  getName() {
    return "duyn's optimised kd-tree";
  }

  @Override
  public KNNPoint[]
  getNearestNeighbors(final double[] location, final int size) {
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
    for(PrioNode<StringExemplar> sr : tree.search(location, size)) {
      final double distance = sr.priority;
      final StringExemplar pt = sr.data;
      justPoints.add(new KNNPoint(pt.value, distance));
    }
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
    return justPoints.toArray(retVal);
  }

  class StringExemplar extends Exemplar {
    public final String value;
    public StringExemplar(final double[] coords, final String value) {
      super(coords);
      this.value = value;
    }
  }
}

A Better Results Heap

A TreeMap does unnecessary processing, keeping all its keys in order when all we really care about is the farthest tentative result we have found. By implementing a custom heap, we can squeeze even more performance out of our tree.

In this section, we will implement an implicit binary heap. This structure does not have the best theoretical performance, but often does well in practice. More theoretically efficient heaps often rely on batching inserts, only fully processing them when a the top element is removed. Since we almost do one removal for every insert, this does not offer much benefit to us. For our usage pattern, the only heap faster than a binary heap is a buggy heap.

Implicit d-ary Heap

Implicit binary heaps can be very memory efficient since all elements are stored in an array. They don't grow as easily as other heaps, but for our purposes that is not a problem—we can set a max on the number of results we are interested in. A good description of implicit heaps is provided by John Bentley in Programming Pearls: Thanks, Heaps.

Below, we implement a d-ary heap. The math is similar to a binary heap, and there are some suggestions that a 4-way heap may be more efficient than a binary heap. Our heap can be used as a min- or a max-heap. Using the enum Direction ensures the user doesn't get this parameter wrong.

// Final for performance, may be removed if sub-classing necessary
public final class FastBinaryHeap<T> implements Iterable<PrioNode<T>> {
  private final int branches;
  // Direction = -1 or 1
  // root * direction > element * direction
  private final int direction;
  public static enum Direction {
    MIN(-1), MAX(1);

    public final int value;
    private Direction(int value) {
      this.value = value;
    }
  }
  public static final Direction MAX = Direction.MAX, MIN = Direction.MIN;

  private final PrioNode<T>[] data;
  private int size = 0;
  private static final int DEFAULT_N_CHILDREN = 4;

  public FastBinaryHeap(int capacity, Direction direction) {
    this(capacity, DEFAULT_N_CHILDREN, direction);
  }

  @SuppressWarnings("unchecked")
  public FastBinaryHeap(int capacity, int branches, Direction dir) {
    this.branches = branches;
    // This implicit conversion is safe because we only refer to
    // this array through our data member.
    this.data = new PrioNode[capacity];
    this.direction = dir.value;
  }
}

We implement an interface similar to the standard PriorityQueue:

public Iterator<PrioNode<T>>
iterator() {
  return new Iterator<PrioNode<T>>() {

    int idx = 0;

    public boolean
    hasNext() { return idx < size; }

    public PrioNode<T>
    next() { return data[idx++]; }

    public void
    remove() {
      throw new UnsupportedOperationException("Not supported.");
    }

  };
}

public int
size() { return size; }

public PrioNode<T>
peek() { return data[0]; }

Our offer() method takes a priority separately from the data we want to associate with it. We will store the priority separately so we don't fall into the same performance trap as PriorityQueue. We do not implement a poll() method because we don't need it. If we are about to overflow, we will just replace our top element with the new one.

public boolean
offer(double priority, T result) {
  if (size < data.length) {
    // Trivial case
    siftUp(size, new PrioNode<T>(priority, result));
    size++;
    return true;
  }

  if (priority*direction > data[0].priority*direction) {
    System.err.printf("Do not want %f when have %f\n",
      priority, data[0].priority);
    return false;
  }

  siftDown(0, new PrioNode<T>(priority, result));
  return true;
}

Now to the meat of our heap.

  • siftUp treats the given index as empty and progressively movies parents into the hole left behind if inserting the given element there would violate heap order.
Parents are calculated as follows:
<math>parent[i] = \left\lfloor {i - 1\over branches} \right\rfloor</math>
Once we have that, we can compare the values at both indexes. Multiplying them by direction lets us use the same code for both a min- and a max-heap.
private void
siftUp(int hole, PrioNode<T> value) {
  if (size > 0) {
    final double vPD = value.priority*direction;
    while (hole > 0) {
      final int parent = (hole - 1)/branches;
      if (vPD > data[parent].priority*direction) {
        data[hole] = data[parent];
        hole = parent;
      } else {
        break;
      }
    }
  }

  data[hole] = value;
}
  • siftDown does the opposite, starting from the given index and moving elements up if inserting the new value into the hole would violate heap order. It swaps the element with its worst heap-offending child so that with a single swap, it can restore heap order to that level.
In a 0-based index, children are calculated as follows:
<math>\begin{align}child_{1}[i] & =branches\times i+1\\

\ldots\\ child_{branches}[i] & =branches\times i+branches\end{align}</math>

private void
siftDown(int hole, PrioNode<T> value) {
  if (size > 0) {
    // Push down along min path
    final double vPD = value.priority*direction;
    int firstChild;
    while ((firstChild = branches*hole + 1) < data.length) {

      // Find largest/smallest child
      double swapPD = data[firstChild].priority*direction;
      int swapChild = firstChild;
      for(int c = firstChild + 1;
        c < firstChild + branches && c < size; c++)
      {
        final double cPD = data[c].priority*direction;
        if (cPD > swapPD) {
          swapPD = cPD;
          swapChild = c;
        }
      }

      if (swapPD > vPD) {
        data[hole] = data[swapChild];
        hole = swapChild;
      } else {
        break;
      }
    }
  }

  data[hole] = value;
}

Both of these are slightly different from standard textbook presentations of the algorithm. The versions here do less swaps by shifting other elements out of the way until the new element's rightful place has been found.

Changes to our Tree

Because the only change we have made is to the method of collecting results, we will skim over the changes necessary.

public final class FastKdTree<X extends Exemplar> {
....
}
  • Our SearchState will have to be updated to use our new heap:
//
// class SearchState
//
// Holds data about current state of the search. Used for live updating
// of pruning distance.

private static class SearchState<X extends Exemplar> {
  final int nResults;
  final FastBinaryHeap<X> results;

  public SearchState(int nResults) {
    this.nResults = nResults;
    results = new FastBinaryHeap<X>(
      nResults, 4, FastBinaryHeap.MAX);
  }
}
  • Our searchLeaf() method no longer needs to explicitly poll the top since our heap will just replace it if it has no space:
private static <X extends Exemplar> void
searchLeaf(double[] query, FastKdTree<X> leaf, SearchState<X> state) {
  double exD = Double.NaN;
  for(X ex : leaf.data) {
    if (!leaf.singularity || Double.isNaN(exD)) {
      exD = distanceSqFrom(query, ex.domain);
    }

    if (examine(exD, state)) {
      state.results.offer(exD, ex);
    }
  }
}
  • We have also replaced all instances of LinkedLists with ArrayDeques. This is because although theoretically LinkedLists are appropriate, the javadocs for ArrayDeque state that:
"This class is likely to be faster than Stack when used as a stack, and faster than LinkedList when used as a queue."—Java 6 API docs
public FastKdTree(int bucketSize) {
  this.bucketSize = bucketSize;
  this.data = new ArrayDeque<X>();
}

@SuppressWarnings("unchecked") private static <X extends Exemplar> void
split(FastKdTree<X> tree) {
  ...
  final Queue<X> leftExs = new ArrayDeque<X>();
  final Queue<X> rightExs = new ArrayDeque<X>();
  ...
}

private static <X extends Exemplar> Iterable<PrioNode<X>>
search(FastKdTree<X> tree, double[] query, int nResults) {
  final SearchState<X> state = new SearchState<X>(nResults);
  final Deque<SearchStackEntry<X>> stack =
    new ArrayDeque<SearchStackEntry<X>>();
  ...
}

Evaluation

Our improved heap squeezes a bit more performance (~10%) out of our tree:

RESULT << k-nearest neighbours search with Voidious' Linear search >>
: Average searching time       = 0.5740 miliseconds
: Average worst searching time = 2.3482 miliseconds
: Average adding time          = 0.4528 microseconds
: Accuracy                     = 100.0%

[...]

RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
: Average searching time       = 0.0378 miliseconds
: Average worst searching time = 0.2625 miliseconds
: Average adding time          = 1.5687 microseconds
: Accuracy                     = 100.0%

[...]

RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
: Average searching time       = 0.3939 miliseconds
: Average worst searching time = 11.5167 miliseconds
: Average adding time          = 0.5165 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with duyn's optimised kd-tree >>
: Average searching time       = 0.0536 miliseconds
: Average worst searching time = 1.6742 miliseconds
: Average adding time          = 2.9380 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with duyn's fast kd-tree >>
: Average searching time       = 0.0479 miliseconds
: Average worst searching time = 25.0995 miliseconds
: Average adding time          = 2.1288 microseconds
: Accuracy                     = 100.0%

[...]

BEST RESULT: 
 - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
[...]
 - #3 duyn's fast kd-tree [0.0479]
 - #4 duyn's optimised kd-tree [0.0536]
[...]
 - #8 duyn's basic kd-tree [0.3939]
 - #9 Voidious' Linear search [0.5740]

Our client code has barely changed:

  • DuynFastKNNSearch.java
public class DuynFastKNNSearch extends KNNImplementation {
  final FastKdTree<StringExemplar> tree;
  public DuynFastKNNSearch(final int dimension) {
    super(dimension);
    tree = new FastKdTree<StringExemplar>();
  }

  @Override public void
  addPoint(final double[] location, final String value) {
    tree.add(new StringExemplar(location, value));
  }

  @Override public String
  getName() {
    return "duyn's fast kd-tree";
  }

  @Override public KNNPoint[]
  getNearestNeighbors(final double[] location, final int size) {
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
    for(PrioNode<StringExemplar> sr : tree.search(location, size)) {
      final double distance = sr.priority;
      final StringExemplar pt = sr.data;
      justPoints.add(new KNNPoint(pt.value, distance));
    }
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
    return justPoints.toArray(retVal);
  }

  class StringExemplar extends Exemplar {
    public final String value;
    public StringExemplar(final double[] coords, final String value) {
      super(coords);
      this.value = value;
    }
  }
}

Full Benchmark Results

Below are the full benchmark results quoted throughout this page. It puts the performance of our trees in context compared to other kd-trees available on this wiki. The benchmark (k-NN algorithm benchmark) was run with the Diamond vs CunobelinDC gun data.

K-NEAREST NEIGHBOURS ALGORITHMS BENCHMARK
-----------------------------------------
Reading data from gun-data-Diamond-vs-jk.mini.CunobelinDC 0.3.csv.gz
Read 25621 saves and 10300 searches.

Running 30 repetition(s) for k-nearest neighbours searching:
:: 13 dimension(s); 40 neighbour(s)
Warming up the JIT with 5 repetitions first...

Running tests...test 1/30
[...]
 COMPLETED.

RESULT << k-nearest neighbours search with Voidious' Linear search >>
: Average searching time       = 0.5740 miliseconds
: Average worst searching time = 2.3482 miliseconds
: Average adding time          = 0.4528 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with Simonton's Bucket PR k-d tree >>
: Average searching time       = 0.1609 miliseconds
: Average worst searching time = 1.4184 miliseconds
: Average adding time          = 1.3322 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with Voidious' Bucket PR k-d tree >>
: Average searching time       = 0.2081 miliseconds
: Average worst searching time = 21.9191 miliseconds
: Average adding time          = 1.0301 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
: Average searching time       = 0.0378 miliseconds
: Average worst searching time = 0.2625 miliseconds
: Average adding time          = 1.5687 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with Rednaxela's kd-tree (3rd gen) >>
: Average searching time       = 0.0407 miliseconds
: Average worst searching time = 0.2259 miliseconds
: Average adding time          = 1.6342 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
: Average searching time       = 0.3939 miliseconds
: Average worst searching time = 11.5167 miliseconds
: Average adding time          = 0.5165 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with duyn's optimised kd-tree >>
: Average searching time       = 0.0536 miliseconds
: Average worst searching time = 1.6742 miliseconds
: Average adding time          = 2.9380 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with duyn's fast kd-tree >>
: Average searching time       = 0.0479 miliseconds
: Average worst searching time = 25.0995 miliseconds
: Average adding time          = 2.1288 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with Chase-san's kd-tree (KDTreeC) >>
: Average searching time       = 0.1200 miliseconds
: Average worst searching time = 19.5727 miliseconds
: Average adding time          = 1.4222 microseconds
: Accuracy                     = 100.0%


BEST RESULT: 
 - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
 - #2 Rednaxela's kd-tree (3rd gen) [0.0407]
 - #3 duyn's fast kd-tree [0.0479]
 - #4 duyn's optimised kd-tree [0.0536]
 - #5 Chase-san's kd-tree (KDTreeC) [0.1200]
 - #6 Simonton's Bucket PR k-d tree [0.1609]
 - #7 Voidious' Bucket PR k-d tree [0.2081]
 - #8 duyn's basic kd-tree [0.3939]
 - #9 Voidious' Linear search [0.5740]

Benchmark running time: 587.77 seconds

Full Source Code

The full source code for all three trees presented in this tutorial can be downloaded together: File:Duyn tutorial kd trees.jar. The code is available under the RWPCL.

Extensions

This tree does not include any support for deletion or re-balancing. You can get away with this if you plan to re-build the entire tree regularly.

Deletion support would be easy to add to this tree. Since all exemplars are stored in leaves at the bottom, we need only walk the tree and remove the exemplar from the leaf containing it. We would then merge sub-trees if multiple deletions has left one of them empty. Unless our bucket width is really small, we don't need to merge sub-trees after each deletion since the addition of a subsequent exemplar is unlikely to result in choosing a significantly different split point.

Re-balancing with this tree is trickier. It does not currently store depth, so we don't know when a re-balance is needed. Because exemplars are only stored in leaves, collecting all the exemplars under a sub-tree would require exhaustively walking that sub-tree.

  • If a depth is to be stored, each tree would need a pointer to its parent tree so we propogate depth updates when a leaf is split.
  • We can avoid an exhaustive walk by storing each exemplar in every tree on its path from root to leaf. We then can re-balance easily by dropping both sub-trees and splitting the given tree anew. This approach has the down side of making inserts and deletes more complex since we have to update the exemplar list of every tree along the path from root to leaf.

See also