Difference between revisions of "User:Duyn/kd-tree Tutorial"

From Robowiki
Jump to navigation Jump to search
(→‎Assessment: Added performance benchmarks)
(→‎Searching: Update method call for searchTree to include max distance, get rid of (unnecessary) spaghetti code)
 
(55 intermediate revisions by 2 users not shown)
Line 1: Line 1:
[this is the beginning of a tutorial on writing a k-d tree].
+
[Work-in-Progress]
  
This tutorial will walk you through the process of writing a k-d tree for k-nearest neighbour search. The tree here will hold multiple items in each leaf, splitting only when a leaf overflows. It will split on the mean of the dimension with the largest variance. There are already several k-d trees on this wiki, see some of the others for ideas.
+
This page will walk you through the implementation of a ''k''d-tree. We start with a basic tree and progressively add levels of optimisation. Writing a fully optimised ''k''d-tree is an advanced task, but starting with the basics helps. There are already other ''k''d-trees on this wiki, see some of the others for ideas.
  
A k-d tree is a binary tree which successively splits a ''k''-dimensional space in half. This lets us speed up a nearest neighbour search by not examining points in partitions which are too far away. A ''k''-nearest neighbour search can be implemented in from a nearest neighbour algorithm by not shrinking the search radius until ''k'' items have been found. The rest of this tutorial will refer to both as nearest neighbour queries.
+
==Theory==
 +
A k-d tree is a binary tree which successively splits a ''k''-dimensional space in half. This lets us speed up a nearest neighbour search by not examining points in partitions which are too far away. A ''k''-nearest neighbour search can be implemented from a nearest neighbour algorithm by not shrinking the search radius until ''k'' items have been found. This page won't make a distinction between the two.
  
 
==An Exemplar class==
 
==An Exemplar class==
We will call each item in our k-d tree an '''exemplar'''. Each exemplar has a '''domain'''—its spatial co-ordinates in the k-d tree's space. Each exemplar could also have an arbitrary payload, but our tree does not need to know about that. It will only handle storing exemplars based on their domain and returning them in a nearest neighbour search.
+
We will call each item in our k-d tree an '''exemplar'''. Each exemplar has a '''domain'''—its co-ordinates in the k-d tree's space. Each exemplar could also have an arbitrary payload, but our tree does not need to know about that. It will only handle storing exemplars based on their domain and returning them in a nearest neighbour search.
  
You might already have a class somewhere called Point which handles 2D co-ordinates. This terminology avoids conflict with that.
+
We might already have a class elsewere called Point which handles 2D co-ordinates. This terminology avoids conflict with that.
  
<pre>
+
public class Exemplar {
public class Exemplar {
+
  public final double[] domain;
  public final double[] domain;
+
 +
  public Exemplar(double[] domain) {
 +
    this.domain = domain;
 +
  }
 +
 +
  public final boolean
 +
  collocated(final Exemplar other) {
 +
    return Arrays.equals(domain, other.domain);
 +
  }
 +
}
  
  public Exemplar(final double[] coords) {
+
While this class is fully usable as is, rarely will the domain of nearest neighbours be of any interest. Often, useful data (such as [[GuessFactor]]s) will be loaded by sub-classing this Exemplar class. Our k-d tree will be parameterised based on this expectation.
    this.domain = coords;
 
  }
 
  
  // Short hand. Shorter than calling Arrays.equals() each time.
+
==Basic Tree==
  public boolean domainEquals(final Exemplar other) {
+
First, we will build a basic ''k''d-tree as described by Wikipedia's [http://en.wikipedia.org/w/index.php?title=Kd-tree&oldid=346156055 kd-tree] page. We start off with a standard binary tree. Each tree is either a '''node''' with both left and right sub-trees defined, or a '''leaf''' carrying an exemplar. For simplicity, we won't allow nodes to contain any exemplars.
    return Arrays.equals(domain, other.domain);
 
  }
 
}
 
</pre>
 
  
While this class is fully usable as is, rarely will you be interested in just the domain of nearest neighbours in a search. It is expected that specific data (eg. guess factors) will be loaded by sub-classing this Exemplar class. Our k-d tree will be parameterised based on this expectation.
+
public class BasicKdTree<X extends Exemplar> {
 +
  // Basic tree structure
 +
  X data = null;
 +
  BasicKdTree<X> left = null, right = null;
 +
 +
  // Only need to test one branch since we always populate both
 +
  // branches at once
 +
  private boolean
 +
  isTree() { return left != null; }
 +
 +
  ...
 +
}
  
==Basic Tree==
+
===Adding===
Here is a basic tree structure:
+
Each of the public API functions defers to a private static method. This is to avoid accidentally referring to instance variables while we walk the tree. This is a common pattern we will use for much of the actual behaviour code for this tree.
  
<pre>
+
public void
public class BucketKdTree<T extends Exemplar> {
+
add(X ex) {
  private List<T> exemplars = new LinkedList<T>();
+
  BasicKdTree.addToTree(this, ex);
  private BucketKdTree<T> left, right;
+
}
  private int bucketSize;
 
  private final int dimensions;
 
  
  public BucketKdTree(int bucketSize, int dimensions) {
+
To add an exemplar, we traverse the tree from top down until we find a leaf. If the leaf doesn't already have an exemplar, we can just stow it there. Otherwise, we split the leaf and put the free exemplars in each sub-tree.
    this.bucketSize = bucketSize;
 
    this.dimensions = dimensions;
 
  }
 
  
  private boolean
+
To split a leaf, we cycle through split dimensions in order as in most descriptions of ''k''d-trees. Because we only allow each leaf to hold a single exemplar, we can use a simple splitting strategy: smallest on the left, largest on the right. The split value is half way between the two points along the split dimension.
  isTree() { return left != null; }
+
}
+
int splitDim = 0;
</pre>
+
double split = Double.NaN;
 +
 +
private static <X extends Exemplar> void
 +
addToTree(BasicKdTree<X> tree, X ex) {
 +
  while(tree != null) {
 +
    if (tree.isTree()) {
 +
      // Traverse in search of a leaf
 +
      tree = ex.domain[tree.splitDim] <= tree.split
 +
        ? tree.left : tree.right;
 +
    } else {
 +
      if (tree.data == null) {
 +
        tree.data = ex;
 +
      } else {
 +
        // Split tree and add
 +
        // Find smallest exemplar to be our split point
 +
        final int d = tree.splitDim;
 +
        X leftX = ex, rightX = tree.data;
 +
        if (rightX.domain[d] < leftX.domain[d]) {
 +
          leftX = tree.data;
 +
          rightX = ex;
 +
        }
 +
        tree.split = 0.5*(leftX.domain[d] + rightX.domain[d]);
 +
       
 +
        final int nextSplitDim =
 +
          (tree.splitDim + 1)%tree.dimensions();
 +
       
 +
        tree.left = new BasicKdTree<X>();
 +
        tree.left.splitDim = nextSplitDim;
 +
        tree.left.data = leftX;
 +
       
 +
        tree.right = new BasicKdTree<X>();
 +
        tree.right.splitDim = nextSplitDim;
 +
        tree.right.data = rightX;
 +
      }
 +
 +
      // Done.
 +
      tree = null;
 +
    }
 +
  }
 +
}
 +
 +
private int
 +
dimensions() { return data.domain.length; }
  
Each tree is either a tree with both left and right sub-trees defined, or a leaf with exemplars filled. Because of our splitting algorithm, it is pointless to allow a tree to be both since the mean might not correspond with any actual exemplars.
+
===Searching===
 +
Before we start coding a search method, we need a helper class to store search results along with their distance from the query point. This is called <code>PrioNode</code> because we will eventually re-use it to implement a custom priority queue.
  
Bucket size and dimensions must be passed into the constructor. We could infer dimension from the dimension of the first point added, but this is simpler. Bucket size is not final because theoretically it could be varied, though our implementation will not.
+
public final class PrioNode<T> {
 +
  public final double priority;
 +
  public final T data;
 +
 +
  public PrioNode(double priority, T data) {
 +
    this.priority = priority;
 +
    this.data = data;
 +
  }
 +
}
  
==Adding==
+
Like with our <code>add()</code> method, our <code>search()</code> method delegates to a static method to avoid introducing bugs by accidentally referring to member variables while we descend the tree.
  
Each of the public API functions defers the actual addition to another private static function. This is to avoid accidentally referring to instance variables while we walk the tree. This is a common pattern we will use for much of the actual behaviour code for this tree.
+
public Iterable<? extends PrioNode<X>>
 +
search(double[] query, int nResults) {
 +
  return BasicKdTree.search(this, query, nResults);
 +
}
  
We decide whether to split a leaf only after the add has been completed.
+
To do a nearest neighbours search, we walk the tree, preferring to search sub-trees which are on the same side of the split as the query point first. Once we have found our initial candidates, we can contract our search sphere. We only search the other sub-tree if our search sphere might spill over onto the other side of the split.
  
<pre>
+
Results are collected in a <code>java.util.PriorityQueue</code> so we can easily remove the farthest exemplars as we find closer ones.
// One at a time
 
public void
 
add(T ex) {
 
  BucketKdTree<T> tree = addNoSplit(this, ex);
 
  if (shouldSplit(tree)) {
 
    split(tree);
 
  }
 
}
 
  
// Bulk add gives us more data to choose a better split point
+
private static <X extends Exemplar> Iterable<? extends PrioNode<X>>
public void
+
search(BasicKdTree<X> tree, double[] query, int nResults) {
addAll(Collection<T> exs) {
+
  final Queue<PrioNode<X>> results =
  // Some spurious function calls. Optimised for readability over
+
    new PriorityQueue<PrioNode<X>>(nResults,
  // efficiency.
+
      new Comparator<PrioNode<X>>() {
  final Set<BucketKdTree<T>> modTrees =
+
    new HashSet<BucketKdTree<T>>();
+
        // min-heap
  for(T ex : exs) {
+
        public int
    modTrees.add(addNoSplit(this, ex));
+
        compare(PrioNode<X> o1, PrioNode<X> o2) {
   }
+
          return o1.priority == o2.priority ? 0
 +
            : o1.priority > o2.priority ? -1
 +
            : 1;
 +
        }
 +
 +
      }
 +
    );
 +
  final Deque<BasicKdTree<X>> stack
 +
    = new LinkedList<BasicKdTree<X>>();
 +
  stack.addLast(tree);
 +
  while (!stack.isEmpty()) {
 +
    tree = stack.removeLast();
 +
   
 +
    if (tree.isTree()) {
 +
      // Guess nearest tree to query point
 +
      BasicKdTree<X> nearTree = tree.left, farTree = tree.right;
 +
      if (query[tree.splitDim] > tree.split) {
 +
        nearTree = tree.right;
 +
        farTree = tree.left;
 +
      }
 +
     
 +
      // Only search far tree if our search sphere might
 +
      // overlap with splitting plane
 +
      if (results.size() < nResults
 +
        || sq(query[tree.splitDim] - tree.split)
 +
          <= results.peek().priority)
 +
      {
 +
        stack.addLast(farTree);
 +
      }
 +
 +
      // Always search the nearest branch
 +
      stack.addLast(nearTree);
 +
    } else {
 +
      final double dSq = distanceSqFrom(query, tree.data.domain);
 +
      if (results.size() < nResults
 +
        || dSq < results.peek().priority)
 +
      {
 +
        while (results.size() >= nResults) {
 +
          results.poll();
 +
        }
 +
 +
        results.offer(new PrioNode<X>(dSq, tree.data));
 +
      }
 +
    }
 +
  }
 +
  return results;
 +
}
 +
    
 +
private static double
 +
sq(double n) { return n*n; }
  
  for(BucketKdTree<T> tree : modTrees) {
+
Our distance calculation is optimised because it will be called often. We use squared distances to avoid an unnecessary <code>sqrt()</code> operation. We also don't use <code>Math.pow()</code> because the JRE's default implementation must do some extra work before it can return with a simple multiplication (see the source in [http://www.netlib.org/fdlibm/e_pow.c fdlibm]).
    if (shouldSplit(tree)) {
 
      split(tree);
 
    }
 
  }
 
}
 
</pre>
 
  
To add an exemplar, we traverse the tree from top down until we find a leaf. Then we add the exemplar to the list at that leaf.
+
private static double
 +
distanceSqFrom(double[] p1, double[] p2) {
 +
  // Note: profiling shows this is called lots of times, so it pays
 +
  // to be well optimised
 +
  double dSq = 0;
 +
  for(int d = 0; d < p1.length; d++) {
 +
    final double dst = p1[d] - p2[d];
 +
    if (dst != 0)
 +
      dSq += dst*dst;
 +
  }
 +
  return dSq;
 +
}
  
To decide which sub-tree to traverse, each tree stores two values&mdash;a splitting dimension and a splitting value. If our new exemplar's domain along the splitting dimension is greater than the tree's splitting value, we put it in the right sub-tree. Otherwise, it goes in the left one.
+
And that's it. Barely 200 lines of code and we have ourselves a working ''k''d-tree.
  
Since adding takes little time compared to searching, we take this opportunity to make some optimisations:
+
===Evaluation===
 +
As a guide to performance, we will use the [[k-NN algorithm benchmark]] with the [http://homepages.ucalgary.ca/~agschult/robocode/gun-data-Diamond-vs-jk.mini.CunobelinDC%200.3.csv.gz Diamond vs CunobelinDC gun data]. For comparison, the benchmark included an optimised linear search to represent the lower bound of performance, and Rednaxela's tree&mdash;credited to be the fastest tree on the wiki. Full results are near the bottom of the page. Here are the relevant extracts for our tree:
  
* We keep track of the actual hyperrect the points in this tree occupy. This lets us rule out a tree, even though its space may intersect with our search sphere, if it doesn't actually contain any points within the hyperrect bounding our search sphere. This hyperrect is defined by <code>maxBounds</code> and <code>minBounds</code>.
+
RESULT << k-nearest neighbours search with Voidious' Linear search >>
 +
: Average searching time      = 0.5740 miliseconds
 +
: Average worst searching time = 2.3482 miliseconds
 +
: Average adding time          = 0.4528 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
 +
: Average searching time      = 0.0378 miliseconds
 +
: Average worst searching time = 0.2625 miliseconds
 +
: Average adding time          = 1.5687 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
 +
: Average searching time      = 0.3939 miliseconds
 +
: Average worst searching time = 11.5167 miliseconds
 +
: Average adding time          = 0.5165 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
BEST RESULT:
 +
  - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
 +
[...]
 +
  - #8 duyn's basic kd-tree [0.3939]
 +
  - #9 Voidious' Linear search [0.5740]
  
* To save us having to do a full iteration when we come to split a leaf, we compute the running mean and variance for each dimension using [http://www.johndcook.com/standard_deviation.html Welford's method]:
+
This test run showed that average add time was close to a linear search while searching performance improved by (0.3939/0.5740 - 1) ~= 31%. It's still an order of magnitude slower than Rednaxela's included tree, which shows there is a lot to gain by selecting appropriate optimisations.
    <math>
 
    M_1 = x_1,\qquad S_1 = 0</math>
 
    <math>
 
    M_k = M_{k-1} + {x_k - M_{k-1} \over k} \qquad(exMean)</math>
 
    <math>
 
    S_k = S_{k-1} + (x_k - M_{k-1})\times(x_k - M_k) \qquad(exSumSqDev)</math>
 
  
* We keep track of whether all exemplars in this leaf have the same domain. If they do, we know that comparisons on one exemplar apply to all exemplars in that leaf.
+
The following code was used in the benchmark:
  
The final adding code:
+
* KNNBenchmark.java
<pre>
 
private int splitDim;
 
private double split;
 
  
// These aren't initialised until add() is called.
+
public KNNBenchmark(final int dimension, final int numNeighbours, final SampleData[] samples, final int numReps) {
private double[] exMean;
+
  final Class<?>[] searchAlgorithms = new Class<?>[] {
private double[] exSumSqDev;
+
    FlatKNNSearch.class,
 +
    SimontonTreeKNNSearch.class,
 +
    NatTreeKNNSearch.class,
 +
    VoidiousTreeKNNSearch.class,
 +
+  DuynBasicKNNSearch.class,
 +
    RednaxelaTreeKNNSearch.class,
 +
  };
 +
  ...
 +
}
 +
 
 +
* DuynBasicKNNSearch.java
  
// Optimisation when tree contains large number of duplicates
+
public class DuynBasicKNNSearch extends KNNImplementation {
private boolean exemplarsAreUniform = true;
+
  final BasicKdTree<StringExemplar> tree;
 +
  public DuynBasicKNNSearch(final int dimension) {
 +
    super(dimension);
 +
    tree = new BasicKdTree<StringExemplar>();
 +
  }
 +
 +
  @Override public void
 +
  addPoint(final double[] location, final String value) {
 +
    tree.add(new StringExemplar(location, value));
 +
  }
 +
 +
  @Override public String
 +
  getName() {
 +
    return "duyn's basic kd-tree";
 +
  }
 +
 +
  @Override public KNNPoint[]
 +
  getNearestNeighbors(final double[] location, final int size) {
 +
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
 +
    for(PrioNode<StringExemplar> sr : tree.search(location, size)) {
 +
      final double distance = sr.priority;
 +
      final StringExemplar pt = sr.data;
 +
      justPoints.add(new KNNPoint(pt.value, distance));
 +
    }
 +
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
 +
    return justPoints.toArray(retVal);
 +
  }
 +
 +
  class StringExemplar extends Exemplar {
 +
    public final String value;
 +
    public StringExemplar(final double[] coords, final String value) {
 +
      super(coords);
 +
      this.value = value;
 +
    }
 +
  }
 +
}
  
// Optimisation for searches. This lets us skip a node if its
+
==Optimisations to the Tree==
// scope intersects with a search hypersphere but it doesn't contain
+
In this section, we will introduce the following optimisations to our tree:
// any points that actually intersect.
+
* '''Bucket leaves.''' Leaves will store multiple exemplars, only being split when they overflow. This makes our split points less dependent on the order in which we receive points and gives us more opportunity to choose a better split point.
private double[] maxBounds;
+
* '''Bounds-overlap-ball testing.''' Each tree stores the bounds of the smallest hyperrect which will contain all the data in that tree. This is smaller&mdash;maybe significantly so&mdash;than the bounds which that sub-tree occupy, giving us more opportunities to skip sub-trees. We will check that the minimum distance from each tree's content bounds overlaps with our search hypersphere. This lets us avoid walking all the way to a leaf if a tree's content bounds overlap with our search hypersphere but none of its children's content bounds do.
private double[] minBounds;
+
* '''Splitting choice.''' We will split on the mean of the dimension with the largest variance. Splitting on the mean does not guarantee that our tree will be balanced, but is easier for us since we would have already calculated the mean in finding the dimension with largest variance. This will add more overhead to our <code>add</code> times since we will compute running means and variances each time an exemplar is added.
 +
* '''Singularity tracking.''' We will actively check whether all the exemplars in a leaf have the same domain. This is necessary to avoid getting ourselves caught in an infinite loop trying to repeatedly split an unsplittable leaf. It also lets us avoid repeated distance calculations when searching for nearest neighbours. Practically, unless all your dimensions are limited, discrete values, this optimisation is unlikely to make much impact.
  
// Adds an exemplar without splitting overflowing leaves.
+
===Basic Tree===
// Returns leaf to which exemplar was added.
+
Once again, we start off with a basic binary tree. Each tree must store its own bucket size since it will not have a reference to its parent Bucket size is not static because we might want to build multiple trees with different bucket sizes.
private static <T extends Exemplar> BucketKdTree<T>
 
addNoSplit(BucketKdTree<T> tree, T ex) {
 
  // Some spurious function calls. Optimised for readability over
 
  // efficiency.
 
  BucketKdTree<T> cursor = tree;
 
  while (cursor != null) {
 
    updateBounds(cursor, ex);
 
    if (cursor.isTree()) {
 
      // Sub-tree
 
      cursor = ex.domain[cursor.splitDim] <= cursor.split
 
        ? cursor.left : cursor.right;
 
    } else {
 
      // Leaf
 
      cursor.exemplars.add(ex);
 
      final int nExs = cursor.exemplars.size();
 
      if (nExs == 1) {
 
        cursor.exMean =
 
          Arrays.copyOf(ex.domain, cursor.dimensions);
 
        cursor.exSumSqDev = new double[cursor.dimensions];
 
      } else {
 
        for(int d = 0; d < cursor.dimensions; d++) {
 
          final double coord = ex.domain[d];
 
  
          final double oldExMean = cursor.exMean[d];
+
public final class OptKdTree<X extends Exemplar> {
          final double newMean = cursor.exMean[d] =
+
  final Queue<X> data;
            oldExMean + (coord - oldExMean)/nExs;
+
  OptKdTree<X> left = null, right = null;
 +
 +
  // Number of exemplars to hold in a leaf before splitting
 +
  private final int bucketSize;
 +
  private static final int DEFAULT_BUCKET_SIZE = 10;
 +
 +
  public OptKdTree() {
 +
    this(DEFAULT_BUCKET_SIZE);
 +
  }
 +
 +
  public OptKdTree(int bucketSize) {
 +
    this.bucketSize = bucketSize;
 +
    this.data = new LinkedList<X>();
 +
  }
 +
 +
  private final boolean
 +
  isTree() { return left != null; }
 +
 +
  [...]
 +
}
  
          final double oldSumSqDev = cursor.exSumSqDev[d];
+
===Adding===
          cursor.exSumSqDev[d] = oldSumSqDev
+
Now that our code is a little more sophisticated, it becomes profitable to have a dedicated method to bulk-load our tree. This also makes it easier to add dynamic rebalance support later if we need it. To take advantage the additional information we get from bulk loading, we decide whether to split a leaf only after all exemplars have been added.
            + (coord - oldExMean)*(coord - newMean);
 
        }
 
      }
 
      if (cursor.exemplarsAreUniform) {
 
        final List<T> cExs = cursor.exemplars;
 
        if (cExs.size() > 0 && !ex.domainEquals(cExs.get(0)))
 
          cursor.exemplarsAreUniform = false;
 
      }
 
      return cursor;
 
    }
 
  }
 
  throw new RuntimeException("Walked tree without adding anything");
 
}
 
  
private static <T extends Exemplar> void
+
public void
updateBounds(BucketKdTree<T> tree, Exemplar ex) {
+
add(X ex) {
  final int dims = tree.dimensions;
+
  OptKdTree<X> tree = addNoSplit(this, ex);
  if (tree.maxBounds == null) {
+
  if (shouldSplit(tree)) {
    tree.maxBounds = Arrays.copyOf(ex.domain, dims);
+
    split(tree);
    tree.minBounds = Arrays.copyOf(ex.domain, dims);
+
  }
  } else {
+
}
    for(int d = 0; d < dims; d++) {
+
      final double dimVal = ex.domain[d];
+
public void
      if (dimVal > tree.maxBounds[d])
+
addAll(Collection<X> exs) {
        tree.maxBounds[d] = dimVal;
+
  final Set<OptKdTree<X>> modTrees =
      else if (dimVal < tree.minBounds[d])
+
    new HashSet<OptKdTree<X>>();
        tree.minBounds[d] = dimVal;
+
  for(X ex : exs) {
    }
+
    modTrees.add(addNoSplit(this, ex));
  }
+
  }
}
+
</pre>
+
  for(OptKdTree<X> tree : modTrees) {
 +
    if (shouldSplit(tree)) {
 +
      split(tree);
 +
    }
 +
  }
 +
}
  
==Splitting==
+
To add an exemplar, we traverse the tree from top down until we find a leaf. Then we add the exemplar to the list at that leaf. For now, there is no special ordering to the list of exemplars in a leaf. To decide which sub-tree to traverse, each tree stores two values&mdash;a splitting dimension and a splitting value. If our new exemplar's domain along the splitting dimension is greater than the tree's splitting value, we put it in the right sub-tree. Otherwise, it goes in the left one.
We only split when a leaf's exemplars exceed its bucket size. It is only worth splitting if the exemplars don't all have the same domain.
 
  
<pre>
+
Since adding takes little time compared to searching, we take this opportunity to make some optimisations:
private static <T extends Exemplar> boolean
 
shouldSplit(BucketKdTree<T> tree) {
 
  return tree.exemplars.size() > tree.bucketSize
 
    && !tree.exemplarsAreUniform;
 
}
 
</pre>
 
  
Thanks to our pre-computation, splitting is straight-forward&mdash;most of the time. We iterate through each dimension to find the one with the largest variance (skip the unnecessary division), then we can directly look up the mean of that dimension.
+
* We keep track of the actual hyperrect the points in this tree occupy. This lets us rule out a tree, even though its space may intersect with our search sphere, if it has no chance of containing any points within our search sphere. This hyperrect is fully defined by just storing its two extreme corners.
  
To make sure the point actually does divide our data, we separate our data into two lists destined for each sub-tree. If all the exemplars end up in only one of the lists, then our split point has failed to actually separate our exemplars. This is most likely due to rounding error when our exemplars are really close together. At a loss for what to do, we simply pick a random point and a random dimension until we find something that parts our exemplars. We know we must find one eventually because our exemplars are not uniform&mdash;at least one of them is smaller in at least one dimension than all the others.
+
* To save us having to do a full iteration when we come to split a leaf, we compute the running mean and variance for each dimension using [http://www.johndcook.com/standard_deviation.html Welford's method]:
 +
    <math>
 +
    M_1 = x_1,\qquad S_1 = 0</math>
 +
    <math>
 +
    M_k = M_{k-1} + {x_k - M_{k-1} \over k} \qquad(exMean)</math>
 +
    <math>
 +
    S_k = S_{k-1} + (x_k - M_{k-1})\times(x_k - M_k) \qquad(exSumSqDev)</math>
  
Finally, we bulk load our sub-trees, store information about our split point and let go of the exemplars stored in the tree.
+
* We keep track of whether all exemplars in this leaf have the same domain. If they do, we know that comparisons on one exemplar apply to all exemplars in that leaf.
  
<pre>
+
// These aren't initialised until add() is called.
@SuppressWarnings("unchecked") private static <T extends Exemplar> void
+
private double[] exMean = null, exSumSqDev = null;
split(BucketKdTree<T> tree) {
+
  assert !tree.exemplarsAreUniform;
+
// Optimisation when sub-tree contains only duplicates
  // Find dimension with largest variance to split on
+
private boolean singularity = true;
  double largestVar = -1;
+
  int splitDim = 0;
+
// Optimisation for searches. This lets us skip a node if its
  for(int d = 0; d < tree.dimensions; d++) {
+
// scope intersects with a search hypersphere but it doesn't contain
    final double var =
+
// any points that actually intersect.
      tree.exSumSqDev[d]/(tree.exemplars.size() - 1);
+
private double[] contentMax = null, contentMin = null;
    if (var > largestVar) {
+
      largestVar = var;
+
private int
      splitDim = d;
+
dimensions() { return contentMax.length; }
    }
+
  }
+
// Addition
 +
 +
// Adds an exemplar without splitting overflowing leaves.
 +
// Returns leaf to which exemplar was added.
 +
private static <X extends Exemplar> OptKdTree<X>
 +
addNoSplit(OptKdTree<X> tree, X ex) {
 +
  // Some spurious function calls. Optimised for readability over
 +
  // efficiency.
 +
  OptKdTree<X> cursor = tree;
 +
  while (cursor != null) {
 +
    updateBounds(cursor, ex);
 +
    if (cursor.isTree()) {
 +
      // Sub-tree
 +
      cursor = ex.domain[cursor.splitDim] <= cursor.split
 +
        ? cursor.left : cursor.right;
 +
    } else {
 +
      // Leaf
 +
 +
      // Add exemplar to leaf
 +
      cursor.data.add(ex);
 +
 +
      // Calculate running mean and sum of squared deviations
 +
      final int nExs = cursor.data.size();
 +
      final int dims = cursor.dimensions();
 +
      if (nExs == 1) {
 +
        cursor.exMean = Arrays.copyOf(ex.domain, dims);
 +
        cursor.exSumSqDev = new double[dims];
 +
      } else {
 +
        for(int d = 0; d < dims; d++) {
 +
          final double coord = ex.domain[d];
 +
          final double oldMean = cursor.exMean[d], newMean;
 +
          cursor.exMean[d] = newMean =
 +
            oldMean + (coord - oldMean)/nExs;
 +
          cursor.exSumSqDev[d] = cursor.exSumSqDev[d]
 +
            + (coord - oldMean)*(coord - newMean);
 +
        }
 +
      }
 +
 +
      // Check that data are still uniform
 +
      if (cursor.singularity) {
 +
        final Queue<X> cExs = cursor.data;
 +
        if (cExs.size() > 0 && !ex.collocated(cExs.peek()))
 +
          cursor.singularity = false;
 +
      }
 +
 +
      // Finished walking
 +
      return cursor;
 +
    }
 +
  }
 +
  throw new RuntimeException("Walked tree without adding anything");
 +
}
  
  // Find mean as position for our split
+
To update the bounding hyperrect for the contents of a tree, we iterate through each dimension, extending (if necessary) the bounds in that dimension to contain the new exemplar.
  double splitValue = tree.exMean[splitDim];
 
  
  // Check that our split actually splits our data. This also lets
+
private static <T extends Exemplar> void
  // us bulk load exemplars into sub-trees, which is more likely
+
updateBounds(OptKdTree<T> tree, Exemplar ex) {
  // to keep optimal balance.
+
  final int dims = ex.domain.length;
  final List<T> leftExs = new LinkedList<T>();
+
  if (tree.contentMax == null) {
  final List<T> rightExs = new LinkedList<T>();
+
    tree.contentMax = Arrays.copyOf(ex.domain, dims);
  for(T s : tree.exemplars) {
+
    tree.contentMin = Arrays.copyOf(ex.domain, dims);
    if (s.domain[splitDim] <= splitValue)
+
  } else {
      leftExs.add(s);
+
    for(int d = 0; d < dims; d++) {
    else
+
      final double dimVal = ex.domain[d];
      rightExs.add(s);
+
      if (dimVal > tree.contentMax[d])
  }
+
        tree.contentMax[d] = dimVal;
  int leftSize = leftExs.size();
+
      else if (dimVal < tree.contentMin[d])
  final int treeSize = tree.exemplars.size();
+
        tree.contentMin[d] = dimVal;
  if (leftSize == treeSize || leftSize == 0) {
+
    }
    System.err.println(
+
  }
      "WARNING: Randomly splitting non-uniform tree");
+
}
    // We know the exemplars aren't all the same, so try picking
 
    // an exemplar and a dimension at random for our split point
 
  
    // This might take several tries, so we copy our exemplars to
+
===Splitting===
    // an array to speed up process of picking a random point
+
We only split when a leaf's exemplars exceed its bucket size. It is only worth splitting if the exemplars don't all have the same domain.
    Object[] exs = tree.exemplars.toArray();
 
    while (leftSize == treeSize || leftSize == 0) {
 
      leftExs.clear();
 
      rightExs.clear();
 
  
      splitDim = (int)
+
private static <T extends Exemplar> boolean
        Math.floor(Math.random()*tree.dimensions);
+
shouldSplit(OptKdTree<T> tree) {
      final int splitPtIdx = (int)
+
  return tree.data.size() > tree.bucketSize
        Math.floor(Math.random()*exs.length);
+
    && !tree.singularity;
      // Cast is inevitable consequence of java's inability to
+
}
      // create a generic array
 
      splitValue = ((T)exs[splitPtIdx]).domain[splitDim];
 
      for(T s : tree.exemplars) {
 
        if (s.domain[splitDim] <= splitValue)
 
          leftExs.add(s);
 
        else
 
          rightExs.add(s);
 
      }
 
      leftSize = leftExs.size();
 
    }
 
  }
 
  
  // We have found a valid split. Start building our sub-trees
+
Thanks to our pre-computation, splitting is usually straight-forward. We iterate through each dimension to find the one with the largest variance (skip the unnecessary division), then we can directly look up the mean of that dimension. This strategy might not successfully split the tree if exemplars are so close together that rounding errors in computing the mean result in the mean not lying strictly between the exemplars in a leaf. When this happens, we resort to repeatedly trying a dimension and an exemplar at random for our splitting point. We only call <code>split</code> if the leaf is not a singularity, so eventually we will find a point that does divide our exemplars.
  final BucketKdTree<T> left =
 
    new BucketKdTree<T>(tree.bucketSize, tree.dimensions);
 
  final BucketKdTree<T> right =
 
    new BucketKdTree<T>(tree.bucketSize, tree.dimensions);
 
  left.addAll(leftExs);
 
  right.addAll(rightExs);
 
  
  // Finally, commit the split
+
To make sure the point actually does divide our data, we separate our data into two lists destined for each sub-tree. Once we have found a successful split point, we bulk load our sub-trees, store information about our split point and let go of the exemplars stored in the tree.
  tree.splitDim = splitDim;
 
  tree.split = splitValue;
 
  tree.left = left;
 
  tree.right = right;
 
  
  // Let go of exemplars (and their running stats) held in this leaf
+
// Split properties. Not initialised until split occurs.
  tree.exemplars = null;
+
private int splitDim = 0;
  tree.exMean = tree.exSumSqDev = null;
+
private double split = Double.NaN;
}
+
</pre>
+
@SuppressWarnings("unchecked") private static <T extends Exemplar> void
 +
split(OptKdTree<T> tree) {
 +
  assert !tree.singularity;
 +
  // Find dimension with largest variance to split on
 +
  double largestVar = -1;
 +
  int splitDim = 0;
 +
  for(int d = 0; d < tree.dimensions(); d++) {
 +
    // Don't need to divide by number of data to find largest
 +
    // variance
 +
    final double var = tree.exSumSqDev[d];
 +
    if (var > largestVar) {
 +
      largestVar = var;
 +
      splitDim = d;
 +
    }
 +
  }
 +
 +
  // Find mean as position for our split
 +
  double splitValue = tree.exMean[splitDim];
 +
 +
  // Check that our split actually splits our data. This also lets
 +
  // us bulk load data into sub-trees, which is more likely
 +
  // to keep optimal balance.
 +
  final Queue<T> leftExs = new LinkedList<T>();
 +
  final Queue<T> rightExs = new LinkedList<T>();
 +
  for(T s : tree.data) {
 +
    if (s.domain[splitDim] <= splitValue)
 +
      leftExs.add(s);
 +
    else
 +
      rightExs.add(s);
 +
  }
 +
  int leftSize = leftExs.size();
 +
  final int treeSize = tree.data.size();
 +
  if (leftSize == treeSize || leftSize == 0) {
 +
    System.err.println(
 +
      "WARNING: Randomly splitting non-uniform tree");
 +
    // We know the data aren't all the same, so try picking
 +
    // an exemplar and a dimension at random for our split point
 +
 +
    // This might take several tries, so we copy our data to
 +
    // an array to speed up process of picking a random point
 +
    Object[] exs = tree.data.toArray();
 +
    while (leftSize == treeSize || leftSize == 0) {
 +
      leftExs.clear();
 +
      rightExs.clear();
 +
 +
      splitDim = (int)
 +
        Math.floor(Math.random()*tree.dimensions());
 +
      final int splitPtIdx = (int)
 +
        Math.floor(Math.random()*exs.length);
 +
      // Cast is inevitable consequence of java's inability to
 +
      // create a generic array
 +
      splitValue = ((T)exs[splitPtIdx]).domain[splitDim];
 +
      for(T s : tree.data) {
 +
        if (s.domain[splitDim] <= splitValue)
 +
          leftExs.add(s);
 +
        else
 +
          rightExs.add(s);
 +
      }
 +
      leftSize = leftExs.size();
 +
    }
 +
  }
 +
 +
  // We have found a valid split. Start building our sub-trees
 +
  final OptKdTree<T> left = new OptKdTree<T>(tree.bucketSize);
 +
  final OptKdTree<T> right = new OptKdTree<T>(tree.bucketSize);
 +
  left.addAll(leftExs);
 +
  right.addAll(rightExs);
 +
 +
  // Finally, commit the split
 +
  tree.splitDim = splitDim;
 +
  tree.split = splitValue;
 +
  tree.left = left;
 +
  tree.right = right;
 +
 +
  // Let go of data (and their running stats) held in this leaf
 +
  tree.data.clear();
 +
  tree.exMean = tree.exSumSqDev = null;
 +
}
  
==Searching==
+
===Distance calculations===
Before we can start searching, we need to define our distance metric. This whole tutorial has been assuming euclidian distance will be used. Optimisations may have to be revised if you are using a different distance metric.
+
Before we can start searching, we need to define our distance metric. The distance calculation has already been introduced before.
  
<pre>
+
:Squared Euclidean distance:
// Gets distance from target of nearest point on hyper-rect defined
+
:<math>distance_{x\rightarrow y}^2 = \sum (x[d] - y[d])^2</math>
// by supplied min and max bounds
 
private static double
 
minDistanceSqFrom(double[] target, double[] min, double[] max) {
 
  // Note: profiling shows this is called lots of times, so it pays
 
  // to be well optimised
 
  double distanceSq = 0;
 
  for(int d = 0; d < target.length; d++) {
 
    final double coord = target[d];
 
    double nearCoord;
 
    if (((nearCoord = min[d]) > coord)
 
      || ((nearCoord = max[d]) < coord))
 
    {
 
      final double dst = nearCoord - coord;
 
      distanceSq += dst*dst;
 
    }
 
  }
 
  return distanceSq;
 
}
 
  
// Accessible to testing
+
This will be called for every point examined, so we should optimise it well.
static double
 
distanceSqFrom(double[] p1, double[] p2) {
 
  // Note: profiling shows this is called lots of times, so it pays
 
  // to be well optimised
 
  double dSq = 0;
 
  for(int d = 0; d < p1.length; d++) {
 
    final double dst = p1[d] - p2[d];
 
    if (dst != 0)
 
      dSq += dst*dst;
 
  }
 
  return dSq;
 
}
 
</pre>
 
  
Before we dive into searching, we will introduce two helper classes.
+
private static double
* <code>SearchStackEntry</code> stores a tree along with its minimum distance from the query point. We put these on our tree-walking stack instead of just trees so we don't have to re-compute this minimum distance when we next pop the tree.
+
distanceSqFrom(double[] p1, double[] p2) {
* <code>SearchState</code> holds some variables we want to be updated in after each nearest neighbour is added. Its sole purpose is to let us separate searching sub-trees and leaves into different methods without sacrificing performance.
+
  double dSq = 0;
 +
  for(int d = 0; d < p1.length; d++) {
 +
    final double dst = p1[d] - p2[d];
 +
    if (dst != 0)
 +
      dSq += dst*dst;
 +
  }
 +
  return dSq;
 +
}
  
<pre>
+
We will also need to find the minimum distance from a hyperrect to a query point. This will let us rule out trees whose content hyperrects are outside our search sphere. To find the minimum distance between a hyperrect and a given point, we need first find the closest point along bounds of that hyperrect to the point. This is easy to do, even if not immediately obvious.
  
//
+
:Along each dimension,
// class SearchStackEntry
+
:<math>nearestPoint[d]=\begin{cases}
//
+
min[d] & target[d]\le min[d]\\
// Stores a precomputed distance so we don't have to do it again
+
target[d] & min<target[d]<max[d]\\
// when we pop the tree off the search stack.
+
max[d] & target[d]\ge max[d]\end{cases}</math>
  
private static class SearchStackEntry<T extends Exemplar> {
+
After that, we just calculate the distance from our query point to this nearest point. The code below has been optimised because it gets called a lot during a search.
  public final double minDFromQ;
 
  public final BucketKdTree<T> tree;
 
  
  public SearchStackEntry(double minDFromQ, BucketKdTree<T> tree) {
+
// Gets distance from target of nearest point on hyper-rect defined
    this.minDFromQ = minDFromQ;
+
// by supplied min and max bounds
    this.tree = tree;
+
private static double
  }
+
minDistanceSqFrom(double[] target, double[] min, double[] max) {
}
+
  double distanceSq = 0;
 +
  for(int d = 0; d < target.length; d++) {
 +
    if (target[d] < min[d]) {
 +
      final double dst = min[d] - target[d];
 +
      distanceSq += dst*dst;
 +
    } else if (target[d] > max[d]) {
 +
      final double dst = max[d] - target[d];
 +
      distanceSq += dst*dst;
 +
    }
 +
  }
 +
  return distanceSq;
 +
}
  
//
+
===Searching===
// class SearchState
+
Before we dive into searching, we will introduce two helper classes.
//
+
* <code>SearchStackEntry</code> stores a tree along any data which will be useful to keep on the search stack. Currently, this additional information consists only of whether a bounds check should be done. This lets us skip a redundant bounds check on the root of the tree.
// Holds data about current state of the search. Used for live updating
 
// of pruning distance.
 
  
private static class SearchState {
+
//
  int nResults = 0;
+
// class SearchStackEntry
  double maxDistance = Double.POSITIVE_INFINITY;
+
//
  int nExsAtMaxD = 0;
+
}
+
private static class SearchStackEntry<T extends Exemplar> {
</pre>
+
  public final boolean needBoundsCheck;
 +
  public final OptKdTree<T> tree;
 +
 +
  public SearchStackEntry(boolean needBoundsCheck, OptKdTree<T> tree) {
 +
    this.needBoundsCheck = needBoundsCheck;
 +
    this.tree = tree;
 +
  }
 +
}
  
Our search method continuously pops our search stack, decides whether it's looking at a sub-tree or a leaf and passes the call on as appropriate.
+
* <code>SearchState</code> stores the current state of the search. Its sole purpose is to let us separate searching sub-trees and leaves into different methods. This breaks up a long method and makes profiling easier.
  
<pre>
+
//
public SortedMap<Double, List<T>>
+
// class SearchState
search(double[] query, int nMinResults) {
+
//
  // Forward to a static method to avoid accidental reference to
+
// Holds data about current state of the search. Used for live updating
  // instance variables while descending the tree
+
// of pruning distance.
  return search(this, query, nMinResults);
+
}
+
private static class SearchState<X extends Exemplar> {
 +
  final int nResults;
 +
  final PriorityQueue<PrioNode<X>> results;
 +
 +
  public SearchState(int nResults) {
 +
    this.nResults = nResults;
 +
    results = new PriorityQueue<PrioNode<X>>(nResults,
 +
      new Comparator<PrioNode<X>>() {
 +
 +
        public int
 +
        compare(PrioNode<X> o1, PrioNode<X> o2) {
 +
          return o1.priority == o2.priority ? 0
 +
            : o1.priority > o2.priority ? -1 : 1;
 +
        }
 +
 +
      });
 +
  }
 +
}
  
// May return more results than requested if multiple exemplars have
+
Our search method does a depth-first search using a stack. Each time we pop our stack, we do a bounds check if necessary, then defer to the appropriate method.
// same distance from target.
 
//
 
// Note: this function works with squared distances to avoid sqrt()
 
// operations
 
private static <T extends Exemplar> SortedMap<Double, List<T>>
 
search(BucketKdTree<T> tree, double[] query, int nMinResults) {
 
  // distance => list of points that distance away from query
 
  final NavigableMap<Double, List<T>> results =
 
    new TreeMap<Double, List<T>>();
 
  
  final SearchState state = new SearchState();
+
public Iterable<PrioNode<X>>
  final Deque<SearchStackEntry<T>> stack =
+
search(double[] query, int nResults) {
    new LinkedList<SearchStackEntry<T>>();
+
  // Forward to a static method to avoid accidental reference to
  stack.addFirst(new SearchStackEntry<T>(state.maxDistance, tree));
+
  // instance variables while descending the tree
  while (!stack.isEmpty()) {
+
  return search(this, query, nResults);
    final SearchStackEntry<T> entry = stack.removeFirst();
+
}
    final BucketKdTree<T> cur = entry.tree;
+
 +
private static <X extends Exemplar> Iterable<PrioNode<X>>
 +
search(OptKdTree<X> tree, double[] query, int nResults) {
 +
  final SearchState<X> state = new SearchState<X>(nResults);
 +
  final Deque<SearchStackEntry<X>> stack =
 +
    new LinkedList<SearchStackEntry<X>>();
 +
  if (tree.contentMin != null)
 +
    stack.addLast(new SearchStackEntry<X>(false, tree));
  
    if (cur.isTree()) {
+
  while (!stack.isEmpty()) {
      searchTree(query, nMinResults, cur, state, stack);
+
    final SearchStackEntry<X> entry = stack.removeLast();
    } else if (entry.minDFromQ <= state.maxDistance
+
    final OptKdTree<X> cur = entry.tree;
      || state.nResults < nMinResults)
+
    {
+
    if (entry.needBoundsCheck && state.results.size() >= nResults) {
      searchLeaf(query, nMinResults, cur, state, results);
+
      final double d = minDistanceSqFrom(query,
    }
+
        cur.contentMin, cur.contentMax);
  }
+
      if (d > state.results.lastKey())
 +
        continue;
 +
    }
 +
 +
    if (cur.isTree()) {
 +
      searchTree(query, cur, stack, state.results.peek().priority);
 +
    } else {
 +
      searchLeaf(query, cur, state);
 +
    }
 +
  }
 +
 +
  return state.results;
 +
}
  
  return results;
+
====Searching a Tree====
}
+
We basically want to add both sub-trees to the stack, if they are not empty. As a heuristic, we think the sub-tree on the same side of the split as our query point is more likely to contain closer points, so we search that sub-tree first by adding it to the stack last. You might try to avoid a distance calculation by not requiring a bounds check for the heuristically nearer sub-tree, but that does not seem to significantly improve performance.
</pre>
 
  
===Searching a sub-tree===
+
private static <X extends Exemplar> void
Here is where we see two of our optimisations come into play. For each non-empty sub-tree, we compute the minimum distance from the smallest hyperrect bounding that sub-tree's exemplars to our query point. If this distance exceeds our maximum distance, we can rule out the entire sub-tree. Of course, if we haven't found our k-potential nearest neighbours yet, then both sub-trees may contain potential nearest neighbours so we will add them to the search stack anyway.
+
searchTree(double[] query, OptKdTree<X> tree,
 +
  Deque<SearchStackEntry<X>> stack, double maxPriority)
 +
{
 +
  OptKdTree<X> nearTree = tree.left, farTree = tree.right;
 +
  if (query[tree.splitDim] > tree.split) {
 +
    nearTree = tree.right;
 +
    farTree = tree.left;
 +
  }
 +
 +
  // These variables let us skip empty sub-trees
 +
  boolean nearEmpty = nearTree.contentMin == null;
 +
  boolean farEmpty = farTree.contentMin == null;
 +
 +
  // Add nearest sub-tree to stack later so we descend it
 +
  // first. This is likely to constrict our max distance
 +
  // sooner, resulting in less visited nodes
 +
  if (!farEmpty && minDistanceSqFrom(query,farTree.contentMin,farTree.contentMax) < maxPriority) {
 +
    stack.addLast(new SearchStackEntry<X>(true, farTree));
 +
  }
 +
 +
  if (!nearEmpty && minDistanceSqFrom(query,nearTree.contentMin,nearTree.contentMax) < maxPriority) {
 +
    stack.addLast(new SearchStackEntry<X>(true, nearTree));
 +
  }
 +
}
  
As a heuristic, we want to search the sub-tree with an edge closest to the query point first. This is more likely to lead to earlier contraction of our search distance. This is why we add the farthest tree to our stack first.
+
====Searching a Leaf====
 +
By the time we reach a leaf, there is little left to optimise away. We must go through each exemplar in turn to see if it is a nearest neighbour to our query point.
  
<pre>
+
private static <X extends Exemplar> void
private static <T extends Exemplar> void
+
searchLeaf(double[] query, OptKdTree<X> leaf, SearchState<X> state) {
searchTree(double[] query, int nMinResults, BucketKdTree<T> tree,
+
  double exD = Double.NaN;
  SearchState searchState, Deque<SearchStackEntry<T>> stack)
+
  for(X ex : leaf.data) {
{
+
    if (!leaf.singularity || Double.isNaN(exD)) {
  // Left is presumed near. This is verified further down.
+
      exD = distanceSqFrom(query, ex.domain);
  BucketKdTree<T> nearTree = tree.left, farTree = tree.right;
+
    }
  // These variables let us skip empty sub-trees
+
  boolean nearEmpty = nearTree.minBounds == null;
+
    if (examine(exD, state)) {
  boolean farEmpty = farTree.minBounds == null;
+
      while (state.results.size() >= state.nResults) {
 +
        state.results.poll();
 +
      }
 +
 +
      state.results.offer(new PrioNode<X>(exD, ex));
 +
    }
 +
  }
 +
}
 +
 +
private static <X extends Exemplar> boolean
 +
examine(double distance, SearchState<X> state) {
 +
  return state.results.size() < state.nResults
 +
    || distance < state.results.peek().priority;
 +
}
  
  // Find distance from nearest possible point in each
+
===Evaluation===
  // sub-tree to query. If that is greater than max distance,
+
From the same benchmark run, our optimised tree obtains the following results:
  // we can rule out that sub-tree.
 
  double nearD = nearEmpty ? Double.POSITIVE_INFINITY
 
    : minDistanceSqFrom(query,
 
      nearTree.minBounds, nearTree.maxBounds);
 
  double farD = farEmpty ? Double.POSITIVE_INFINITY
 
    : minDistanceSqFrom(query,
 
      farTree.minBounds, farTree.maxBounds);
 
  
  // Swap near and far if they're incorrect
+
RESULT << k-nearest neighbours search with Voidious' Linear search >>
  if (farD < nearD) {
+
: Average searching time      = 0.5740 miliseconds
    final double tmpD = nearD;
+
: Average worst searching time = 2.3482 miliseconds
    final BucketKdTree<T> tmpTree = nearTree;
+
: Average adding time          = 0.4528 microseconds
    final boolean tmpEmpty = nearEmpty;
+
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
 +
: Average searching time      = 0.0378 miliseconds
 +
: Average worst searching time = 0.2625 miliseconds
 +
: Average adding time          = 1.5687 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
 +
: Average searching time      = 0.3939 miliseconds
 +
: Average worst searching time = 11.5167 miliseconds
 +
: Average adding time          = 0.5165 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with duyn's optimised kd-tree >>
 +
: Average searching time      = 0.0536 miliseconds
 +
: Average worst searching time = 1.6742 miliseconds
 +
: Average adding time          = 2.9380 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
BEST RESULT:
 +
  - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
 +
[...]
 +
  - #4 duyn's optimised kd-tree [0.0536]
 +
[...]
 +
  - #8 duyn's basic kd-tree [0.3939]
 +
  - #9 Voidious' Linear search [0.5740]
 +
 +
This is promising. We are a whole order of magnitude faster and finally in the same league as other ''k''d-tree implementations on this wiki. We have not changed our API, so the following shim is only for completeness:
  
    nearD = farD;
+
* DuynOptKNNSearch.java
    nearTree = farTree;
 
    nearEmpty = farEmpty;
 
  
    farD = tmpD;
+
public class DuynOptKNNSearch extends KNNImplementation {
    farTree = tmpTree;
+
  final OptKdTree<StringExemplar> tree;
    farEmpty = tmpEmpty;
+
  public DuynOptKNNSearch(final int dimension) {
  }
+
    super(dimension);
 +
    tree = new OptKdTree<StringExemplar>();
 +
  }
 +
 +
  @Override public void
 +
  addPoint(final double[] location, final String value) {
 +
    tree.add(new StringExemplar(location, value));
 +
  }
 +
 +
  @Override public String
 +
  getName() {
 +
    return "duyn's optimised kd-tree";
 +
  }
 +
 +
  @Override
 +
  public KNNPoint[]
 +
  getNearestNeighbors(final double[] location, final int size) {
 +
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
 +
    for(PrioNode<StringExemplar> sr : tree.search(location, size)) {
 +
      final double distance = sr.priority;
 +
      final StringExemplar pt = sr.data;
 +
      justPoints.add(new KNNPoint(pt.value, distance));
 +
    }
 +
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
 +
    return justPoints.toArray(retVal);
 +
  }
 +
 +
  class StringExemplar extends Exemplar {
 +
    public final String value;
 +
    public StringExemplar(final double[] coords, final String value) {
 +
      super(coords);
 +
      this.value = value;
 +
    }
 +
  }
 +
}
  
  // Add nearest sub-tree to stack later so we descend it
+
==A Better Results Heap==
  // first. This is likely to constrict our max distance
+
A TreeMap does unnecessary processing, keeping all its keys in order when all we really care about is the farthest tentative result we have found. By implementing a custom heap, we can squeeze even more performance out of our tree.
  // sooner, resulting in less visited nodes
 
  if (!farEmpty && (farD <= searchState.maxDistance
 
                    || searchState.nResults < nMinResults))
 
  {
 
    stack.addFirst(new SearchStackEntry<T>(farD, farTree));
 
  }
 
  
  if (!nearEmpty && (nearD <= searchState.maxDistance
+
In this section, we will implement an implicit binary heap. This structure does not have the best theoretical performance, but often does well in practice. More theoretically efficient heaps often rely on batching inserts, only fully processing them when a the top element is removed. Since we almost do one removal for every insert, this does not offer much benefit to us. For our usage pattern, the only heap faster than a binary heap is a buggy heap.
                    || searchState.nResults < nMinResults))
 
  {
 
    stack.addFirst(new SearchStackEntry<T>(nearD, nearTree));
 
  }
 
}
 
</pre>
 
  
===Searching a leaf===
+
===Implicit d-ary Heap===
When we reach a leaf, we finally have a chance to contract our search distance. If we haven't yet found our k-potential nearest neighbours yet, or if the exemplar we're checking lies on the edge of our max search distance, we blindly add exemplars and update our max search distance.
+
Implicit binary heaps can be very memory efficient since all elements are stored in an array. They don't grow as easily as other heaps, but for our purposes that is not a problem&mdash;we can set a max on the number of results we are interested in. A good description of implicit heaps is provided by John Bentley in [http://www.akira.ruc.dk/~keld/teaching/algoritmedesign_f04/Artikler/02/ Programming Pearls: Thanks, Heaps].
  
Otherwise, we check the exemplar's distance from our query point. If it's closer than our furthest potential match, we drop our furthest result (unless doing so would result in too little k-potential neighbours) to make room for our new neighbour. Finally, we update our max search distance so we can skip any trees or points outside our new search distance.
+
Below, we implement a d-ary heap. The math is similar to a binary heap, and there are some suggestions that a 4-way heap may be more efficient than a binary heap. Our heap can be used as a min- or a max-heap. Using the enum <code>Direction</code> ensures the user doesn't get this parameter wrong.
  
If all the exemplars in a leaf are uniform, we don't need to do this check for every exemplar. We just do it for the first one, and either add or reject them all.
+
// Final for performance, may be removed if sub-classing necessary
 +
public final class FastBinaryHeap<T> implements Iterable<PrioNode<T>> {
 +
  private final int branches;
 +
  // Direction = -1 or 1
 +
  // root * direction > element * direction
 +
  private final int direction;
 +
  public static enum Direction {
 +
    MIN(-1), MAX(1);
 +
 +
    public final int value;
 +
    private Direction(int value) {
 +
      this.value = value;
 +
    }
 +
  }
 +
  public static final Direction MAX = Direction.MAX, MIN = Direction.MIN;
 +
 +
  private final PrioNode<T>[] data;
 +
  private int size = 0;
 +
  private static final int DEFAULT_N_CHILDREN = 4;
 +
 +
  public FastBinaryHeap(int capacity, Direction direction) {
 +
    this(capacity, DEFAULT_N_CHILDREN, direction);
 +
  }
 +
 +
  @SuppressWarnings("unchecked")
 +
  public FastBinaryHeap(int capacity, int branches, Direction dir) {
 +
    this.branches = branches;
 +
    // This implicit conversion is safe because we only refer to
 +
    // this array through our data member.
 +
    this.data = new PrioNode[capacity];
 +
    this.direction = dir.value;
 +
  }
 +
}
  
<pre>
+
We implement an interface similar to the standard PriorityQueue:
private static <T extends Exemplar> void
 
searchLeaf(double[] query, int nMinResults,
 
  BucketKdTree<T> leaf, SearchState searchState,
 
  NavigableMap<Double, List<T>> results)
 
{
 
  // Keep track of elements at max distance so we know
 
  // whether we can just drop entire list of furthest
 
  // exemplars
 
  final int nMinResultsBeforeAddition = nMinResults - 1;
 
  for(T ex : leaf.exemplars) {
 
    final double exD = distanceSqFrom(query, ex.domain);
 
    if (searchState.nResults < nMinResults
 
      || exD == searchState.maxDistance)
 
    {
 
      // Blindly add this exemplar
 
      List<T> exsAtD = getOrElseInit(results, exD);
 
      if (leaf.exemplarsAreUniform) {
 
        // No need to go through every one if all
 
        // exemplars are the same
 
        exsAtD.addAll(leaf.exemplars);
 
        searchState.nResults += leaf.exemplars.size();
 
      } else {
 
        exsAtD.add(ex);
 
        searchState.nResults++;
 
      }
 
  
      // Update information about furthest exemplars
+
public Iterator<PrioNode<T>>
      if (exD > searchState.maxDistance
+
iterator() {
        || searchState.maxDistance == Double.POSITIVE_INFINITY)
+
  return new Iterator<PrioNode<T>>() {
      {
+
        searchState.maxDistance = exD;
+
    int idx = 0;
      }
+
 +
    public boolean
 +
    hasNext() { return idx < size; }
 +
 +
    public PrioNode<T>
 +
    next() { return data[idx++]; }
 +
 +
    public void
 +
    remove() {
 +
      throw new UnsupportedOperationException("Not supported.");
 +
    }
 +
 +
  };
 +
}
 +
 +
public int
 +
size() { return size; }
 +
 +
public PrioNode<T>
 +
peek() { return data[0]; }
 +
 +
Our offer() method takes a priority separately from the data we want to associate with it. We will store the priority separately so we don't fall into the same performance trap as PriorityQueue. We do not implement a poll() method because we don't need it. If we are about to overflow, we will just replace our top element with the new one.
  
      if (searchState.maxDistance == exD)
+
public boolean
        searchState.nExsAtMaxD = exsAtD.size();
+
offer(double priority, T result) {
 +
  if (size < data.length) {
 +
    // Trivial case
 +
    siftUp(size, new PrioNode<T>(priority, result));
 +
    size++;
 +
    return true;
 +
  }
 +
 +
  if (priority*direction > data[0].priority*direction) {
 +
    System.err.printf("Do not want %f when have %f\n",
 +
      priority, data[0].priority);
 +
    return false;
 +
  }
 +
 +
  siftDown(0, new PrioNode<T>(priority, result));
 +
  return true;
 +
}
  
    } else if (exD < searchState.maxDistance) {
+
Now to the meat of our heap.
      // Point closer than furthest neighbour
 
  
      if (searchState.nResults - searchState.nExsAtMaxD
+
* siftUp treats the given index as empty and progressively movies parents into the hole left behind if inserting the given element there would violate heap order.
        >= nMinResultsBeforeAddition)
+
: Parents are calculated as follows:
      {
+
: <math>parent[i] = \left\lfloor {i - 1\over branches} \right\rfloor</math>
        // Dropping furthest exemplars won't leave
 
        // us with too little to meet return
 
        // minimum
 
        results.remove(searchState.maxDistance);
 
        searchState.nResults -= searchState.nExsAtMaxD;
 
      }
 
  
      // Add new exemplar
+
: Once we have that, we can compare the values at both indexes. Multiplying them by <code>direction</code> lets us use the same code for both a min- and a max-heap.
      List<T> exsAtD = getOrElseInit(results, exD);
+
      if (leaf.exemplarsAreUniform) {
+
private void
        // No need to go through every one if all
+
siftUp(int hole, PrioNode<T> value) {
        // exemplars are the same
+
  if (size > 0) {
        exsAtD.addAll(leaf.exemplars);
+
    final double vPD = value.priority*direction;
        searchState.nResults += leaf.exemplars.size();
+
    while (hole > 0) {
      } else {
+
      final int parent = (hole - 1)/branches;
        exsAtD.add(ex);
+
      if (vPD > data[parent].priority*direction) {
        searchState.nResults++;
+
        data[hole] = data[parent];
      }
+
        hole = parent;
 +
      } else {
 +
        break;
 +
      }
 +
    }
 +
  }
 +
 +
  data[hole] = value;
 +
}
  
      // Update information about furthest exemplars
+
* siftDown does the opposite, starting from the given index and moving elements up if inserting the new value into the hole would violate heap order. It swaps the element with its worst heap-offending child so that with a single swap, it can restore heap order to that level.
      Map.Entry<Double, List<T>> lastEnt = results.lastEntry();
+
: In a 0-based index, children are calculated as follows:
      searchState.maxDistance = lastEnt.getKey();
+
: <math>\begin{align}child_{1}[i] & =branches\times i+1\\
      searchState.nExsAtMaxD = lastEnt.getValue().size();
+
\ldots\\
    } // exD < maxDistance
+
child_{branches}[i] & =branches\times i+branches\end{align}</math>
  
    // No need to keep going if all exemplars are
+
private void
    // the same
+
siftDown(int hole, PrioNode<T> value) {
    if (leaf.exemplarsAreUniform) break;
+
  if (size > 0) {
  } // for(T ex : cur.exemplars)
+
    // Push down along min path
}
+
    final double vPD = value.priority*direction;
 +
    int firstChild;
 +
    while ((firstChild = branches*hole + 1) < data.length) {
 +
 +
      // Find largest/smallest child
 +
      double swapPD = data[firstChild].priority*direction;
 +
      int swapChild = firstChild;
 +
      for(int c = firstChild + 1;
 +
        c < firstChild + branches && c < size; c++)
 +
      {
 +
        final double cPD = data[c].priority*direction;
 +
        if (cPD > swapPD) {
 +
          swapPD = cPD;
 +
          swapChild = c;
 +
        }
 +
      }
 +
 +
      if (swapPD > vPD) {
 +
        data[hole] = data[swapChild];
 +
        hole = swapChild;
 +
      } else {
 +
        break;
 +
      }
 +
    }
 +
  }
 +
 +
  data[hole] = value;
 +
}
  
private static <T extends Exemplar> List<T>
+
Both of these are slightly different from standard textbook presentations of the algorithm. The versions here do less swaps by shifting other elements out of the way until the new element's rightful place has been found.
getOrElseInit(Map<Double, List<T>> results, double dst) {
 
  List<T> lst = results.get(dst);
 
  if (lst == null) {
 
    lst = new LinkedList<T>();
 
    results.put(dst, lst);
 
  }
 
  return lst;
 
}
 
</pre>
 
  
==Assessment==
+
===Changes to our Tree===
Initial performance benchmarks are promising. Randomised JUnit tests indicate (not very rigorously) that searching can be 2&ndash;7x faster than a linear search. Using the [[k-NN algorithm benchmark]] with the [http://homepages.ucalgary.ca/~agschult/robocode/gun-data-Diamond-vs-jk.mini.CunobelinDC%200.3.csv.gz Diamond vs CunobelinDC gun data] yields the following performance:
+
Because the only change we have made is to the method of collecting results, we will skim over the changes necessary.
  
<pre>
+
public final class FastKdTree<X extends Exemplar> {
K-NEAREST NEIGHBOURS ALGORITHMS BENCHMARK
+
....
-----------------------------------------
+
}
Reading data from gun-data-Diamond-vs-jk.mini.CunobelinDC 0.3.csv.gz
 
Read 25621 saves and 10300 searches.
 
  
Running 30 repetition(s) for k-nearest neighbours searching:
+
* Our SearchState will have to be updated to use our new heap:
:: 13 dimension(s); 40 neighbour(s)
 
Warming up the JIT with 5 repetitions first...
 
  
Running tests... COMPLETED.
+
//
 +
// class SearchState
 +
//
 +
// Holds data about current state of the search. Used for live updating
 +
// of pruning distance.
 +
 +
private static class SearchState<X extends Exemplar> {
 +
  final int nResults;
 +
  final FastBinaryHeap<X> results;
 +
 +
  public SearchState(int nResults) {
 +
    this.nResults = nResults;
 +
    results = new FastBinaryHeap<X>(
 +
      nResults, 4, FastBinaryHeap.MAX);
 +
  }
 +
}
  
RESULT << k-nearest neighbours search with Voidious' Linear search >>
+
* Our searchLeaf() method no longer needs to explicitly poll the top since our heap will just replace it if it has no space:
: Average searching time      = 0.57 miliseconds
 
: Average worst searching time = 14.992 miliseconds
 
: Average adding time          = 0.55 microseconds
 
: Accuracy                    = 100%
 
  
RESULT << k-nearest neighbours search with Simonton's Bucket PR k-d tree >>
+
private static <X extends Exemplar> void
: Average searching time      = 0.16 miliseconds
+
searchLeaf(double[] query, FastKdTree<X> leaf, SearchState<X> state) {
: Average worst searching time = 20.109 miliseconds
+
  double exD = Double.NaN;
: Average adding time          = 1.41 microseconds
+
  for(X ex : leaf.data) {
: Accuracy                    = 100%
+
    if (!leaf.singularity || Double.isNaN(exD)) {
 +
      exD = distanceSqFrom(query, ex.domain);
 +
    }
 +
 +
    if (examine(exD, state)) {
 +
      state.results.offer(exD, ex);
 +
    }
 +
  }
 +
}
  
RESULT << k-nearest neighbours search with Nat's Bucket PR k-d tree >>
+
* We have also replaced all instances of LinkedLists with ArrayDeques. This is because although theoretically LinkedLists are appropriate, the javadocs for ArrayDeque state that:
: Average searching time      = 0.361 miliseconds
+
: "This class is likely to be faster than Stack when used as a stack, and faster than LinkedList  when used as a queue."&mdash;[http://java.sun.com/javase/6/docs/api/java/util/ArrayDeque.html Java 6 API docs]
: Average worst searching time = 304.663 miliseconds
 
: Average adding time          = 66.64 microseconds
 
: Accuracy                    = 31%
 
  
RESULT << k-nearest neighbours search with Voidious' Bucket PR k-d tree >>
+
public FastKdTree(int bucketSize) {
: Average searching time      = 0.216 miliseconds
+
  this.bucketSize = bucketSize;
: Average worst searching time = 100.556 miliseconds
+
  this.data = new ArrayDeque<X>();
: Average adding time          = 1.58 microseconds
+
}
: Accuracy                    = 100%
+
 +
@SuppressWarnings("unchecked") private static <X extends Exemplar> void
 +
split(FastKdTree<X> tree) {
 +
  ...
 +
  final Queue<X> leftExs = new ArrayDeque<X>();
 +
  final Queue<X> rightExs = new ArrayDeque<X>();
 +
  ...
 +
}
 +
 +
private static <X extends Exemplar> Iterable<PrioNode<X>>
 +
search(FastKdTree<X> tree, double[] query, int nResults) {
 +
  final SearchState<X> state = new SearchState<X>(nResults);
 +
  final Deque<SearchStackEntry<X>> stack =
 +
    new ArrayDeque<SearchStackEntry<X>>();
 +
  ...
 +
}
  
RESULT << k-nearest neighbours search with duyn's Bucket kd-tree >>
+
===Evaluation===
: Average searching time      = 0.079 miliseconds
+
Our improved heap squeezes a bit more performance (~10%) out of our tree:
: Average worst searching time = 17.476 miliseconds
 
: Average adding time          = 2.45 microseconds
 
: Accuracy                    = 100%
 
  
RESULT << k-nearest neighbours search with Rednaxela's Bucket kd-tree >>
+
RESULT << k-nearest neighbours search with Voidious' Linear search >>
: Average searching time      = 0.053 miliseconds
+
: Average searching time      = 0.5740 miliseconds
: Average worst searching time = 0.296 miliseconds
+
: Average worst searching time = 2.3482 miliseconds
: Average adding time          = 1.69 microseconds
+
: Average adding time          = 0.4528 microseconds
: Accuracy                    = 100%
+
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
 +
: Average searching time      = 0.0378 miliseconds
 +
: Average worst searching time = 0.2625 miliseconds
 +
: Average adding time          = 1.5687 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
 +
: Average searching time      = 0.3939 miliseconds
 +
: Average worst searching time = 11.5167 miliseconds
 +
: Average adding time          = 0.5165 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with duyn's optimised kd-tree >>
 +
: Average searching time      = 0.0536 miliseconds
 +
: Average worst searching time = 1.6742 miliseconds
 +
: Average adding time          = 2.9380 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with duyn's fast kd-tree >>
 +
: Average searching time      = 0.0479 miliseconds
 +
: Average worst searching time = 25.0995 miliseconds
 +
: Average adding time          = 2.1288 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
[...]
 +
 +
BEST RESULT:
 +
  - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
 +
[...]
 +
  - #3 duyn's fast kd-tree [0.0479]
 +
  - #4 duyn's optimised kd-tree [0.0536]
 +
[...]
 +
  - #8 duyn's basic kd-tree [0.3939]
 +
  - #9 Voidious' Linear search [0.5740]
  
 +
Our client code has barely changed:
  
BEST RESULT:
+
* DuynFastKNNSearch.java
  - #1 Rednaxela's Bucket kd-tree [0.0529]
+
  - #2 duyn's Bucket kd-tree [0.0787]
+
  public class DuynFastKNNSearch extends KNNImplementation {
- #3 Simonton's Bucket PR k-d tree [0.1598]
+
  final FastKdTree<StringExemplar> tree;
- #4 Voidious' Bucket PR k-d tree [0.2165]
+
  public DuynFastKNNSearch(final int dimension) {
  - #5 Nat's Bucket PR k-d tree [0.3605]
+
    super(dimension);
  - #6 Voidious' Linear search [0.5699]
+
    tree = new FastKdTree<StringExemplar>();
 +
  }
 +
 +
  @Override public void
 +
  addPoint(final double[] location, final String value) {
 +
    tree.add(new StringExemplar(location, value));
 +
  }
 +
   
 +
  @Override public String
 +
  getName() {
 +
    return "duyn's fast kd-tree";
 +
  }
 +
 +
  @Override public KNNPoint[]
 +
  getNearestNeighbors(final double[] location, final int size) {
 +
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
 +
    for(PrioNode<StringExemplar> sr : tree.search(location, size)) {
 +
      final double distance = sr.priority;
 +
      final StringExemplar pt = sr.data;
 +
      justPoints.add(new KNNPoint(pt.value, distance));
 +
    }
 +
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
 +
    return justPoints.toArray(retVal);
 +
  }
 +
   
 +
  class StringExemplar extends Exemplar {
 +
    public final String value;
 +
    public StringExemplar(final double[] coords, final String value) {
 +
      super(coords);
 +
      this.value = value;
 +
    }
 +
  }
 +
  }
  
Benchmark running time: 545.43 seconds
+
==Full Benchmark Results==
</pre>
+
Below are the full benchmark results quoted throughout this page. It puts the performance of our trees in context compared to other ''k''d-trees available on this wiki. The benchmark ([[k-NN algorithm benchmark]]) was run with the [http://homepages.ucalgary.ca/~agschult/robocode/gun-data-Diamond-vs-jk.mini.CunobelinDC%200.3.csv.gz Diamond vs CunobelinDC gun data].
  
This is not an exhaustive test, but it shows we're off to a nice start. The following code was used to run the benchmark:
+
K-NEAREST NEIGHBOURS ALGORITHMS BENCHMARK
 +
-----------------------------------------
 +
Reading data from gun-data-Diamond-vs-jk.mini.CunobelinDC 0.3.csv.gz
 +
Read 25621 saves and 10300 searches.
 +
 +
Running 30 repetition(s) for k-nearest neighbours searching:
 +
:: 13 dimension(s); 40 neighbour(s)
 +
Warming up the JIT with 5 repetitions first...
 +
 +
Running tests...test 1/30
 +
[...]
 +
  COMPLETED.
 +
 +
RESULT << k-nearest neighbours search with Voidious' Linear search >>
 +
: Average searching time      = 0.5740 miliseconds
 +
: Average worst searching time = 2.3482 miliseconds
 +
: Average adding time          = 0.4528 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with Simonton's Bucket PR k-d tree >>
 +
: Average searching time      = 0.1609 miliseconds
 +
: Average worst searching time = 1.4184 miliseconds
 +
: Average adding time          = 1.3322 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with Voidious' Bucket PR k-d tree >>
 +
: Average searching time      = 0.2081 miliseconds
 +
: Average worst searching time = 21.9191 miliseconds
 +
: Average adding time          = 1.0301 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
 +
: Average searching time      = 0.0378 miliseconds
 +
: Average worst searching time = 0.2625 miliseconds
 +
: Average adding time          = 1.5687 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with Rednaxela's kd-tree (3rd gen) >>
 +
: Average searching time      = 0.0407 miliseconds
 +
: Average worst searching time = 0.2259 miliseconds
 +
: Average adding time          = 1.6342 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
 +
: Average searching time      = 0.3939 miliseconds
 +
: Average worst searching time = 11.5167 miliseconds
 +
: Average adding time          = 0.5165 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with duyn's optimised kd-tree >>
 +
: Average searching time      = 0.0536 miliseconds
 +
: Average worst searching time = 1.6742 miliseconds
 +
: Average adding time          = 2.9380 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with duyn's fast kd-tree >>
 +
: Average searching time      = 0.0479 miliseconds
 +
: Average worst searching time = 25.0995 miliseconds
 +
: Average adding time          = 2.1288 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
RESULT << k-nearest neighbours search with Chase-san's kd-tree (KDTreeC) >>
 +
: Average searching time      = 0.1200 miliseconds
 +
: Average worst searching time = 19.5727 miliseconds
 +
: Average adding time          = 1.4222 microseconds
 +
: Accuracy                    = 100.0%
 +
 +
 +
BEST RESULT:
 +
  - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
 +
  - #2 Rednaxela's kd-tree (3rd gen) [0.0407]
 +
  - #3 duyn's fast kd-tree [0.0479]
 +
  - #4 duyn's optimised kd-tree [0.0536]
 +
  - #5 Chase-san's kd-tree (KDTreeC) [0.1200]
 +
  - #6 Simonton's Bucket PR k-d tree [0.1609]
 +
  - #7 Voidious' Bucket PR k-d tree [0.2081]
 +
  - #8 duyn's basic kd-tree [0.3939]
 +
  - #9 Voidious' Linear search [0.5740]
 +
 +
Benchmark running time: 587.77 seconds
  
* KNNBenchmark.java
+
==Full Source Code==
  <pre>
+
The full source code for all three trees presented in this tutorial can be downloaded together: [[File:duyn_tutorial_kd_trees.jar]]. The code is available under the [[RWPCL]].
public KNNBenchmark(final int dimension, final int numNeighbours, final SampleData[] samples, final int numReps) {
 
  final Class<?>[] searchAlgorithms = new Class<?>[] {
 
    FlatKNNSearch.class,
 
    SimontonTreeKNNSearch.class,
 
    NatTreeKNNSearch.class,
 
    VoidiousTreeKNNSearch.class,
 
+    DuynTreeImpl.class,
 
    RednaxelaTreeKNNSearch.class,
 
  };
 
  ...
 
}
 
  </pre>
 
  
* DuynTreeImpl.java
+
==Extensions==
  <pre>
+
This tree does not include any support for deletion or re-balancing. You can get away with this if you plan to re-build the entire tree regularly.
package net.robowiki.knn.implementations;
 
import net.robowiki.knn.util.KNNPoint;
 
import java.util.*;
 
// Namespace I use for my kd-tree
 
import dn.j1.algorithm.nearestneighbours.*;
 
  
public class DuynTreeImpl extends KNNImplementation {
+
'''Deletion''' support would be easy to add to this tree. Since all exemplars are stored in leaves at the bottom, we need only walk the tree and remove the exemplar from the leaf containing it. We would then merge sub-trees if multiple deletions has left one of them empty. Unless our bucket width is really small, we don't need to merge sub-trees after each deletion since the addition of a subsequent exemplar is unlikely to result in choosing a significantly different split point.
  final BucketKdTree<KNNPointAdapter> tree;
 
  public DuynTreeImpl(final int dimension) {
 
    super(dimension);
 
    tree = new BucketKdTree<KNNPointAdapter>(10, dimension);
 
  }
 
  
  @Override public void
+
'''Re-balancing''' with this tree is trickier. It does not currently store depth, so we don't know when a re-balance is needed. Because exemplars are only stored in leaves, collecting all the exemplars under a sub-tree would require exhaustively walking that sub-tree.
  addPoint(final double[] location, final String value) {
 
    tree.add(new KNNPointAdapter(location, value));
 
  }
 
  
  @Override public String
+
* If a depth is to be stored, each tree would need a pointer to its parent tree so we propogate depth updates when a leaf is split.
  getName() {
 
    return "duyn's Bucket kd-tree";
 
  }
 
  
  @Override public KNNPoint[]
+
* We can avoid an exhaustive walk by storing each exemplar in every tree on its path from root to leaf. We then can re-balance easily by dropping both sub-trees and splitting the given tree anew. This approach has the down side of making inserts and deletes more complex since we have to update the exemplar list of every tree along the path from root to leaf.
  getNearestNeighbors(final double[] location, final int size) {
 
    final SortedMap<Double, List<KNNPointAdapter>> results = tree.search(location, size);
 
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
 
    for(final Map.Entry<Double, List<KNNPointAdapter>> entry : results.entrySet()) {
 
      final double distance = entry.getKey();
 
      for(final KNNPointAdapter pt : entry.getValue()) {
 
        justPoints.add(new KNNPoint(pt.value, distance));
 
      }
 
    }
 
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
 
    return justPoints.toArray(retVal);
 
  }
 
  
  class KNNPointAdapter extends Exemplar {
+
==See also==
    public final String value;
+
* [[Kd-tree]] for further information on kd-trees.
    public KNNPointAdapter(final double[] coords, final String value) {
+
* Andrew Moore, ''[http://www.autonlab.org/autonweb/14665 An intoductory tutorial on kd]''. A good introduction to kd-trees.
      super(coords);
+
* Neal Sample, Matthew Haines, Mark Arnold, Timothy Purcell, ''[http://ilpubs.stanford.edu:8090/723/ Optimizing Search Strategies in k-d Trees]''. Details some of the optimisations used in this tree.
      this.value = value;
 
    }
 
  }
 
}
 
  </pre>
 
 
 
==Full Source Code==
 
For the full source code to the tree built in this tutorial, see [[User:Duyn/BucketKdTree|duyn's Bucket kd-tree]].
 

Latest revision as of 15:43, 27 January 2012

[Work-in-Progress]

This page will walk you through the implementation of a kd-tree. We start with a basic tree and progressively add levels of optimisation. Writing a fully optimised kd-tree is an advanced task, but starting with the basics helps. There are already other kd-trees on this wiki, see some of the others for ideas.

Theory

A k-d tree is a binary tree which successively splits a k-dimensional space in half. This lets us speed up a nearest neighbour search by not examining points in partitions which are too far away. A k-nearest neighbour search can be implemented from a nearest neighbour algorithm by not shrinking the search radius until k items have been found. This page won't make a distinction between the two.

An Exemplar class

We will call each item in our k-d tree an exemplar. Each exemplar has a domain—its co-ordinates in the k-d tree's space. Each exemplar could also have an arbitrary payload, but our tree does not need to know about that. It will only handle storing exemplars based on their domain and returning them in a nearest neighbour search.

We might already have a class elsewere called Point which handles 2D co-ordinates. This terminology avoids conflict with that.

public class Exemplar {
  public final double[] domain;

  public Exemplar(double[] domain) {
    this.domain = domain;
  }

  public final boolean
  collocated(final Exemplar other) {
    return Arrays.equals(domain, other.domain);
  }
}

While this class is fully usable as is, rarely will the domain of nearest neighbours be of any interest. Often, useful data (such as GuessFactors) will be loaded by sub-classing this Exemplar class. Our k-d tree will be parameterised based on this expectation.

Basic Tree

First, we will build a basic kd-tree as described by Wikipedia's kd-tree page. We start off with a standard binary tree. Each tree is either a node with both left and right sub-trees defined, or a leaf carrying an exemplar. For simplicity, we won't allow nodes to contain any exemplars.

public class BasicKdTree<X extends Exemplar> {
  // Basic tree structure
  X data = null;
  BasicKdTree<X> left = null, right = null;

  // Only need to test one branch since we always populate both
  // branches at once
  private boolean
  isTree() { return left != null; }

  ...
}

Adding

Each of the public API functions defers to a private static method. This is to avoid accidentally referring to instance variables while we walk the tree. This is a common pattern we will use for much of the actual behaviour code for this tree.

public void
add(X ex) {
  BasicKdTree.addToTree(this, ex);
}

To add an exemplar, we traverse the tree from top down until we find a leaf. If the leaf doesn't already have an exemplar, we can just stow it there. Otherwise, we split the leaf and put the free exemplars in each sub-tree.

To split a leaf, we cycle through split dimensions in order as in most descriptions of kd-trees. Because we only allow each leaf to hold a single exemplar, we can use a simple splitting strategy: smallest on the left, largest on the right. The split value is half way between the two points along the split dimension.

int splitDim = 0;
double split = Double.NaN;

private static <X extends Exemplar> void
addToTree(BasicKdTree<X> tree, X ex) {
  while(tree != null) {
    if (tree.isTree()) {
      // Traverse in search of a leaf
      tree = ex.domain[tree.splitDim] <= tree.split
        ? tree.left : tree.right;
    } else {
      if (tree.data == null) {
        tree.data = ex;
      } else {
        // Split tree and add
        // Find smallest exemplar to be our split point
        final int d = tree.splitDim;
        X leftX = ex, rightX = tree.data;
        if (rightX.domain[d] < leftX.domain[d]) {
          leftX = tree.data;
          rightX = ex;
        }
        tree.split = 0.5*(leftX.domain[d] + rightX.domain[d]);
        
        final int nextSplitDim =
          (tree.splitDim + 1)%tree.dimensions();
        
        tree.left = new BasicKdTree<X>();
        tree.left.splitDim = nextSplitDim;
        tree.left.data = leftX;
        
        tree.right = new BasicKdTree<X>();
        tree.right.splitDim = nextSplitDim;
        tree.right.data = rightX;
      }

      // Done.
      tree = null;
    }
  }
}

private int
dimensions() { return data.domain.length; }

Searching

Before we start coding a search method, we need a helper class to store search results along with their distance from the query point. This is called PrioNode because we will eventually re-use it to implement a custom priority queue.

public final class PrioNode<T> {
  public final double priority;
  public final T data;

  public PrioNode(double priority, T data) {
    this.priority = priority;
    this.data = data;
  }
}

Like with our add() method, our search() method delegates to a static method to avoid introducing bugs by accidentally referring to member variables while we descend the tree.

public Iterable<? extends PrioNode<X>>
search(double[] query, int nResults) {
  return BasicKdTree.search(this, query, nResults);
}

To do a nearest neighbours search, we walk the tree, preferring to search sub-trees which are on the same side of the split as the query point first. Once we have found our initial candidates, we can contract our search sphere. We only search the other sub-tree if our search sphere might spill over onto the other side of the split.

Results are collected in a java.util.PriorityQueue so we can easily remove the farthest exemplars as we find closer ones.

private static <X extends Exemplar> Iterable<? extends PrioNode<X>>
search(BasicKdTree<X> tree, double[] query, int nResults) {
  final Queue<PrioNode<X>> results =
    new PriorityQueue<PrioNode<X>>(nResults,
      new Comparator<PrioNode<X>>() {

        // min-heap
        public int
        compare(PrioNode<X> o1, PrioNode<X> o2) {
          return o1.priority == o2.priority ? 0
            : o1.priority > o2.priority ? -1
            : 1;
        }

      }
    );
  final Deque<BasicKdTree<X>> stack
    = new LinkedList<BasicKdTree<X>>();
  stack.addLast(tree);
  while (!stack.isEmpty()) {
    tree = stack.removeLast();
    
    if (tree.isTree()) {
      // Guess nearest tree to query point
      BasicKdTree<X> nearTree = tree.left, farTree = tree.right;
      if (query[tree.splitDim] > tree.split) {
        nearTree = tree.right;
        farTree = tree.left;
      }
      
      // Only search far tree if our search sphere might
      // overlap with splitting plane
      if (results.size() < nResults
        || sq(query[tree.splitDim] - tree.split)
          <= results.peek().priority)
      {
        stack.addLast(farTree);
      }

      // Always search the nearest branch
      stack.addLast(nearTree);
    } else {
      final double dSq = distanceSqFrom(query, tree.data.domain);
      if (results.size() < nResults
        || dSq < results.peek().priority)
      {
        while (results.size() >= nResults) {
          results.poll();
        }

        results.offer(new PrioNode<X>(dSq, tree.data));
      }
    }
  }
  return results;
}
 
private static double
sq(double n) { return n*n; }

Our distance calculation is optimised because it will be called often. We use squared distances to avoid an unnecessary sqrt() operation. We also don't use Math.pow() because the JRE's default implementation must do some extra work before it can return with a simple multiplication (see the source in fdlibm).

private static double
distanceSqFrom(double[] p1, double[] p2) {
  // Note: profiling shows this is called lots of times, so it pays
  // to be well optimised
  double dSq = 0;
  for(int d = 0; d < p1.length; d++) {
    final double dst = p1[d] - p2[d];
    if (dst != 0)
      dSq += dst*dst;
  }
  return dSq;
}

And that's it. Barely 200 lines of code and we have ourselves a working kd-tree.

Evaluation

As a guide to performance, we will use the k-NN algorithm benchmark with the Diamond vs CunobelinDC gun data. For comparison, the benchmark included an optimised linear search to represent the lower bound of performance, and Rednaxela's tree—credited to be the fastest tree on the wiki. Full results are near the bottom of the page. Here are the relevant extracts for our tree:

RESULT << k-nearest neighbours search with Voidious' Linear search >>
: Average searching time       = 0.5740 miliseconds
: Average worst searching time = 2.3482 miliseconds
: Average adding time          = 0.4528 microseconds
: Accuracy                     = 100.0%

[...]

RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
: Average searching time       = 0.0378 miliseconds
: Average worst searching time = 0.2625 miliseconds
: Average adding time          = 1.5687 microseconds
: Accuracy                     = 100.0%

[...]

RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
: Average searching time       = 0.3939 miliseconds
: Average worst searching time = 11.5167 miliseconds
: Average adding time          = 0.5165 microseconds
: Accuracy                     = 100.0%

[...]

BEST RESULT: 
 - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
[...]
 - #8 duyn's basic kd-tree [0.3939]
 - #9 Voidious' Linear search [0.5740]

This test run showed that average add time was close to a linear search while searching performance improved by (0.3939/0.5740 - 1) ~= 31%. It's still an order of magnitude slower than Rednaxela's included tree, which shows there is a lot to gain by selecting appropriate optimisations.

The following code was used in the benchmark:

  • KNNBenchmark.java
public KNNBenchmark(final int dimension, final int numNeighbours, final SampleData[] samples, final int numReps) {
  final Class<?>[] searchAlgorithms = new Class<?>[] {
    FlatKNNSearch.class,
    SimontonTreeKNNSearch.class,
    NatTreeKNNSearch.class,
    VoidiousTreeKNNSearch.class,
+   DuynBasicKNNSearch.class,
    RednaxelaTreeKNNSearch.class,
  };
  ...
}
 
  • DuynBasicKNNSearch.java
public class DuynBasicKNNSearch extends KNNImplementation {
  final BasicKdTree<StringExemplar> tree;
  public DuynBasicKNNSearch(final int dimension) {
    super(dimension);
    tree = new BasicKdTree<StringExemplar>();
  }

  @Override public void
  addPoint(final double[] location, final String value) {
    tree.add(new StringExemplar(location, value));
  }

  @Override public String
  getName() {
    return "duyn's basic kd-tree";
  }

  @Override public KNNPoint[]
  getNearestNeighbors(final double[] location, final int size) {
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
    for(PrioNode<StringExemplar> sr : tree.search(location, size)) {
      final double distance = sr.priority;
      final StringExemplar pt = sr.data;
      justPoints.add(new KNNPoint(pt.value, distance));
    }
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
    return justPoints.toArray(retVal);
  }

  class StringExemplar extends Exemplar {
    public final String value;
    public StringExemplar(final double[] coords, final String value) {
      super(coords);
      this.value = value;
    }
  }
}

Optimisations to the Tree

In this section, we will introduce the following optimisations to our tree:

  • Bucket leaves. Leaves will store multiple exemplars, only being split when they overflow. This makes our split points less dependent on the order in which we receive points and gives us more opportunity to choose a better split point.
  • Bounds-overlap-ball testing. Each tree stores the bounds of the smallest hyperrect which will contain all the data in that tree. This is smaller—maybe significantly so—than the bounds which that sub-tree occupy, giving us more opportunities to skip sub-trees. We will check that the minimum distance from each tree's content bounds overlaps with our search hypersphere. This lets us avoid walking all the way to a leaf if a tree's content bounds overlap with our search hypersphere but none of its children's content bounds do.
  • Splitting choice. We will split on the mean of the dimension with the largest variance. Splitting on the mean does not guarantee that our tree will be balanced, but is easier for us since we would have already calculated the mean in finding the dimension with largest variance. This will add more overhead to our add times since we will compute running means and variances each time an exemplar is added.
  • Singularity tracking. We will actively check whether all the exemplars in a leaf have the same domain. This is necessary to avoid getting ourselves caught in an infinite loop trying to repeatedly split an unsplittable leaf. It also lets us avoid repeated distance calculations when searching for nearest neighbours. Practically, unless all your dimensions are limited, discrete values, this optimisation is unlikely to make much impact.

Basic Tree

Once again, we start off with a basic binary tree. Each tree must store its own bucket size since it will not have a reference to its parent Bucket size is not static because we might want to build multiple trees with different bucket sizes.

public final class OptKdTree<X extends Exemplar> {
  final Queue<X> data;
  OptKdTree<X> left = null, right = null;

  // Number of exemplars to hold in a leaf before splitting
  private final int bucketSize;
  private static final int DEFAULT_BUCKET_SIZE = 10;

  public OptKdTree() {
    this(DEFAULT_BUCKET_SIZE);
  }

  public OptKdTree(int bucketSize) {
    this.bucketSize = bucketSize;
    this.data = new LinkedList<X>();
  }

  private final boolean
  isTree() { return left != null; }

  [...]
}

Adding

Now that our code is a little more sophisticated, it becomes profitable to have a dedicated method to bulk-load our tree. This also makes it easier to add dynamic rebalance support later if we need it. To take advantage the additional information we get from bulk loading, we decide whether to split a leaf only after all exemplars have been added.

public void
add(X ex) {
  OptKdTree<X> tree = addNoSplit(this, ex);
  if (shouldSplit(tree)) {
    split(tree);
  }
}

public void
addAll(Collection<X> exs) {
  final Set<OptKdTree<X>> modTrees =
    new HashSet<OptKdTree<X>>();
  for(X ex : exs) {
    modTrees.add(addNoSplit(this, ex));
  }

  for(OptKdTree<X> tree : modTrees) {
    if (shouldSplit(tree)) {
      split(tree);
    }
  }
}

To add an exemplar, we traverse the tree from top down until we find a leaf. Then we add the exemplar to the list at that leaf. For now, there is no special ordering to the list of exemplars in a leaf. To decide which sub-tree to traverse, each tree stores two values—a splitting dimension and a splitting value. If our new exemplar's domain along the splitting dimension is greater than the tree's splitting value, we put it in the right sub-tree. Otherwise, it goes in the left one.

Since adding takes little time compared to searching, we take this opportunity to make some optimisations:

  • We keep track of the actual hyperrect the points in this tree occupy. This lets us rule out a tree, even though its space may intersect with our search sphere, if it has no chance of containing any points within our search sphere. This hyperrect is fully defined by just storing its two extreme corners.
  • To save us having to do a full iteration when we come to split a leaf, we compute the running mean and variance for each dimension using Welford's method:
   <math>
   M_1 = x_1,\qquad S_1 = 0</math>
   <math>
   M_k = M_{k-1} + {x_k - M_{k-1} \over k} \qquad(exMean)</math>
   <math>
   S_k = S_{k-1} + (x_k - M_{k-1})\times(x_k - M_k) \qquad(exSumSqDev)</math>
  • We keep track of whether all exemplars in this leaf have the same domain. If they do, we know that comparisons on one exemplar apply to all exemplars in that leaf.
// These aren't initialised until add() is called.
private double[] exMean = null, exSumSqDev = null;

// Optimisation when sub-tree contains only duplicates
private boolean singularity = true;

// Optimisation for searches. This lets us skip a node if its
// scope intersects with a search hypersphere but it doesn't contain
// any points that actually intersect.
private double[] contentMax = null, contentMin = null;

private int
dimensions() { return contentMax.length; }

// Addition

// Adds an exemplar without splitting overflowing leaves.
// Returns leaf to which exemplar was added.
private static <X extends Exemplar> OptKdTree<X>
addNoSplit(OptKdTree<X> tree, X ex) {
  // Some spurious function calls. Optimised for readability over
  // efficiency.
  OptKdTree<X> cursor = tree;
  while (cursor != null) {
    updateBounds(cursor, ex);
    if (cursor.isTree()) {
      // Sub-tree
      cursor = ex.domain[cursor.splitDim] <= cursor.split
        ? cursor.left : cursor.right;
    } else {
      // Leaf

      // Add exemplar to leaf
      cursor.data.add(ex);

      // Calculate running mean and sum of squared deviations
      final int nExs = cursor.data.size();
      final int dims = cursor.dimensions();
      if (nExs == 1) {
        cursor.exMean = Arrays.copyOf(ex.domain, dims);
        cursor.exSumSqDev = new double[dims];
      } else {
        for(int d = 0; d < dims; d++) {
          final double coord = ex.domain[d];
          final double oldMean = cursor.exMean[d], newMean;
          cursor.exMean[d] = newMean =
            oldMean + (coord - oldMean)/nExs;
          cursor.exSumSqDev[d] = cursor.exSumSqDev[d]
            + (coord - oldMean)*(coord - newMean);
        }
      }

      // Check that data are still uniform
      if (cursor.singularity) {
        final Queue<X> cExs = cursor.data;
        if (cExs.size() > 0 && !ex.collocated(cExs.peek()))
          cursor.singularity = false;
      }

      // Finished walking
      return cursor;
    }
  }
  throw new RuntimeException("Walked tree without adding anything");
}

To update the bounding hyperrect for the contents of a tree, we iterate through each dimension, extending (if necessary) the bounds in that dimension to contain the new exemplar.

private static <T extends Exemplar> void
updateBounds(OptKdTree<T> tree, Exemplar ex) {
  final int dims = ex.domain.length;
  if (tree.contentMax == null) {
    tree.contentMax = Arrays.copyOf(ex.domain, dims);
    tree.contentMin = Arrays.copyOf(ex.domain, dims);
  } else {
    for(int d = 0; d < dims; d++) {
      final double dimVal = ex.domain[d];
      if (dimVal > tree.contentMax[d])
        tree.contentMax[d] = dimVal;
      else if (dimVal < tree.contentMin[d])
        tree.contentMin[d] = dimVal;
    }
  }
}

Splitting

We only split when a leaf's exemplars exceed its bucket size. It is only worth splitting if the exemplars don't all have the same domain.

private static <T extends Exemplar> boolean
shouldSplit(OptKdTree<T> tree) {
  return tree.data.size() > tree.bucketSize
    && !tree.singularity;
}

Thanks to our pre-computation, splitting is usually straight-forward. We iterate through each dimension to find the one with the largest variance (skip the unnecessary division), then we can directly look up the mean of that dimension. This strategy might not successfully split the tree if exemplars are so close together that rounding errors in computing the mean result in the mean not lying strictly between the exemplars in a leaf. When this happens, we resort to repeatedly trying a dimension and an exemplar at random for our splitting point. We only call split if the leaf is not a singularity, so eventually we will find a point that does divide our exemplars.

To make sure the point actually does divide our data, we separate our data into two lists destined for each sub-tree. Once we have found a successful split point, we bulk load our sub-trees, store information about our split point and let go of the exemplars stored in the tree.

// Split properties. Not initialised until split occurs.
private int splitDim = 0;
private double split = Double.NaN;

@SuppressWarnings("unchecked") private static <T extends Exemplar> void
split(OptKdTree<T> tree) {
  assert !tree.singularity;
  // Find dimension with largest variance to split on
  double largestVar = -1;
  int splitDim = 0;
  for(int d = 0; d < tree.dimensions(); d++) {
    // Don't need to divide by number of data to find largest
    // variance
    final double var = tree.exSumSqDev[d];
    if (var > largestVar) {
      largestVar = var;
      splitDim = d;
    }
  }

  // Find mean as position for our split
  double splitValue = tree.exMean[splitDim];

  // Check that our split actually splits our data. This also lets
  // us bulk load data into sub-trees, which is more likely
  // to keep optimal balance.
  final Queue<T> leftExs = new LinkedList<T>();
  final Queue<T> rightExs = new LinkedList<T>();
  for(T s : tree.data) {
    if (s.domain[splitDim] <= splitValue)
      leftExs.add(s);
    else
      rightExs.add(s);
  }
  int leftSize = leftExs.size();
  final int treeSize = tree.data.size();
  if (leftSize == treeSize || leftSize == 0) {
    System.err.println(
      "WARNING: Randomly splitting non-uniform tree");
    // We know the data aren't all the same, so try picking
    // an exemplar and a dimension at random for our split point

    // This might take several tries, so we copy our data to
    // an array to speed up process of picking a random point
    Object[] exs = tree.data.toArray();
    while (leftSize == treeSize || leftSize == 0) {
      leftExs.clear();
      rightExs.clear();

      splitDim = (int)
        Math.floor(Math.random()*tree.dimensions());
      final int splitPtIdx = (int)
        Math.floor(Math.random()*exs.length);
      // Cast is inevitable consequence of java's inability to
      // create a generic array
      splitValue = ((T)exs[splitPtIdx]).domain[splitDim];
      for(T s : tree.data) {
        if (s.domain[splitDim] <= splitValue)
          leftExs.add(s);
        else
          rightExs.add(s);
      }
      leftSize = leftExs.size();
    }
  }

  // We have found a valid split. Start building our sub-trees
  final OptKdTree<T> left = new OptKdTree<T>(tree.bucketSize);
  final OptKdTree<T> right = new OptKdTree<T>(tree.bucketSize);
  left.addAll(leftExs);
  right.addAll(rightExs);

  // Finally, commit the split
  tree.splitDim = splitDim;
  tree.split = splitValue;
  tree.left = left;
  tree.right = right;

  // Let go of data (and their running stats) held in this leaf
  tree.data.clear();
  tree.exMean = tree.exSumSqDev = null;
}

Distance calculations

Before we can start searching, we need to define our distance metric. The distance calculation has already been introduced before.

Squared Euclidean distance:
<math>distance_{x\rightarrow y}^2 = \sum (x[d] - y[d])^2</math>

This will be called for every point examined, so we should optimise it well.

private static double
distanceSqFrom(double[] p1, double[] p2) {
  double dSq = 0;
  for(int d = 0; d < p1.length; d++) {
    final double dst = p1[d] - p2[d];
    if (dst != 0)
      dSq += dst*dst;
  }
  return dSq;
}

We will also need to find the minimum distance from a hyperrect to a query point. This will let us rule out trees whose content hyperrects are outside our search sphere. To find the minimum distance between a hyperrect and a given point, we need first find the closest point along bounds of that hyperrect to the point. This is easy to do, even if not immediately obvious.

Along each dimension,
<math>nearestPoint[d]=\begin{cases}

min[d] & target[d]\le min[d]\\ target[d] & min<target[d]<max[d]\\ max[d] & target[d]\ge max[d]\end{cases}</math>

After that, we just calculate the distance from our query point to this nearest point. The code below has been optimised because it gets called a lot during a search.

// Gets distance from target of nearest point on hyper-rect defined
// by supplied min and max bounds
private static double
minDistanceSqFrom(double[] target, double[] min, double[] max) {
  double distanceSq = 0;
  for(int d = 0; d < target.length; d++) {
    if (target[d] < min[d]) {
      final double dst = min[d] - target[d];
      distanceSq += dst*dst;
    } else if (target[d] > max[d]) {
      final double dst = max[d] - target[d];
      distanceSq += dst*dst;
    }
  }
  return distanceSq;
}

Searching

Before we dive into searching, we will introduce two helper classes.

  • SearchStackEntry stores a tree along any data which will be useful to keep on the search stack. Currently, this additional information consists only of whether a bounds check should be done. This lets us skip a redundant bounds check on the root of the tree.
//
// class SearchStackEntry
//

private static class SearchStackEntry<T extends Exemplar> {
  public final boolean needBoundsCheck;
  public final OptKdTree<T> tree;

  public SearchStackEntry(boolean needBoundsCheck, OptKdTree<T> tree) {
    this.needBoundsCheck = needBoundsCheck;
    this.tree = tree;
  }
}
  • SearchState stores the current state of the search. Its sole purpose is to let us separate searching sub-trees and leaves into different methods. This breaks up a long method and makes profiling easier.
//
// class SearchState
//
// Holds data about current state of the search. Used for live updating
// of pruning distance.

private static class SearchState<X extends Exemplar> {
  final int nResults;
  final PriorityQueue<PrioNode<X>> results;

  public SearchState(int nResults) {
    this.nResults = nResults;
    results = new PriorityQueue<PrioNode<X>>(nResults,
      new Comparator<PrioNode<X>>() {

        public int
        compare(PrioNode<X> o1, PrioNode<X> o2) {
          return o1.priority == o2.priority ? 0
            : o1.priority > o2.priority ? -1 : 1;
        }

      });
  }
}

Our search method does a depth-first search using a stack. Each time we pop our stack, we do a bounds check if necessary, then defer to the appropriate method.

public Iterable<PrioNode<X>>
search(double[] query, int nResults) {
  // Forward to a static method to avoid accidental reference to
  // instance variables while descending the tree
  return search(this, query, nResults);
}

private static <X extends Exemplar> Iterable<PrioNode<X>>
search(OptKdTree<X> tree, double[] query, int nResults) {
  final SearchState<X> state = new SearchState<X>(nResults);
  final Deque<SearchStackEntry<X>> stack =
    new LinkedList<SearchStackEntry<X>>();
  if (tree.contentMin != null)
    stack.addLast(new SearchStackEntry<X>(false, tree));
  while (!stack.isEmpty()) {
    final SearchStackEntry<X> entry = stack.removeLast();
    final OptKdTree<X> cur = entry.tree;

    if (entry.needBoundsCheck && state.results.size() >= nResults) {
      final double d = minDistanceSqFrom(query,
        cur.contentMin, cur.contentMax);
      if (d > state.results.lastKey())
        continue;
    }

    if (cur.isTree()) {
      searchTree(query, cur, stack, state.results.peek().priority);
    } else {
      searchLeaf(query, cur, state);
    }
  }

  return state.results;
}

Searching a Tree

We basically want to add both sub-trees to the stack, if they are not empty. As a heuristic, we think the sub-tree on the same side of the split as our query point is more likely to contain closer points, so we search that sub-tree first by adding it to the stack last. You might try to avoid a distance calculation by not requiring a bounds check for the heuristically nearer sub-tree, but that does not seem to significantly improve performance.

private static <X extends Exemplar> void
searchTree(double[] query, OptKdTree<X> tree,
  Deque<SearchStackEntry<X>> stack, double maxPriority)
{
  OptKdTree<X> nearTree = tree.left, farTree = tree.right;
  if (query[tree.splitDim] > tree.split) {
    nearTree = tree.right;
    farTree = tree.left;
  }

  // These variables let us skip empty sub-trees
  boolean nearEmpty = nearTree.contentMin == null;
  boolean farEmpty = farTree.contentMin == null;

  // Add nearest sub-tree to stack later so we descend it
  // first. This is likely to constrict our max distance
  // sooner, resulting in less visited nodes
  if (!farEmpty && minDistanceSqFrom(query,farTree.contentMin,farTree.contentMax) < maxPriority) {
    stack.addLast(new SearchStackEntry<X>(true, farTree));
  }

  if (!nearEmpty && minDistanceSqFrom(query,nearTree.contentMin,nearTree.contentMax) < maxPriority) {
    stack.addLast(new SearchStackEntry<X>(true, nearTree));
  }
}

Searching a Leaf

By the time we reach a leaf, there is little left to optimise away. We must go through each exemplar in turn to see if it is a nearest neighbour to our query point.

private static <X extends Exemplar> void
searchLeaf(double[] query, OptKdTree<X> leaf, SearchState<X> state) {
  double exD = Double.NaN;
  for(X ex : leaf.data) {
    if (!leaf.singularity || Double.isNaN(exD)) {
      exD = distanceSqFrom(query, ex.domain);
    }

    if (examine(exD, state)) {
      while (state.results.size() >= state.nResults) {
        state.results.poll();
      }

      state.results.offer(new PrioNode<X>(exD, ex));
    }
  }
}

private static <X extends Exemplar> boolean
examine(double distance, SearchState<X> state) {
  return state.results.size() < state.nResults
    || distance < state.results.peek().priority;
}

Evaluation

From the same benchmark run, our optimised tree obtains the following results:

RESULT << k-nearest neighbours search with Voidious' Linear search >>
: Average searching time       = 0.5740 miliseconds
: Average worst searching time = 2.3482 miliseconds
: Average adding time          = 0.4528 microseconds
: Accuracy                     = 100.0%

[...]

RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
: Average searching time       = 0.0378 miliseconds
: Average worst searching time = 0.2625 miliseconds
: Average adding time          = 1.5687 microseconds
: Accuracy                     = 100.0%

[...]

RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
: Average searching time       = 0.3939 miliseconds
: Average worst searching time = 11.5167 miliseconds
: Average adding time          = 0.5165 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with duyn's optimised kd-tree >>
: Average searching time       = 0.0536 miliseconds
: Average worst searching time = 1.6742 miliseconds
: Average adding time          = 2.9380 microseconds
: Accuracy                     = 100.0%

[...]

BEST RESULT: 
 - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
[...]
 - #4 duyn's optimised kd-tree [0.0536]
[...]
 - #8 duyn's basic kd-tree [0.3939]
 - #9 Voidious' Linear search [0.5740]

This is promising. We are a whole order of magnitude faster and finally in the same league as other kd-tree implementations on this wiki. We have not changed our API, so the following shim is only for completeness:

  • DuynOptKNNSearch.java
public class DuynOptKNNSearch extends KNNImplementation {
  final OptKdTree<StringExemplar> tree;
  public DuynOptKNNSearch(final int dimension) {
    super(dimension);
    tree = new OptKdTree<StringExemplar>();
  }

  @Override public void
  addPoint(final double[] location, final String value) {
    tree.add(new StringExemplar(location, value));
  }

  @Override public String
  getName() {
    return "duyn's optimised kd-tree";
  }

  @Override
  public KNNPoint[]
  getNearestNeighbors(final double[] location, final int size) {
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
    for(PrioNode<StringExemplar> sr : tree.search(location, size)) {
      final double distance = sr.priority;
      final StringExemplar pt = sr.data;
      justPoints.add(new KNNPoint(pt.value, distance));
    }
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
    return justPoints.toArray(retVal);
  }

  class StringExemplar extends Exemplar {
    public final String value;
    public StringExemplar(final double[] coords, final String value) {
      super(coords);
      this.value = value;
    }
  }
}

A Better Results Heap

A TreeMap does unnecessary processing, keeping all its keys in order when all we really care about is the farthest tentative result we have found. By implementing a custom heap, we can squeeze even more performance out of our tree.

In this section, we will implement an implicit binary heap. This structure does not have the best theoretical performance, but often does well in practice. More theoretically efficient heaps often rely on batching inserts, only fully processing them when a the top element is removed. Since we almost do one removal for every insert, this does not offer much benefit to us. For our usage pattern, the only heap faster than a binary heap is a buggy heap.

Implicit d-ary Heap

Implicit binary heaps can be very memory efficient since all elements are stored in an array. They don't grow as easily as other heaps, but for our purposes that is not a problem—we can set a max on the number of results we are interested in. A good description of implicit heaps is provided by John Bentley in Programming Pearls: Thanks, Heaps.

Below, we implement a d-ary heap. The math is similar to a binary heap, and there are some suggestions that a 4-way heap may be more efficient than a binary heap. Our heap can be used as a min- or a max-heap. Using the enum Direction ensures the user doesn't get this parameter wrong.

// Final for performance, may be removed if sub-classing necessary
public final class FastBinaryHeap<T> implements Iterable<PrioNode<T>> {
  private final int branches;
  // Direction = -1 or 1
  // root * direction > element * direction
  private final int direction;
  public static enum Direction {
    MIN(-1), MAX(1);

    public final int value;
    private Direction(int value) {
      this.value = value;
    }
  }
  public static final Direction MAX = Direction.MAX, MIN = Direction.MIN;

  private final PrioNode<T>[] data;
  private int size = 0;
  private static final int DEFAULT_N_CHILDREN = 4;

  public FastBinaryHeap(int capacity, Direction direction) {
    this(capacity, DEFAULT_N_CHILDREN, direction);
  }

  @SuppressWarnings("unchecked")
  public FastBinaryHeap(int capacity, int branches, Direction dir) {
    this.branches = branches;
    // This implicit conversion is safe because we only refer to
    // this array through our data member.
    this.data = new PrioNode[capacity];
    this.direction = dir.value;
  }
}

We implement an interface similar to the standard PriorityQueue:

public Iterator<PrioNode<T>>
iterator() {
  return new Iterator<PrioNode<T>>() {

    int idx = 0;

    public boolean
    hasNext() { return idx < size; }

    public PrioNode<T>
    next() { return data[idx++]; }

    public void
    remove() {
      throw new UnsupportedOperationException("Not supported.");
    }

  };
}

public int
size() { return size; }

public PrioNode<T>
peek() { return data[0]; }

Our offer() method takes a priority separately from the data we want to associate with it. We will store the priority separately so we don't fall into the same performance trap as PriorityQueue. We do not implement a poll() method because we don't need it. If we are about to overflow, we will just replace our top element with the new one.

public boolean
offer(double priority, T result) {
  if (size < data.length) {
    // Trivial case
    siftUp(size, new PrioNode<T>(priority, result));
    size++;
    return true;
  }

  if (priority*direction > data[0].priority*direction) {
    System.err.printf("Do not want %f when have %f\n",
      priority, data[0].priority);
    return false;
  }

  siftDown(0, new PrioNode<T>(priority, result));
  return true;
}

Now to the meat of our heap.

  • siftUp treats the given index as empty and progressively movies parents into the hole left behind if inserting the given element there would violate heap order.
Parents are calculated as follows:
<math>parent[i] = \left\lfloor {i - 1\over branches} \right\rfloor</math>
Once we have that, we can compare the values at both indexes. Multiplying them by direction lets us use the same code for both a min- and a max-heap.
private void
siftUp(int hole, PrioNode<T> value) {
  if (size > 0) {
    final double vPD = value.priority*direction;
    while (hole > 0) {
      final int parent = (hole - 1)/branches;
      if (vPD > data[parent].priority*direction) {
        data[hole] = data[parent];
        hole = parent;
      } else {
        break;
      }
    }
  }

  data[hole] = value;
}
  • siftDown does the opposite, starting from the given index and moving elements up if inserting the new value into the hole would violate heap order. It swaps the element with its worst heap-offending child so that with a single swap, it can restore heap order to that level.
In a 0-based index, children are calculated as follows:
<math>\begin{align}child_{1}[i] & =branches\times i+1\\

\ldots\\ child_{branches}[i] & =branches\times i+branches\end{align}</math>

private void
siftDown(int hole, PrioNode<T> value) {
  if (size > 0) {
    // Push down along min path
    final double vPD = value.priority*direction;
    int firstChild;
    while ((firstChild = branches*hole + 1) < data.length) {

      // Find largest/smallest child
      double swapPD = data[firstChild].priority*direction;
      int swapChild = firstChild;
      for(int c = firstChild + 1;
        c < firstChild + branches && c < size; c++)
      {
        final double cPD = data[c].priority*direction;
        if (cPD > swapPD) {
          swapPD = cPD;
          swapChild = c;
        }
      }

      if (swapPD > vPD) {
        data[hole] = data[swapChild];
        hole = swapChild;
      } else {
        break;
      }
    }
  }

  data[hole] = value;
}

Both of these are slightly different from standard textbook presentations of the algorithm. The versions here do less swaps by shifting other elements out of the way until the new element's rightful place has been found.

Changes to our Tree

Because the only change we have made is to the method of collecting results, we will skim over the changes necessary.

public final class FastKdTree<X extends Exemplar> {
....
}
  • Our SearchState will have to be updated to use our new heap:
//
// class SearchState
//
// Holds data about current state of the search. Used for live updating
// of pruning distance.

private static class SearchState<X extends Exemplar> {
  final int nResults;
  final FastBinaryHeap<X> results;

  public SearchState(int nResults) {
    this.nResults = nResults;
    results = new FastBinaryHeap<X>(
      nResults, 4, FastBinaryHeap.MAX);
  }
}
  • Our searchLeaf() method no longer needs to explicitly poll the top since our heap will just replace it if it has no space:
private static <X extends Exemplar> void
searchLeaf(double[] query, FastKdTree<X> leaf, SearchState<X> state) {
  double exD = Double.NaN;
  for(X ex : leaf.data) {
    if (!leaf.singularity || Double.isNaN(exD)) {
      exD = distanceSqFrom(query, ex.domain);
    }

    if (examine(exD, state)) {
      state.results.offer(exD, ex);
    }
  }
}
  • We have also replaced all instances of LinkedLists with ArrayDeques. This is because although theoretically LinkedLists are appropriate, the javadocs for ArrayDeque state that:
"This class is likely to be faster than Stack when used as a stack, and faster than LinkedList when used as a queue."—Java 6 API docs
public FastKdTree(int bucketSize) {
  this.bucketSize = bucketSize;
  this.data = new ArrayDeque<X>();
}

@SuppressWarnings("unchecked") private static <X extends Exemplar> void
split(FastKdTree<X> tree) {
  ...
  final Queue<X> leftExs = new ArrayDeque<X>();
  final Queue<X> rightExs = new ArrayDeque<X>();
  ...
}

private static <X extends Exemplar> Iterable<PrioNode<X>>
search(FastKdTree<X> tree, double[] query, int nResults) {
  final SearchState<X> state = new SearchState<X>(nResults);
  final Deque<SearchStackEntry<X>> stack =
    new ArrayDeque<SearchStackEntry<X>>();
  ...
}

Evaluation

Our improved heap squeezes a bit more performance (~10%) out of our tree:

RESULT << k-nearest neighbours search with Voidious' Linear search >>
: Average searching time       = 0.5740 miliseconds
: Average worst searching time = 2.3482 miliseconds
: Average adding time          = 0.4528 microseconds
: Accuracy                     = 100.0%

[...]

RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
: Average searching time       = 0.0378 miliseconds
: Average worst searching time = 0.2625 miliseconds
: Average adding time          = 1.5687 microseconds
: Accuracy                     = 100.0%

[...]

RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
: Average searching time       = 0.3939 miliseconds
: Average worst searching time = 11.5167 miliseconds
: Average adding time          = 0.5165 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with duyn's optimised kd-tree >>
: Average searching time       = 0.0536 miliseconds
: Average worst searching time = 1.6742 miliseconds
: Average adding time          = 2.9380 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with duyn's fast kd-tree >>
: Average searching time       = 0.0479 miliseconds
: Average worst searching time = 25.0995 miliseconds
: Average adding time          = 2.1288 microseconds
: Accuracy                     = 100.0%

[...]

BEST RESULT: 
 - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
[...]
 - #3 duyn's fast kd-tree [0.0479]
 - #4 duyn's optimised kd-tree [0.0536]
[...]
 - #8 duyn's basic kd-tree [0.3939]
 - #9 Voidious' Linear search [0.5740]

Our client code has barely changed:

  • DuynFastKNNSearch.java
public class DuynFastKNNSearch extends KNNImplementation {
  final FastKdTree<StringExemplar> tree;
  public DuynFastKNNSearch(final int dimension) {
    super(dimension);
    tree = new FastKdTree<StringExemplar>();
  }

  @Override public void
  addPoint(final double[] location, final String value) {
    tree.add(new StringExemplar(location, value));
  }

  @Override public String
  getName() {
    return "duyn's fast kd-tree";
  }

  @Override public KNNPoint[]
  getNearestNeighbors(final double[] location, final int size) {
    final List<KNNPoint> justPoints = new LinkedList<KNNPoint>();
    for(PrioNode<StringExemplar> sr : tree.search(location, size)) {
      final double distance = sr.priority;
      final StringExemplar pt = sr.data;
      justPoints.add(new KNNPoint(pt.value, distance));
    }
    final KNNPoint[] retVal = new KNNPoint[justPoints.size()];
    return justPoints.toArray(retVal);
  }

  class StringExemplar extends Exemplar {
    public final String value;
    public StringExemplar(final double[] coords, final String value) {
      super(coords);
      this.value = value;
    }
  }
}

Full Benchmark Results

Below are the full benchmark results quoted throughout this page. It puts the performance of our trees in context compared to other kd-trees available on this wiki. The benchmark (k-NN algorithm benchmark) was run with the Diamond vs CunobelinDC gun data.

K-NEAREST NEIGHBOURS ALGORITHMS BENCHMARK
-----------------------------------------
Reading data from gun-data-Diamond-vs-jk.mini.CunobelinDC 0.3.csv.gz
Read 25621 saves and 10300 searches.

Running 30 repetition(s) for k-nearest neighbours searching:
:: 13 dimension(s); 40 neighbour(s)
Warming up the JIT with 5 repetitions first...

Running tests...test 1/30
[...]
 COMPLETED.

RESULT << k-nearest neighbours search with Voidious' Linear search >>
: Average searching time       = 0.5740 miliseconds
: Average worst searching time = 2.3482 miliseconds
: Average adding time          = 0.4528 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with Simonton's Bucket PR k-d tree >>
: Average searching time       = 0.1609 miliseconds
: Average worst searching time = 1.4184 miliseconds
: Average adding time          = 1.3322 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with Voidious' Bucket PR k-d tree >>
: Average searching time       = 0.2081 miliseconds
: Average worst searching time = 21.9191 miliseconds
: Average adding time          = 1.0301 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with Rednaxela's kd-tree (2nd gen) >>
: Average searching time       = 0.0378 miliseconds
: Average worst searching time = 0.2625 miliseconds
: Average adding time          = 1.5687 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with Rednaxela's kd-tree (3rd gen) >>
: Average searching time       = 0.0407 miliseconds
: Average worst searching time = 0.2259 miliseconds
: Average adding time          = 1.6342 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with duyn's basic kd-tree >>
: Average searching time       = 0.3939 miliseconds
: Average worst searching time = 11.5167 miliseconds
: Average adding time          = 0.5165 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with duyn's optimised kd-tree >>
: Average searching time       = 0.0536 miliseconds
: Average worst searching time = 1.6742 miliseconds
: Average adding time          = 2.9380 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with duyn's fast kd-tree >>
: Average searching time       = 0.0479 miliseconds
: Average worst searching time = 25.0995 miliseconds
: Average adding time          = 2.1288 microseconds
: Accuracy                     = 100.0%

RESULT << k-nearest neighbours search with Chase-san's kd-tree (KDTreeC) >>
: Average searching time       = 0.1200 miliseconds
: Average worst searching time = 19.5727 miliseconds
: Average adding time          = 1.4222 microseconds
: Accuracy                     = 100.0%


BEST RESULT: 
 - #1 Rednaxela's kd-tree (2nd gen) [0.0378]
 - #2 Rednaxela's kd-tree (3rd gen) [0.0407]
 - #3 duyn's fast kd-tree [0.0479]
 - #4 duyn's optimised kd-tree [0.0536]
 - #5 Chase-san's kd-tree (KDTreeC) [0.1200]
 - #6 Simonton's Bucket PR k-d tree [0.1609]
 - #7 Voidious' Bucket PR k-d tree [0.2081]
 - #8 duyn's basic kd-tree [0.3939]
 - #9 Voidious' Linear search [0.5740]

Benchmark running time: 587.77 seconds

Full Source Code

The full source code for all three trees presented in this tutorial can be downloaded together: File:Duyn tutorial kd trees.jar. The code is available under the RWPCL.

Extensions

This tree does not include any support for deletion or re-balancing. You can get away with this if you plan to re-build the entire tree regularly.

Deletion support would be easy to add to this tree. Since all exemplars are stored in leaves at the bottom, we need only walk the tree and remove the exemplar from the leaf containing it. We would then merge sub-trees if multiple deletions has left one of them empty. Unless our bucket width is really small, we don't need to merge sub-trees after each deletion since the addition of a subsequent exemplar is unlikely to result in choosing a significantly different split point.

Re-balancing with this tree is trickier. It does not currently store depth, so we don't know when a re-balance is needed. Because exemplars are only stored in leaves, collecting all the exemplars under a sub-tree would require exhaustively walking that sub-tree.

  • If a depth is to be stored, each tree would need a pointer to its parent tree so we propogate depth updates when a leaf is split.
  • We can avoid an exhaustive walk by storing each exemplar in every tree on its path from root to leaf. We then can re-balance easily by dropping both sub-trees and splitting the given tree anew. This approach has the down side of making inserts and deletes more complex since we have to update the exemplar list of every tree along the path from root to leaf.

See also