User talk:AW/kD-Tree

From Robowiki
< User talk:AW
Revision as of 02:41, 25 April 2011 by Rednaxela (talk | contribs) (reply, things to watch out for)
Jump to navigation Jump to search

Happy Easter! I have written my recursionless tree, but I am getting unexpected results (My tree is faster than rednaxela's by an unbelievable margin, and the results are different in robocode) I haven't been able figure this out and I am wondering if I am using rednaxela's tree incorrectly? Here is my tester class:

package tree;

import ags.utils.KdTree.SqrEuclid;

public class KDTreeTester {
	static KDTree gunTree = new KDTree(8);
	static SqrEuclid<double[]> AgsTree = new SqrEuclid<double[]>(8, 400000);
	
	public static void main(String[] args) {
		int numOfPoints = 40000;
		long startTime = System.nanoTime();
		 for(int i = 1; i < numOfPoints; i++) {
			 DataPoint addPoint = new DataPoint(8, i);
			 addPoint.setCoordinates(0, Math.random());
			 addPoint.setCoordinates(1, Math.random());
			 addPoint.setCoordinates(2, Math.random());
			 addPoint.setCoordinates(3, Math.random());
			 addPoint.setCoordinates(4, Math.random());
			 addPoint.setCoordinates(5, Math.random());
			 addPoint.setCoordinates(6, Math.random());
			 addPoint.setCoordinates(7, Math.random());
			 gunTree.addPoint(addPoint);
			 }
			 DataPoint SearchPoint = new DataPoint(8, 50);
			 SearchPoint.setCoordinates(0, 0.2);
			 SearchPoint.setCoordinates(1, 0.1);
			 SearchPoint.setCoordinates(2, 0.6);
			 SearchPoint.setCoordinates(3, 0.9);
			 SearchPoint.setCoordinates(4, 0.2);
			 SearchPoint.setCoordinates(5, 0.7);
			 SearchPoint.setCoordinates(6, 0.3);
			 SearchPoint.setCoordinates(7, 0.5);
			
			 System.out.println("time elapsed building = " + ((System.nanoTime() -
			 startTime) * (1E-6)) + " milliseconds");
			 startTime = System.nanoTime();
//			  out.println(gunTree.getNearestNeighbor(SearchPoint).angle);
			 
//			 out.println(gunTree.getNearestNeighbor(SearchPoint).getDistance(SearchPoint));
			
			 gunTree.getNearestNeighbor(SearchPoint);
			 System.out.println("time elapsed searching= " + ((System.nanoTime() -
			 startTime) * (1E-6)));
			 
			 
//			 
//			 
//			 
			 startTime = System.nanoTime();
				
			 for(int i = 1; i < numOfPoints; i++) {
			 double[] addPoint = new double[8];
			 addPoint[0] = Math.random();
			 addPoint[1] = Math.random();
			 addPoint[2] = Math.random();
			 addPoint[3] = Math.random();
			 addPoint[4] = Math.random();
			 addPoint[5] = Math.random();
			 addPoint[6] = Math.random();
			 addPoint[7] = Math.random();
			 
			 double[] trash = new double[1];
			trash[0] = 0.589;
			 
			 AgsTree.addPoint(addPoint, trash);
			 }
			
			 double[] AgsSearchPoint = new double[8];
			 AgsSearchPoint[0] = 0.4;
			 AgsSearchPoint[1] = 0.5;
			 AgsSearchPoint[2] = 0.8;
			 AgsSearchPoint[3] = 0.2;
			 AgsSearchPoint[4] = 0.4;
			 AgsSearchPoint[5] = 0.2;
			 AgsSearchPoint[6] = 0.1;
			 AgsSearchPoint[7] = 0.9;
			
			
			 System.out.println("time elapsed building = " + ((System.nanoTime() -
			 startTime) * (1E-6)) + " milliseconds");
			 startTime = System.nanoTime();
//			  out.println(gunTree.getNearestNeighbor(SearchPoint).angle);
//			 
//			 out.println(gunTree.getNearestNeighbor(SearchPoint).getDistance(SearchPoint));
			
			 AgsTree.nearestNeighbor(AgsSearchPoint, 1, false);
			 System.out.println("time elapsed searching= " + ((System.nanoTime() -
			 startTime) * (1E-6)));
	}
}

Thanks and God bless you, --AW 21:22, 24 April 2011 (UTC)

Well, you're not using my tree incorrectly except that "new SqrEuclid<double[]>(8, 400000);" should probably be "new SqrEuclid<double[]>(8, null);". The second parameter is only used when size-limited trees are desired, and that has extra processing overhead. That probably doesn't make a big different though.

You are however only timing one single run of the search, and that's rather poor test methodology, particularly with how Java's JIT compiler works. The first run of any piece of code will always be slow, because Java's JIT compiler only optimizes after later runs of methods. I don't expect my code to perform well when testing with such an unrealistically small number of searches. I'd highly suggest running the code from the "get source" link here. The framework has been well-tested to test both speed and accuracy of trees in conditions similar to normal use in Robocode. It also has code to allow two modes:

  1. Run "dummy" runs that "don't count" first, to let the JIT complier finish with everything. This eliminates the effect of the JIT's delay from the test
  2. Run each test iteration in a new JVM. This ensures every iteration is equally influenced by the JIT on average.

When benchmarking in Java, one really needs to be careful to consider the influence of the JIT compiler, as it can radically sway the results. Even if it weren't for that you still need thousands of searches for an accurate speed measurement.

I'd also wonder if you've tested the accuracy of your kd-tree. It can be very easy to have some kinds of bugs that dramatically improve speed but lead to incorrect output on occasion.

--Rednaxela 00:41, 25 April 2011 (UTC)

You cannot post new threads to this discussion page because it has been protected from new threads, or you do not currently have permission to edit.

Contents

Thread titleRepliesLast modified
i don't see the KD-TREE116:08, 14 October 2013
Hashmaps vs storing data in the tree515:15, 26 July 2012

i don't see the KD-TREE

i don't see the KD-TREE

Tmservo (talk)22:54, 13 October 2013

The tree's code is in Gilgalad (these tests weren't run with the latest version anyways)

AW (talk)16:08, 14 October 2013
 

Hashmaps vs storing data in the tree

I was working on my kd-tree and one of the changes I made was to store the data in the tree rather than in a hashmap, thinking that this would save time. However, in my benchmarks, it is much faster to use a hashmap. Does anyone know why this would be the case?

AW19:17, 25 July 2012

It's very strange question because hashmap and kd tree is absolutly different structures with different aims and contract and for sure map faster because it's nature. May be you publish source code with usage of map and tree?

Jdev19:38, 25 July 2012
 

O, looks like i misunderstand you. Do you mean why HashMap is faster than TreeMap? I'm not sure but i think, that hash map has efficiency O(1) but tree map O(log N) because tree map are sorted and based on red-black tree. I do not describe how they works because my english skill...

Jdev20:39, 25 July 2012
 

What I mean is that it seems slower to store the data with the point in a KDTree than it does to store the point in the tree and then use the point as the key for the hashmap. So for example you would have:

PointEntry entry = new PointEntry(pointCoordinates, DataObject); tree.add(entry);

instead of

hashmap.add(pointCoordinates, DataObject); tree.add(pointCoordinates);

Maybe I have some typecasting in the tree that is slowing it down?

AW23:17, 25 July 2012
 

Ok, i understand yours problem. But now i have not advices:) can you publish kdtree? Is difference only in type of data stored in kdtree? What type has pointCoordinates?

Jdev04:56, 26 July 2012
 

IIRC, my experience on this matter is that it's faster to use HashMap than to store PointEntry objects that have both pointCoordinates and DataObject, but it's faster than either to in your leaf nodes store two arrays, one for the "pointCoordinates" values and one for the "dataObject" values.

I'm pretty sure the reason for this is that with both using a HashMap for data objects, and using two separate arrays for coordinates and data, you avoid some pointer dereferencing. Basically, it's bad for performance if when comparing the location of a point, you have to dereference a PointEntry object and the coordinates themselves, rather than just the coordinates.

One of the important things to remember when dealing with Java, is that unlike some other languages (i.e. C++) every object is another pointer you have to dereference, which makes an array of primitives faster than an array of objects, and also makes two arrays of objects faster than one array of objects that store two objects each.

Rednaxela14:48, 26 July 2012