Talk:K-NN algorithm benchmark

(Continue from Talk:Kd-Tree#Bucket_PR_k-d_tree)

Thank you, I have changed my local implementation to fixed your. Just found some bug in the benchmark, all trees except Voidious' and Rednaxela's does full Euclidian distance. Now it use squared distance, but the time didn't effect much. And just found some bug in my tree that make it run a little faster, or not? Anyway, my benchmarks is now running... with large input set(15,400000,150) and 50 tests at once. Will post new result and upload new version soon. It is now fully automated and get the input size/dimensions/neighbours from argv. » Nat | Talk » 15:03, 17 August 2009 (UTC)

NAT'S K-NEAREST NEIGHBOURS ALGORITHMS BENCHMARK
------------------------------------------------
Running 50 test(s) for k-nearest neighbours search:
:: 15 dimension(s); 400000 data point(s); 150 neighbour(s)


Running tests... COMPLETED.

RESULT << k-nearest neighbours search with flat/linear searching >>
: Averaged used time              = 478.0353 miliseconds
: Average adding time             = 1.359 microseconds
: Average last node adding time   = 2.406 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 776.3058 miliseconds
: Best case used time             = 423.6457 miliseconds

RESULT << k-nearest neighbours search with Rednaxela's k-d tree >>
: Averaged used time              = 318.4575 miliseconds
: Average adding time             = 14.808 microseconds
: Average last node adding time   = 13.042 microseconds
: Averaged  accuracy              = 83%
: Worst case used time            = 453.7179 miliseconds
: Best case used time             = 157.7133 miliseconds

RESULT << k-nearest neighbours search with Simonton's Bucket PR k-d tree >>
: Averaged used time              = 78.3186 miliseconds
: Average adding time             = 3.446 microseconds
: Average last node adding time   = 4.113 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 247.4606 miliseconds
: Best case used time             = 41.4053 miliseconds

RESULT << k-nearest neighbours search with Nat's Bucket PR k-d tree >>
: Averaged used time              = 193.9578 miliseconds
: Average adding time             = 3.556 microseconds
: Average last node adding time   = 3.515 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 805.729 miliseconds
: Best case used time             = 41.7864 miliseconds

RESULT << k-nearest neighbours search with Voidious' Bucket PR k-d tree >>
: Averaged used time              = 83.3344 miliseconds
: Average adding time             = 4.1 microseconds
: Average last node adding time   = 3.832 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 96.4601 miliseconds
: Best case used time             = 79.8629 miliseconds


BEST RESULT: 
 - #1 Simonton's Bucket PR k-d tree [78.3186]
 - #2 Voidious' Bucket PR k-d tree [83.3344]
 - #3 Nat's Bucket PR k-d tree [193.9578]
 - #4 Rednaxela's k-d tree [318.4575]
 - #5 flat/linear searching [478.0353]

Even though Simonton's is the fastest averaged, it has quite large worst case. Voidious' the best here. And it seems that my tweak make it slower... My worst case is larger than linear worst case. This test run for 15 minutes on my machine so if you run it yourself please be patient. » Nat | Talk » 15:16, 17 August 2009 (UTC)

It might be more useful to use 6-10 dimensions and 30-50 neighbors, as I think that's more common for DC guns. With 15/150, the brute force is not even that slow, and would probably outperform the kd-trees in a normal battle (~25,000 data points). Over 400,000 data points with 6 dimensions / 30 neighbors, the kd-tree will be waaay faster than brute force. Also, I wonder what Simonton is using for bucket size, as that impacts the speed, too. It might be worth modifying the Bucket PR kd-trees to use the same bucket size for fair comparisons. Glad to see mine is looking reliable. =) --Voidious 15:32, 17 August 2009 (UTC)

One note about bucket size: The optimal bucket size would depend on implementation details somewhat, so I think that if they are made to use the same bucket size, it should be tested at a variety of different bucket sizes. Also, not that I care much for my old/crappy/inefficient/inaccurate tree, but it is also a bucket variant Nat, despite what it's title in the tests implies. --Rednaxela 15:42, 17 August 2009 (UTC)

Oh, I forget that. I use bucket size of 8 in Simonton's tree. I'll re-run it this evening. Note that 478ms is only half a second! But I wonder, I haven't checked yet, which distance does Rednaxela's tree use? I'll change my tree to binary, change to 8 buckets, check Voidious' and Rednaxela's (is Rednaxela's bucketPR or just plain K-d tree?) and apply final speed update to my tree and run the test again with suggested neighbours/dimensions. » Nat | Talk » 07:18, 18 August 2009 (UTC)

ARGH!!!

Simonton's : 8 buckets since he told that bucket size of 8-16 is the best.
Voidious's : 32
Rednaxela's : 20
Nat's : 22 due a bit of test result with difference m-ary and bucket size.

Will change to 10 for all. Result next. » Nat | Talk » 11:47, 18 August 2009 (UTC)

Having problem =( Trying to do configurable bucket size, and result in memory leak plus several exception and main NPE with Rednaxela's... I think I just revert and just change a constant...

Here is result from 10 tests:

NAT'S K-NEAREST NEIGHBOURS ALGORITHMS BENCHMARK
------------------------------------------------
Running 10 test(s) for k-nearest neighbours search:
:: 8 dimension(s); 400000 data point(s); 45 neighbour(s)


Running tests... COMPLETED.

RESULT << k-nearest neighbours search with flat/linear searching >>
: Averaged used time              = 503.8392 miliseconds
: Average adding time             = 1.379 microseconds
: Average last node adding time   = 2.409 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 776.594 miliseconds
: Best case used time             = 410.977 miliseconds

RESULT << k-nearest neighbours search with Rednaxela's k-d tree >>
: Averaged used time              = 12.7223 miliseconds
: Average adding time             = 15.628 microseconds
: Average last node adding time   = 10.357 microseconds
: Averaged  accuracy              = 47%
: Worst case used time            = 46.5419 miliseconds
: Best case used time             = 5.269 miliseconds

RESULT << k-nearest neighbours search with Simonton's Bucket PR k-d tree >>
: Averaged used time              = 3.3432 miliseconds
: Average adding time             = 3.734 microseconds
: Average last node adding time   = 3.422 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 9.7065 miliseconds
: Best case used time             = 1.2692 miliseconds

RESULT << k-nearest neighbours search with Nat's Bucket PR k-d tree >>
: Averaged used time              = 64.9766 miliseconds
: Average adding time             = 3.485 microseconds
: Average last node adding time   = 3.289 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 603.5304 miliseconds
: Best case used time             = 2.4253 miliseconds

RESULT << k-nearest neighbours search with Voidious' Bucket PR k-d tree >>
: Averaged used time              = 6.5825 miliseconds
: Average adding time             = 4.049 microseconds
: Average last node adding time   = 3.54 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 14.8861 miliseconds
: Best case used time             = 3.835 miliseconds


BEST RESULT: 
 - #1 Simonton's Bucket PR k-d tree [3.3432]
 - #2 Voidious' Bucket PR k-d tree [6.5825]
 - #3 Rednaxela's k-d tree [12.7223]
 - #4 Nat's Bucket PR k-d tree [64.9766]
 - #5 flat/linear searching [503.8392]

Benchmark running time: 67.82 seconds

Sorry, I didn't have enough patient to run it for enough test. But, just thinking, this is "K-NEAREST NEIGHBOURS ALGORITHMS BENCHMARK", not "BUCKET PR K-D TREE BENCHMARK" so I think I will use the default configuration for each tree. Any comment? » Nat | Talk » 12:21, 18 August 2009 (UTC)

I'd say that the trees should be kept in default configuration, unless a provably better configuration for that tree is found, in which case that should be used and documented. What configuration (i.e. bucket size) is best depends on the particular implementation, and chances are the default configuration is decent, but if we find a better configuration for a particular one there's no reason not to use it and let the author know. --Rednaxela 14:39, 18 August 2009 (UTC)

Since I kept a whole lot of points in both the tree and array in my KNNRunner, last night I run the test before I slept so my computer didn't run any other programs. Today, now, I run a lot of other programs/applications so... I just run another test and get OutOfMemoryError with -Xmx1536M... I wonder if in my last change with my tree create some memory leak... I couldn't identify it anyway... But overall I have finished my own worked K-d tree so I think I will just use Voidious' in my next robot. I've already Learned this algorithm anyway. » Nat | Talk » 12:42, 18 August 2009 (UTC)

Hmm... well I have a better tree almost ready. There are still some considerable improvements I still plan to planned to make, but I'm pretty sure it currently outperforms anything else that's been benchmarked here... Full benchmark results pending. Hint: TAoR :) --Rednaxela 06:48, 20 August 2009 (UTC)

Oh, just as I finish typing the benchmark finishes:

NAT'S K-NEAREST NEIGHBOURS ALGORITHMS BENCHMARK
------------------------------------------------
Running 10 test(s) for k-nearest neighbours search:
:: 8 dimension(s); 400000 data point(s); 45 neighbour(s)


Running tests... COMPLETED.

RESULT << k-nearest neighbours search with flat/linear searching >>
: Averaged used time              = 656.5265 miliseconds
: Average adding time             = 3.289 microseconds
: Average last node adding time   = 4.476 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 1068.1753 miliseconds
: Best case used time             = 460.1359 miliseconds

RESULT << k-nearest neighbours search with Rednaxela's k-d tree >>
: Averaged used time              = 24.5219 miliseconds
: Average adding time             = 14.996 microseconds
: Average last node adding time   = 10.993 microseconds
: Averaged  accuracy              = 60%
: Worst case used time            = 84.1178 miliseconds
: Best case used time             = 5.0381 miliseconds

RESULT << k-nearest neighbours search with Simonton's Bucket PR k-d tree >>
: Averaged used time              = 14.1402 miliseconds
: Average adding time             = 6.581 microseconds
: Average last node adding time   = 6.097 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 64.0482 miliseconds
: Best case used time             = 1.9734 miliseconds

RESULT << k-nearest neighbours search with Nat's Bucket PR k-d tree >>
: Averaged used time              = 36.161 miliseconds
: Average adding time             = 5.812 microseconds
: Average last node adding time   = 5.377 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 168.6711 miliseconds
: Best case used time             = 3.9539 miliseconds

RESULT << k-nearest neighbours search with Voidious' Bucket PR k-d tree >>
: Averaged used time              = 8.8872 miliseconds
: Average adding time             = 7.051 microseconds
: Average last node adding time   = 7.892 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 20.6054 miliseconds
: Best case used time             = 3.1616 miliseconds

RESULT << k-nearest neighbours search with Rednaxela's BETTER tree >>
: Averaged used time              = 8.4796 miliseconds
: Average adding time             = 5.032 microseconds
: Average last node adding time   = 6.467 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 48.2494 miliseconds
: Best case used time             = 1.6761 miliseconds


BEST RESULT: 
 - #1 Rednaxela's BETTER tree [8.4796]
 - #2 Voidious' Bucket PR k-d tree [8.8872]
 - #3 Simonton's Bucket PR k-d tree [14.1402]
 - #4 Rednaxela's k-d tree [24.5219]
 - #5 Nat's Bucket PR k-d tree [36.161]
 - #6 flat/linear searching [656.5265]

--Rednaxela 06:50, 20 August 2009 (UTC)

If you don't mind, I would like to see 100 or 50 tests result. But still 48.2494 > 20.6054. I have a strange habit to choose the algorithm by its worst case (so I prefer merge sort than quick sort =)). Anyway, its best case is better than Simonton's! Congrats. Do you mind sharing your secret? Oh! I just realised that milli- has 2 t... » Nat | Talk » 11:24, 20 August 2009 (UTC)

Yeah, the 10 test result isn't very precise I know, I'm going to run a longer one, but first I'm going to modify the test to use ThreadMXBean for timing, to ensure other things running on the system to not have a significant impact on the benchmark. And yes, that worst case time isn't too great but I have ideas of how to improve it notably hopefully. Additionally, that test above was with a bucket size of 50, and I've since found that a bucket size of 20 improves the result quite notably. I think there are still plenty of ways I can make it faster. My secret though? TAoR = Total Avoidance of Recursion :) --Rednaxela 12:14, 20 August 2009 (UTC)

Hm, spoke too soon. It seems that for this the resolution provided by ThreadMXBean is simply insufficent. In any case I'll be running more tests, and further improving my tree... :) --Rednaxela 12:28, 20 August 2009 (UTC)

Well, with a bucket size of 20, and 4000 data points (i.e. what it would be like a few rounds in a typical battle), I get the following result over 2000 rest iterations:

NAT'S K-NEAREST NEIGHBOURS ALGORITHMS BENCHMARK
------------------------------------------------
Running 2000 test(s) for k-nearest neighbours search:
:: 8 dimension(s); 4000 data point(s); 45 neighbour(s)


Running tests... COMPLETED.

RESULT << k-nearest neighbours search with flat/linear searching >>
: Averaged used time              = 2.3658 miliseconds
: Average adding time             = 3.026 microseconds
: Average last node adding time   = 4.47 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 114.4847 miliseconds
: Best case used time             = 2.0449 miliseconds

RESULT << k-nearest neighbours search with Simonton's Bucket PR k-d tree >>
: Averaged used time              = 0.4177 miliseconds
: Average adding time             = 3.636 microseconds
: Average last node adding time   = 5.186 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 7.3951 miliseconds
: Best case used time             = 0.2298 miliseconds

RESULT << k-nearest neighbours search with Voidious' Bucket PR k-d tree >>
: Averaged used time              = 0.4614 miliseconds
: Average adding time             = 4.291 microseconds
: Average last node adding time   = 5.675 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 8.5026 miliseconds
: Best case used time             = 0.266 miliseconds

RESULT << k-nearest neighbours search with Rednaxela's BETTER tree >>
: Averaged used time              = 0.2887 miliseconds
: Average adding time             = 3.344 microseconds
: Average last node adding time   = 4.769 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 4.997 miliseconds
: Best case used time             = 0.1492 miliseconds


BEST RESULT: 
 - #1 Rednaxela's BETTER tree [0.2887]
 - #2 Simonton's Bucket PR k-d tree [0.4177]
 - #3 Voidious' Bucket PR k-d tree [0.4614]
 - #4 flat/linear searching [2.3658]

I think that's pretty impressive... :) --Rednaxela 12:39, 20 August 2009 (UTC)

Pretty impressive? It should make a loads of robot run faster. I always think that my coding style in the benchmark is quite mess, especially the part I dynamically load the KNNImplementations, it seems that I was wrong. Anyway, if you want faster test, I don't think linear search is such that important. It would be nice if you can run 26,000 data points. TAoR? I never heard of that before. It does reduce overhead of recursion, doesn't it? I'll be surprise if your tree isn't complicated. Are you going to release it? The real reason I create my tree and this benchmark is that I'm seeking for the fastest tree to use in my next robot, and writing it at the same time make me understand it, the dependency of using other people code =) Are you going to release it by the way? I know you are going to use it in your new robot. I'm curious if this new, faster and more accurate tree does in RougeDC by the way. It may improved your ranking a bit. » Nat | Talk » 13:09, 20 August 2009 (UTC)

Reduce overhead of recursion? I don't use one bit of recursion at all in my new tree. As far as how complicated it is... let's just say it takes about the same number of lines of code as the Simonton tree which is the smallest in the benchmark by a fair margin. I'm going to post the code for my tree as soon as I get it tweaked a bit more and fix some ugly limitations (i.e. it currently will go into an infinite loop if it is given more than bucketSize points at the EXACT same location). Hopefully fixing it's current limitations will not degrade it's speed appreciably. I'm also very curious how RougeDC will do with it, so I do intend to make a re-release of the current RougeDC version with the tree replaced some time soon. Oh, and also, 1000 test iterations of 40,000 points:

NAT'S K-NEAREST NEIGHBOURS ALGORITHMS BENCHMARK
------------------------------------------------
Running 1000 test(s) for k-nearest neighbours search:
:: 8 dimension(s); 40000 data point(s); 45 neighbour(s)


Running tests... COMPLETED.

RESULT << k-nearest neighbours search with flat/linear searching >>
: Averaged used time              = 34.3651 miliseconds
: Average adding time             = 3.22 microseconds
: Average last node adding time   = 4.553 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 268.4178 miliseconds
: Best case used time             = 28.4429 miliseconds

RESULT << k-nearest neighbours search with Simonton's Bucket PR k-d tree >>
: Averaged used time              = 1.68 miliseconds
: Average adding time             = 4.458 microseconds
: Average last node adding time   = 6.153 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 24.9293 miliseconds
: Best case used time             = 0.426 miliseconds

RESULT << k-nearest neighbours search with Voidious' Bucket PR k-d tree >>
: Averaged used time              = 2.7167 miliseconds
: Average adding time             = 5.449 microseconds
: Average last node adding time   = 6.449 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 25.1165 miliseconds
: Best case used time             = 0.7738 miliseconds

RESULT << k-nearest neighbours search with Rednaxela's BETTER tree >>
: Averaged used time              = 1.1538 miliseconds
: Average adding time             = 4.07 microseconds
: Average last node adding time   = 5.697 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 15.4912 miliseconds
: Best case used time             = 0.3635 miliseconds


BEST RESULT: 
 - #1 Rednaxela's BETTER tree [1.1538]
 - #2 Simonton's Bucket PR k-d tree [1.68]
 - #3 Voidious' Bucket PR k-d tree [2.7167]
 - #4 flat/linear searching [34.3651]

--Rednaxela 13:19, 20 August 2009 (UTC)

I mean, all overheads from recursion were removed. » Nat | Talk » 14:05, 20 August 2009 (UTC)

Nice work, Rednaxela. I'm curious about a few things:

What bucket size are you using? Personally, I still believe optimal bucket size is more dependent on the data set than the tree implementation, so it'd be cool to see comparisons of equal sizes...
Does your tree outperform in every scenario, or are you just posting the most impressive results? =) Just kidding, but I am curious how much faster your tree is, in general.
One small point of note is that my tree is always supporting weights on the dimensions, too, which should slow it down (I'm not sure by how much), but is a necessary feature for me.
TAoR = "The Art of Rednaxela"? =)

I'm very surprised at how poorly the brute force search does in these tests, as my own benchmark shows something very different. I was very careful to optimize my brute force algorithm so that I wasn't getting inflated results for how much faster the kd-tree is. Here are the timing results (in nanoseconds) with the above settings for 10,000 searches on my system. (I don't time the adding, but that would be worse in a kd-tree, too.)

Time using bucket kd-tree:  16785897000
Time using regular kd-tree: 25061482000
Time using brute force:     19304366000

The brute force performs as well or better than the kd-tree through at least 25,000 scans, which is about the length of a normal 35-round battle. So with 45 neighbors/8 dimensions, you're slightly better off just using brute force than my tree. (Though I'd still use the tree because it's fast enough and lets you run 500-round battles.)

--Voidious 13:41, 20 August 2009 (UTC)

Like Nat noted below, using a bucket size of 20 since it seemed like a decent number. I plan to test other sizes, with all the trees perhaps. Currently the computer I'm away from is running 1000 test iterations with 400k points for sake of comparison so I'll be posting that later today. Personally I'm really interested in getting some real data from DC guns/surfing in order to give more useful benchmarks than these random distributions. As far as dynamic weights on dimensions, I will probably add that at some point at least to test with. My old tree had this feature but I never actually got around to using it. Haha, amusing interpretation of the acronym. --Rednaxela 14:50, 20 August 2009 (UTC)

I believed that Rednaxela use my File:KNN.jar so I should answer here. The result is an (straight) average from all tests. The best and worst case time are showed, as well. I think he said he use bucket size of 20, somewhere above.

Rednaxela, also please make sure that you do squared distance without any sqrt(). If you can, try to use the protected method getDistance(double[],double[]) in KNNImplementation. Simonton's use KNNImplemenation's, while Voidious' use tree default. But upon reading the code he use the square distance too.

My linear search is not very optimized, but basically do O(n log n). Just calculate all distances and merge sort.

package net.robothai.nat.knn.implementations;

import java.util.ArrayList;
import java.util.Collections;

import net.robothai.nat.knn.util.Comparator;
import net.robothai.nat.knn.util.KNNPoint;

public class FlatKNNSearch extends KNNImplementation {
	private ArrayList<Pair> points;

	public FlatKNNSearch(int dimension) {
		super(dimension);
		points = new ArrayList<Pair>();
	}

	@Override
	public void addPoint(double[] location, String value) {
		points.add(new Pair(value, location));
	}

	@Override
	public KNNPoint[] getNearestNeighbors(double[] location, int size) {
		ArrayList<Comparator<String>> lists = new ArrayList<Comparator<String>>(points.size());
		
		for (Pair p : points) {
			lists.add(new Comparator<String>(p.value, getDistance(location, p.location)));
		}
		
		Collections.sort(lists);
		
		KNNPoint[] result = new KNNPoint[size];
		
		for (int i = 0; i < size; i++) {
			result[i] = new KNNPoint(lists.get(i).getData(), lists.get(i).getValue());
		}
		
		return result;
	}

	private static final class Pair {
		public String value;
		public double[] location;

		public Pair(String value, double[] location) {
			super();
			this.value = value;
			this.location = location;
		}
	}

	@Override
	public String getName() {
		return "flat/linear searching";
	}
}

Time is measured from whole getNearestNeighbors() for used time.

Btw, I like "The Art of Rednaxela" =) » Nat | Talk » 14:05, 20 August 2009 (UTC)

I do use squared euclidean distance, though not the 'KNNImplementation' distance implementation. And... wow! That linear search really makes me shiver. Firstly, resorting the whole array instead of just getting the extreme values is a huge issue. Secondly, Comparators? The function call overhead of comparators is truly ridiculously huge for something like this, they were part of why my old tree was so slow. --Rednaxela 14:50, 20 August 2009 (UTC)

Ah, I see why your linear search would be much slower - you really don't need to sort all 40,000 points to find the top 45! You need to calculate all the distanceSq's, of course, but you don't need to calculate how the remaining 39955 should be sorted - that is a lot of extra comparisons. =) If you're interested, the brute force code from my runDiagnostics2 method should be pretty simple to adapt. It starts where it defines "double[][] bruteClosestPoints", and you'd also need to take the "findAndReplaceLongestDistanceSq" and "findLongestDistanceSq" methods. --Voidious 14:36, 20 August 2009 (UTC)

Much the same way in Simonton's (and almost?) tree are doing in the searching, I believed. But that cause me with wrong implementation in my old test case. But, does it really important? I mean, does the linear time really important in the benchmark? » Nat | Talk » 14:42, 20 August 2009 (UTC)

I don't know, maybe not to you guys. But I found it very important to compare my tree to a linear search. For instance, while Rednaxela's results for 4,000 data points are impressive, I believe a good brute force search would still be faster, which seems an important point to note. --Voidious 14:49, 20 August 2009 (UTC)

I'd call the linear time one extremely important Nat. It's yet another way to calculate KNN, and this is a KNN benchmark not a kd-tree benchmark (personally, I'm interested to see how LSH would compare), and like Voidious said above, linear search can often be faster than kd-trees when the number of data points is small and benchmarking that is quite critical. I'm going to for sure replace the ugly linear search with Voidious's nice one in my tests when I get home. --Rednaxela 14:50, 20 August 2009 (UTC)

OK, I have changed it. It is about 3 times faster now on 26,000 data points. File updated, with whole benchmark timing ability too.

Also, anyone, especially you, Rednaxela, please update KNN.jar once you change the engine code (if you add more algorithms, such as LSH, please update too if you don't mind). I'm posting my Ant build file. » Nat | Talk » 16:04, 20 August 2009 (UTC)

Umm, your linear search while notably improved is still quite slow Nat. I've adapted Voidious's brute force one and here's how that compares.

Running 50 test(s) for k-nearest neighbours search:
:: 8 dimension(s); 1300 data point(s); 40 neighbour(s)


Running tests... COMPLETED.

RESULT << k-nearest neighbours search with flat/linear searching >>
: Averaged used time              = 1.256 miliseconds
: Average adding time             = 3.273 microseconds
: Average last node adding time   = 4.777 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 26.4484 miliseconds
: Best case used time             = 0.2002 miliseconds

RESULT << k-nearest neighbours search with Voidious Linear Search >>
: Averaged used time              = 0.8681 miliseconds
: Average adding time             = 3.949 microseconds
: Average last node adding time   = 5.034 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 16.5177 miliseconds
: Best case used time             = 0.0777 miliseconds


BEST RESULT: 
 - #1 Voidious Linear Search [0.8681]
 - #2 flat/linear searching [1.256]

Benchmark running time: 0.80 seconds

Running 50 test(s) for k-nearest neighbours search:
:: 8 dimension(s); 26000 data point(s); 40 neighbour(s)


Running tests... COMPLETED.

RESULT << k-nearest neighbours search with flat/linear searching >>
: Averaged used time              = 7.3546 miliseconds
: Average adding time             = 3.161 microseconds
: Average last node adding time   = 4.591 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 61.8669 miliseconds
: Best case used time             = 5.3197 miliseconds

RESULT << k-nearest neighbours search with Voidious Linear Search >>
: Averaged used time              = 4.6598 miliseconds
: Average adding time             = 3.587 microseconds
: Average last node adding time   = 5.027 microseconds
: Averaged  accuracy              = 100%
: Worst case used time            = 50.5864 miliseconds
: Best case used time             = 2.6251 miliseconds


BEST RESULT: 
 - #1 Voidious Linear Search [4.6598]
 - #2 flat/linear searching [7.3546]

Benchmark running time: 8.93 seconds

--Rednaxela 02:37, 21 August 2009 (UTC)

Here are some more results. Note, the linear one here is Voidious's well-optimized one.

20 iter, 8 dim, 26k points, 40 neighbors
	Linear	SimontonKd	VoidiousKd	RednaxelaKd
Averaged used time	3.798	6.025	4.325	2.908	ms
Average adding time	3.550	4.321	4.815	3.799	us
Average last node adding time	4.710	5.926	5.957	5.238	us
Worst case used time	17.617	52.526	13.813	13.736	ms
Best case used time	2.930	0.767	1.070	0.514	ms

200 iter, 8 dim, 26k points, 40 neighbors
	Linear	SimontonKd	VoidiousKd	RednaxelaKd
Averaged used time	3.176	1.905	2.872	1.286	ms
Average adding time	3.434	4.089	4.354	3.667	us
Average last node adding time	4.722	5.809	5.806	5.221	us
Worst case used time	26.662	61.356	30.334	21.312	ms
Best case used time	2.682	0.422	1.230	0.382	ms

200 iter, 8 dim, 2600 points, 40 neighbors
	Linear	SimontonKd	VoidiousKd	RednaxelaKd
Averaged used time	0.290	0.651	0.689	0.449	ms
Average adding time	3.675	3.961	4.525	3.611	us
Average last node adding time	7.960	5.189	5.347	4.890	us
Worst case used time	9.474	19.316	32.896	10.556	ms
Best case used time	0.124	0.200	0.172	0.174	ms

2000 iter, 8 dim, 2600 points, 40 neighbors
	Linear	SimontonKd	VoidiousKd	RednaxelaKd
Averaged used time	0.152	0.303	0.255	0.237	ms
Average adding time	3.308	3.534	3.965	3.319	us
Average last node adding time	4.678	5.014	5.532	4.813	us
Worst case used time	8.801	23.066	21.881	42.481	ms
Best case used time	0.119	0.166	0.129	0.121	ms

So, this confirms Voidious's findings that Linear search can be faster than most Kd-trees at 26k points... however... shows that as the JIT kicks in when the number of test iterations increased, it's advantage at 26k points disappears. It also shows that my new tree is just so great that it's still better than the linear search at 26k points for a small number of iterations! When the number of points decreases to 2.6k however, no tree can compete with the linear search under any conditions. I find it rather interesting that the JIT kicking in has such an impact on the 26k point results :) --Rednaxela 04:55, 21 August 2009 (UTC)

I'd really like to get a TAoR tree in DrussGT to see if it fixes the skipping turns problem, are you planning to release yours Rednaxela, or should I get to work on writing my own? =) --Skilgannon 11:30, 21 August 2009 (UTC)

Yep, I do plan to release it. I just need to fix some major issues first (it breaks with too many points at the exact same location), and then test with data distributions that are more realistic. It'll probably be dual-liscenced under CC by attribution and RWPCL. =) --Rednaxela 12:45, 21 August 2009 (UTC)

Why CC-BY? I think RWPCL already covered what CC-BY do. » Nat | Talk » 14:05, 21 August 2009 (UTC)

Because I trust CC-BY to be more thorough while being quite liberal, while also dual-licensing with RWPCL to allow use in RWPCL bots. I'm thinking now though that it would be simpler to use the "Simplified BSD Licence" because it's a well tested and liberal licence which should be inherently compatible with RWPCL eliminating the need for dual-licensing. --Rednaxela 14:58, 21 August 2009 (UTC)

Very nice! Any hints about what kind of special sauce makes your tree so great? =) It's funny, I never really set out to build the biggest/baddest kd-tree, I just wanted a functional one that I wrote myself. I actually assumed all this time that Simonton's was probably better. But now all these comparisons are getting me riled up and wanting to optimize mine. =) --Voidious 12:31, 21 August 2009 (UTC)

Well, mine is similar to yours in how it stores the bucket in an array of fixed size, and it also keeps the data about the point seperate in a HashMap. My algorithms are implemented quite differently though, the biggest structural difference probably being that I avoid recursion when inserting points and just use a 'cursor' in a while loop to descend the tree. For the nearest neighbour search I also use a cursor in a while loop, but also with two stacks, one to track the list of parents of the cursor (I can't just store a reference to the parent in each node, because that creates a circular reference that Java may not garbage collect), and another which stores an enumeration of "NONE", "LEFTVISITED", "RIGHTVISITED", and "ALLVISITED" which marks the status of the parent nodes in the stack. Perhaps the biggest difference of all though in some ways, is that it uses fixed splits. It assumes the input data ranges from 0.0 to 1.0 and always splits the area straight in half instead of adapting to the input. For even distributions of data like the benchmark that would work well of course, however I don't know if it would hurt or help for distributions of data you see in DC guns/movement, it could go either way, which is why it is a priority to me to get realistic data to test with soon. --Rednaxela 12:45, 21 August 2009 (UTC)

Ah, cool. The even splitting certainly skews the results in your favor in this test, but it may not be a bad choice in real usage, either. The traditional "split at the median" kd-tree algorithm is optimized for having all your data up front, which we obviously do not. In fact, the best splits would probably be hand-coded on a per-attribute basis after examining data sets collected from whole Robocode matches. Hmm... --Voidious 13:47, 21 August 2009 (UTC)

Yep, I'm really curious to see how it will compare in real usage. I believe it's quite possible that the median split is suboptimal with the way that data is added sequentially in robocode usage, particularly if using tickwaves, since it's highly likely that the first 20 data points will be clustered in a small area instead of being representative of the overall distribution. I believe it may be sufficiently suboptimal that simple cutting in half may be better. One thing I'm considering though as a compromise, would be using adaptive selection of which dimension to split, but always split in half, that might give quite good results. --Rednaxela 14:58, 21 August 2009 (UTC)

Oh, and by the way Voidious, do you mind if I add some hooks to Diamond to output DC data in CSV format in order to get test data for this? :) --Rednaxela 17:26, 21 August 2009 (UTC)

[View source↑]
[History↑]

You cannot post new threads to this discussion page because it has been protected from new threads, or you do not currently have permission to edit.

Cache effects on benchmark

I was thinking that running *just* the benchmark vs a single tree at a time would result in the KD-Tree code and data being cached quite a bit better than would be realistic for real-life situations. Perhaps it would make more sense to run all of the trees at the same time, giving each tree the new search/data one tree after the next. This would simulate the cache thrashing between turns that happens when you are running multiple robots at the same time, because the trees would be competing with each other for cache space.

Thoughts?

The reason I ask is that I designed/wrote a tree to deal with the cache problem. It outperforms Rednaxela Gen2 on large (2mil points, 12 dim) random datasets by ~2X, but ties on smaller (30k point, 12 dim) datasets. I think the fact that for the small dataset the entire thing is in cache might be causing the difference.

Skilgannon (talk)‎

Maybe using a reference bot would make benchmarking more meaninful?

Pick one bot, put each tree inside it, one at a time, and run it against a 1v1 test bed.

MN (talk)‎

For the best comparison, absolutely. However, it might be difficult to set up the battles so that every one is the same, particularly if the trees are non-deterministic due to things like points being equal. Also, it adds a lot of overhead which would make testing very slow.

Skilgannon (talk)‎

It would make testing slow, but the overhead is what will make benchmarks meaningful. Remove the overhead and you also remove cache thrashing.

Running a test bed much like we run challenges would do. Instead of making every battle the same, run multiple random battles and measure average run time.

MN (talk)‎

What I worry is that many different trees would have to be tested in exactly the same way. Unlike scores, times are different on different computers. Perhaps if we put them all in parallel in the same robot, then see how much time they take?

Skilgannon (talk)‎

Running them all at the same time could make sense, but I would suggest being careful if you do that, because the order that they get run in may matter. Even when running them one at a time in sequence, I've recall noticing that the order in which they are run could very slightly impact the apparent performance, I suspect due to caching, JIT, and/or garbage collection characteristics. It's been a while, but IIRC the System.gc() call I have in there between running different trees was to lessen that effect. It may make sense to add some form of randomization to sequence they're run in.

Cache performance is one of those things that's tricky with Robocode, because your robot is also sharing the CPU with another bot which could be doing who knows that with it's memory accesses. For that reason I wouldn't trust optimizations for better caching behavior to necessarily pan out in practice with bots. I may be wrong about that though.

One could put it in a reference bot yeah, though the tests would be far slower and less consistent, thus requiring a much greater number of test iterations to have a reliable result.

Oh, and in any case, nice job with making a tree that much faster with the large datasets :)

Rednaxela (talk)‎

I wrote a quick benchmark and threw it in my main method. Some tweaks and improvements (putting the distance functions in methods, using a binary search for the results array) have given me big improvements at lower tree sizes:

Config:
No JIT Warmup
Tested on random data.
Training and testing points shared across iterations.
Searches interleaved.
Num points:     20000
Num searches:   200
Dimensions:     12
Num Neighbours: 40

Accuracy: 100%
Iteration:      1/3
This tree add avg:  3092 ns
Reds tree add avg:  2216 ns
This tree knn avg:  809965 ns
Reds tree knn avg:  1380366 ns
This tree knn max:  10844097 ns
Reds tree knn max:  11005183 ns

Accuracy: 100%
Iteration:      2/3
This tree add avg:  1259 ns
Reds tree add avg:  846 ns
This tree knn avg:  643037 ns
Reds tree knn avg:  1119268 ns
This tree knn max:  979013 ns
Reds tree knn max:  1787566 ns

Accuracy: 100%
Iteration:      3/3
This tree add avg:  1146 ns
Reds tree add avg:  800 ns
This tree knn avg:  641163 ns
Reds tree knn avg:  1099657 ns
This tree knn max:  1318587 ns
Reds tree knn max:  1782212 ns

Note, I hacked the RedGen2 tree to not check for NaN in the distance functions so that they are on equal footing.

Skilgannon (talk)‎

Ahh nice. Out of curiosity, any reason you're comparing against my 2nd gen tree? My 3rd gen tree was a bit faster at least in the tests I did.

Rednaxela (talk)‎

I don't have it installed yet, and I thought I'll do a proper bench when I add my tree to the whole bench framework. ~~Do you still have the code that does those nice charts tracking the trees through time? Is that what's in the KNN.jar?~~ Yes it is! Great.

Skilgannon (talk)‎

Ah, looks like KNN.jar that's uploaded on the wiki isn't quite 100% up-to-date. A couple days after the KNN.jar uploaded to the wiki was last updated I made a few minor changes. See here for the latest source [1]

Rednaxela (talk)‎

Show 3 replies

Talk:K-NN algorithm benchmark

Contents

Cache effects on benchmark

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Tools