User talk:Rednaxela/kD-Tree
So this mean I can use it int close-sourced robot? » Nat | Talk » 13:24, 25 August 2009 (UTC)
- Yep, so long as you don't misrepresent it's origin (i.e. claim you wrote the kd-tree in use) :) --Rednaxela 13:29, 25 August 2009 (UTC)
- You used to say that it will be licensed under RWPCL and CC-BY, but I'm more happy with this though, for that I can release the code under more tricky way than usual in-jar method =) » Nat | Talk » 14:25, 25 August 2009 (UTC)
- I decided I wanted to avoid as much complication as possible, and I found the zlib licence. Since I don't mind how people use it so long as they don't make false claims to have written it, and it doesn't conflict with the RWPCL (i.e. can be used in RWPCL bots), it seemed like a good match. I removed one clause that I deemed unnecessary. It seems reasonable to me :) --Rednaxela 14:46, 25 August 2009 (UTC)
I just tried this tree, and I'm getting problems where every now and again an Entry will have a null value in it. I've got workaround code for it, but I know it shouldn't be doing that, and I'm not sure why it is. Also, DrussGT is still skipping a LOT of turns with this tree, so I'll have to find a dimension or 3 to cull... I could even run 2 trees with 8 dimensions in less time than my 1 tree with 11. Oh yes, I changed the 2 distance functions to be Manhattan distance, but that shouldn't affect things too much. --Skilgannon 18:21, 26 August 2009 (UTC)
- Huh... that seems odd. I don't see why null entries could ever happen, I'll look into it when I get home. As far as DrussGT skipping a lot of turns still, I assume you don't mean any turns than normal do you? And yeah, even a few dimensions will make a huge difference with any kd-tree. --Rednaxela 18:33, 26 August 2009 (UTC)
- Yeah, I know it passed all the benchmarks etc. so I'm not sure what's going on. Maybe look in DrussGT 1.3.10wilo for the version I'm using if you want to test it. Lines 614 to 618 of DrussGunDC.java are where my workaround is (basically just removing all Entries with null values from the cluster). If I take that out it starts throwing errors every few rounds. (Note: just to prevent any confusion, due to me adding the first workaround - just skipping null values in the loop - it now throws the errors down on line 663 when it sorts the Indices instead of in the loop when trying to access the data inside the null value). Thanks for any light you can shed on this =) --Skilgannon 18:55, 26 August 2009 (UTC)
- Aha! I haven't tested it yet, but this should fix the nulls. :) --Rednaxela 19:23, 26 August 2009 (UTC)
- Oh, and also, this bug may do more than cause nulls: It may cause duplicate entries in some cases I suspect, so it may affect the results of jk.mega.DrussGT 1.3.10wilo. I don't think it wouldn't affect the results of RougeDC willow though because I'm pretty sure that bug was introduced after. --Rednaxela 19:29, 26 August 2009 (UTC)
I believe I have found the following problem, Arrays.copyOf is only supported since Java 6.0, so your code can't be compiled/run under earlier versions of Java. --Positive 22:49, 27 August 2009 (UTC)
- Haha, Fnl also noticed that earlier today. Fixed and tested to have no/negligable affect on performance :) --Rednaxela 00:10, 28 August 2009 (UTC)
Hey Rednaxela, good job! One thing that you might improve the speed on (if I understand your tree correctly) is the sqrPointRegionDist function. You don't need to recalculate all distances between bounds and the point, only between the *changed* bound since last check. If that makes sense. :) Also, I think it would be nice to have another version of the tree which doesn't apply weighing while searching. In any case, it's a very nice piece of code, and I'm going to try to use it in Portia (thanks to your license!). :) --Positive 21:45, 27 August 2009 (UTC)
- Hmm... you're right, that is an optimization that could be done, that is, if the overhead of caching it doesn't outweigh the benefit. I think in order to implement that I'd need to either store temporary values in the nodes, or maintain a third stack that remembers the distance of it's bounds. I'll give it a try and keep that change if it's worth it, haha. As far as a version that doesn't apply weighting, I've considered that however decided it was simpler to maintain a single version. If you want to remove the weighting capabilities, all it takes is 1) Removing the 'weights' variables where they are declared, 2) Removing the reference to them in the distance functions, and 3) Remove the couple statements that set it. Though, I suppose that if changes to the tree cease to happen for a long time I'll post the version with weighting removed as well, I just don't want to be worrying about them getting out of sync while it's in (semi-)active development. Anyways, I'm glad it's liked :) --Rednaxela 00:10, 28 August 2009 (UTC)
- Okay, that makes sense. I have some other feature requests: A size() function to keep track of how many entries there are in the tree, and it'd be cool if there was something to make sure the size stays below a certain point. A possible problem might be the line with tree.weights = weights;, because it doesn't reset the weights afterwards. Also, I'd selfishly like the addPoint functions and such to be nonstatic and without the KdTree<T> tree parameter (but I suppose that is a matter of taste). :P --Positive 01:03, 28 August 2009 (UTC)
- Alright, I gave that optimization of calculating the node distance as-needed from the splits a try. The result is that: 1) If replace the existing calculating of the distance to the tight boundary, the performance is worse, and 2) If I just use it to shortcut past the full distance to tight boundary calculation, the performance is about the same. This indicates to me that 1) Calculating the exact tight-boundary disance for nodes eliminates a rather significant number of node visits, and 2) The overhead of tracking the data for split-based node distance is large enough that it's not worth it to use it as a shortcut. --Rednaxela 06:23, 1 September 2009 (UTC)
- Keeping the weights across calls I considered desired, but it was a bad API for it. It's now split into a setWeights() call. Haha, I don't know what I was thinking when I made the addPoint and nearestNeighbours functions static, I was in a strange mindset when I first wrote this beast. They're now non-static and tested to have no impact on performance. I'm not sure what you mean by "make sure size stays below a certain point". Do you mean removing old points? --Rednaxela 03:39, 28 August 2009 (UTC)
- Great. :) Yes, that's exactly what I meant. It seems usefull and safe to have some kind of deletePoint function and perhaps a built-in linked list system to remove old entries, so that the data stays up to date & won't eventually fill up the memory. --Positive 03:48, 28 August 2009 (UTC)
- Well... support for an optional limit on size is now supported. Deleting arbitrary points however, is not supported, because that would conflict with the size limit since it would be far too painfully slow to remove that point from the linked list that tracks the the order that points were added in. Also, to make sure this doesn't impact speed normally, that list is never even created if no limit was specified. Anyways, the size of the code is getting closer to 500 lines than I'd like it to, so I think this is enough non-optimization features for any sane usage of it. --Rednaxela 04:45, 28 August 2009 (UTC)
I don't know if this is standard or not, but it seems strange that the constructors isn't the very first method in the class. And the method/sub-class ordering is more like... DrussGT's =) They are now ordering by the time you added the code and where, aren't they? I know it isn't going to improve the execution speed, but... » Nat | Talk » 14:40, 28 August 2009 (UTC)
- Nah, not ordered by the time added at all. At one point I had the functions for 'bounds' grouped beside the variables, but they kind of drifted apart. There wasn't any real reason for the ordering except that nearby things often had some relevance to eachother. Reformatted that a bit now as to be saner and maintain that relevance-ordering :) --Rednaxela 14:56, 28 August 2009 (UTC)
I've been testing with the pre-limit version of your tree in Portia, and it seems to be working. :) Now with your new version, I do get an error at line double[] location = this.locationStack.pop();: (The method pop() is undefined for the type LinkedList<double[]>). --Positive 15:53, 28 August 2009 (UTC)
- Ugh... Any reason why you're on ancient Java 5? Anyway, that method in LinkedList only exists in 6 not 5. Making it Java 5 compatible again... --Rednaxela 16:11, 28 August 2009 (UTC)
- Haha, actually I'm on Java 6. I compile everything using the Java 5 library though, so I know for sure I don't make code that's not Java 5 compatible. :) --Positive 16:14, 28 August 2009 (UTC)
I didn't exactly had time to look at this, but just out of curiosity, does this tree return the scans sorted by distance? By the way, congratulations on your optimization work.--Navajo 00:43, 1 September 2009 (UTC)
- It returns the entries in max-heap array ordering, since that is the most efficent way to construct the list of nearest neighbours. Maintaining a list of "lowest n values" doesn't actually require the list to be fully ordered, it only requires enough ordering that it's easy to throw out the largest value when over the size limit, making a max-heap the perfect structure for the task. It would be fairly simple and fast to convert this to squential order for final output, but I saw no reason to bother with it since I've never seen a DC gun/movement care about the order of the outputs, only the distances associated with them. --Rednaxela 01:25, 1 September 2009 (UTC)
- My DC Gun does care about the order of the output. It computes pif angles discarding the ones that end outside the battlefield, so I need to know that I end up with the closest entries possible. --ABC 10:15, 1 September 2009 (UTC)
- Err... I'm not sure I understand... Any sane DC-PIF gun would be discarding result that end outside the battlefield like you say, but it's not as if the order of output affects which ones end up inside/outside the battlefield. The outputted values are ensured the closest entries possible ones, just not in sequential order. I don't see how any gun could sanely care about the output order, but if so, which order would you prefer, ascending or descending? --Rednaxela 12:39, 1 September 2009 (UTC)
- I'm speculating, but maybe he builds a cluster of size M and then tries to keep the best N<M that stay inside the field. But I think that a sorted cluster should be optional as it may impact the performance for most of the tree users with no gain. --zyx 13:24, 1 September 2009 (UTC)
- Yes, that's exactly what I do. If it's too much trouble I can always sort them myself. --ABC 14:09, 1 September 2009 (UTC)
- Nah, it's not too much trouble, and as I explain to Nat below, it's quicker for me to get sorted output from the heap than it is for you to sort the data after. I already have this implemented on my computer at home, all I have to do is test how it impacts performance, and if it's measurable, make it optional. --Rednaxela 14:52, 1 September 2009 (UTC)
- Hey ABC, unlike what you said... Simonton's tree may not always output in descending order always! Reading the code for Simonton's code indicates that it uses a PriorityQueue, and the order it outputs is the same as the iterator of PriorityQueue. If you read the java docs, it very clearly states "
The Iterator provided in method iterator() is not guaranteed to traverse the elements of the PriorityQueue in any particular order. If you need ordered traversal, consider using Arrays.sort(pq.toArray()).
", therefore if Simonton's tree is currently returning in descending order, then it just happens that your JVM does that. This means that the behaviour of versions of Shadow using Simonton's tree are VM-dependant. But after I make the sorting supported in my tree when I get home, Shadow using my new tree will be able to be sure it's performance is not dependant on the JVM in question. --Rednaxela 15:21, 1 September 2009 (UTC)
- Hey ABC, unlike what you said... Simonton's tree may not always output in descending order always! Reading the code for Simonton's code indicates that it uses a PriorityQueue, and the order it outputs is the same as the iterator of PriorityQueue. If you read the java docs, it very clearly states "
- Thanks for that info! You are a true Java Guru. At the time I just blindly plugged Simonton's tree instead of my brute force method, noticed the entries were "backwards" and adjusted my code accordingly... --ABC 15:53, 1 September 2009 (UTC)
- I don't understand, how the order affect the result of PIF algorithm? On the other hand, returning the sorted data would take O(n log n) time more due the heap, and that definitely effect the bot that doesn't care about the ordering. If we sort them ourselves, it would still take O(n log n). I see no point to add the ordering to the tree, unless, of course, it won't effect the speed. » Nat | Talk » 14:23, 1 September 2009 (UTC)
- I want to find the 10 scans closest to the current one that end inside the battlefield. I ask the tree for the 20 closest scans, and then calculate the path for each of them, starting from the closest one, until I get 10 valid ones. --ABC 14:39, 1 September 2009 (UTC)
- (edit conflict, crossing out redundant things)
It was like Zyx said. The numbers are made up but it's like this: Shadow gets the 50 closest entries from the kD-Tree, it then runs PIF on each entry starting with the one with the last distance, and it stops processing once it has the PIF results of 10 that didn't run into walls. Again, 50 and 10 are made up numbers for sake of example.Also Nat, you're very wrong about the speed there. Firstly, the time taken by reordering the n final results, is MUCH quicker than the nearest-n-neighbours search itself, because O(n log n) only tells you how the speed is multiplied by changing the number of values sorted, it doesn't tell you the base speed, which happens to be far quicker than the base speed of the nearest-n-neighbours algorithm. Secondly, I can make the output results sorted in notably LESS than O(n log n) time. Converting max-heap ordering to sequential ordering, only requires the second part of a heapsort, rather than a full run of a sorting algorithm. --Rednaxela 14:52, 1 September 2009 (UTC)
- (edit conflict, crossing out redundant things)
- I just checked and your made up numbers (50/10) are exactly what I currently use in Shadow's melee gun. :) --ABC 15:53, 1 September 2009 (UTC)
- Oh wow, my guessing skills are quite good it seems, haha. As far as speed of that goes Nat, I'm pretty sure that in most DC-PIF guns, the PIF takes considerably longer than the DC. --Rednaxela 16:08, 1 September 2009 (UTC)
- Merge sort, which is default for Arrays.sort() is twice as fast as the heap sort according to my book on algorithms, which has the benchmarking time printed. And the second part of heap sort is still O(n log n), you still need to do down-heapify for n times. Unless there are faster ways to do it. » Nat | Talk » 15:13, 1 September 2009 (UTC)
- True, the second part of heap sort still scales at a rate of O(n log n) but should be considerably faster than a full heapsort. I may benchmark the difference later, but I'm extremely doubtful that Arrays.sort() could be faster for data that is already in heap ordering. Plus Nat, one very important consideration that you leave out, is that in order to use Arrays.sort(), unless you're sorting a simple array of numbers, your input values need to be encapsulated in objects which implement Comparable, and comparisons of Comparable are well over 5x slower than direct numerical comparison due to function call overhead. Avoiding overhead such as that is part of why my tree is just so damn fast ;) --Rednaxela 15:31, 1 September 2009 (UTC)
I'm not sure where in this page I should post this since there are so many topics here, but anyway, while trying to improve Gibbs I found two situations in which your tree throws exceptions. First is when searching for 0 neighbors. This one I easily solved by adding a simple if(count < 1) return new ArrayList<Entry<T>>(0);
at the beginning of the nearestNeighbor method. The second problem is when you limit the tree size to something smaller than bucketSize. When the method removeOld() is called it throws a NullPointerEception when checking the condition of the following loop
do {
cursor.locationCount--;
cursor = cursor.parent;
} while (cursor.parent != null);
which is easily solved by changing it to
while (cursor != null){
cursor.locationCount--;
cursor = cursor.parent;
}
I know these problems are quite specific and easy to solve, but I just felt like I should report them. Also, if it is not much of a trouble, could you please add an option to remove the closest scan instead of the oldest? --Navajo 01:15, 13 April 2010 (UTC)
Well, the version currently posted on this page I kind of consider deprecated in favor of my new version. The new version is slightly faster, is more flexibly coded, happens to behave right for the zero-sized search, and adds an additional feature of providing an iterator for allowing incremental search (but still best to specify an explicit max search size for performance reasons). On the other hand, the new tree doesn't yet have this support for removing old/close data. I could add this to it shortly if there is demand though. Would you want to use that Navajo? :) --Rednaxela 01:59, 13 April 2010 (UTC)
I'm planning to test the impact of removing the closest instead of the oldest scan, but I can add this feature to the tree myself if it is too much of a trouble to you. --Navajo 03:12, 13 April 2010 (UTC)
Contents
Warning! Rounding Errors!
Argh! It seems rounding errors are evil! Evil evil evil! I was toying with my tree in a very lightly segmented gun where exact-same-locations will occur frequently and it did that stupid rampent/infinite branching again. The cause? It turns out to be rounding, indicated by this excerpt from some debugging messages.
... Split Dimension: 1, Split Value: 1.0, Range: 0.9999999999999999 to 1.0, Width: -1.1102230246251565E-16 Split Dimension: 1, Split Value: 1.0, Range: 0.9999999999999999 to 1.0, Width: -1.1102230246251565E-16 Split Dimension: 1, Split Value: 1.0, Range: 0.9999999999999999 to 1.0, Width: -1.1102230246251565E-16 Split Dimension: 1, Split Value: 1.0, Range: 0.9999999999999999 to 1.0, Width: -1.1102230246251565E-16 Split Dimension: 1, Split Value: 1.0, Range: 0.9999999999999999 to 1.0, Width: -1.1102230246251565E-16 Split Dimension: 1, Split Value: 1.0, Range: 0.9999999999999999 to 1.0, Width: -1.1102230246251565E-16 ...
When the only values in the dimension were 1.0, and just BARELY below it, the splitValue would be set to the average of those two values. Due to rounding errors, the average happened to be the same as the higher value... causing the tree to lump all values into the left child node and try branching again, and again, and again :(
So as a warning to those currently using the tree, this obscure sitaution could cause the tree to lock up... I'll release a fix soon --Rednaxela 00:54, 30 August 2009 (UTC)
- Simonton's fix to this was to only allow his tree a certain recursive depth (he used 500), after which it would start throwing away values. This way it was also (sort of) possible to have a tree with 'rolling averages' by setting the max recursive depth very low. However, it's more of a bandaid then a fix for the root of the actual problem. --Skilgannon 13:03, 30 August 2009 (UTC)
- For this particular issue, that would indeed be a very very poor bandaid, and anyways I have a proper fix in place now. As for having a limit on tree size, I very strongly prefer the limit on the number of entries I have implemented. While it won't put a hard limit on tree depth or nodes it will generally to keep it within reasonable bounds and will always toss out old data in the exact order it was entered in. --Rednaxela 16:07, 30 August 2009 (UTC)
Well, this sucks, I thought I could fix it simply by making a > into a >= because I thought it always would round one way.... turns out it doesn't... it rounds whatever way it's in the mood for it seems.. bah. Oh and by the way ABC, curse Shadow for hitting such almost-1.0 velocitities as to cause rounding issues :P --Rednaxela 02:22, 30 August 2009 (UTC)
I'm sure there are lots of ways to deal with this, but just a comment on how my insert method handles a similar situation. Values less than the split go to the left, values greater go to the right, and values equal to the split go to alternating sides. One other implication is that when removing a node, you have to check both sides when the value is equal to the split value, but that's easy enough. --Voidious 02:33, 30 August 2009 (UTC)
- I guess if you had 19 0.9999's and one 1.0 and it used 1.0 as the split, you'd still have this problem unless the first alternating side also alternated. So I suppose a different special case would probably work a lot better. Another thing that sucks about this, at least if you use the tree the way I do... Rounding every number to, say, 10 digits seems pretty reasonable, but I wouldn't want to do that either, because I count on the fact that the neighbors returned from the tree are the same exact objects I inserted. :-/ --Voidious 02:43, 30 August 2009 (UTC)
(edit conflict) I saw that alternating approach, but I really don't like that approach for several reasons. It increases the balance of the tree very slightly, but it comes at the cost of making each node wider than they need to be. Not only deletes have to search more nodes Voidious, but seraches as well. For example, presume a dimension only has values of 0.0, 0.5, and 1.0 and the first split in the tree is at 0.5. Then the full search has to be repeated completely on both sides, and thus could have a huge impact if such a split happens to be the first split in the whole tree. Anyways, I now have a fix, the solution is: If split value is equal to the maximum node value (will only occur if they're so close that no Double value between could exist), then it sets the split value to the minimum node value instead which guarantees some values will be put in each node. Extensive testing is showing this is working quite nicely :) --Rednaxela 02:48, 30 August 2009 (UTC)
- Ah, cool. I agree it can unbalance the tree, but I don't think it makes searches any slower (otherwise). Mine definitely doesn't account for the equal-to-split edge case in its "findLeaf" - I recall thinking it would need to account for that, then realizing it didn't.
- When coming back up the tree, you test if the other side of the tree could have a node with a distance to the search point that is lower than some threshold. Since the value could be infinitely close to the split value, you have to just assume it is the split value, right? So while you may take a wrong turn while descending the tree if you hit values equal to the split value, you'll check the other side on the way back up, which you're just as likely to have had to do anyway. In your example, you'd need to search the other side anyway - know what I mean? It's been a while since I wrote my tree, maybe I'm misunderstanding something... =) But I'm pretty sure it's just as accurate and just as fast for searches.
- Deletes need to find an actual value that's already in the tree, though, which is a different scenario. (And I don't particularly care how fast deletes are.) --Voidious 03:11, 30 August 2009 (UTC)
- Which reminds me, I actually wrote this case for another scenario, which might be a problem in your current solution, too: if all the nodes have the same value for the split dimension. --Voidious 03:19, 30 August 2009 (UTC)
- I presume you mean all entries, not all nodes. In that case yes, I already deal with that case. I always set the split dimension to the widest dimension, and whenever "widest" is still 0 width, then it will simply double the bucket size for that particular node. My logic is that the scenario you describe could only happen with "split on widest dimension" when that all entries are in exactly the same spot, in which case there could never ever be any performance gain by putting them in seperate nodes. On a related note, one possible optimization I could also do with this, would be giving such nodes a special marking and avoiding redundent distance calculations in them. I haven't bothered with that yet though since I doubt it would occur often in practice with a well-segmented gun. --Rednaxela 04:51, 30 August 2009 (UTC)
- Actually Voidious, in my tree checking the other side anyway wouldn't happen. I keep track of an extra-tight bounding box for each node and compare to that, and due to how my splitting doesn't alternate, the tight bounding box will never overlap on the split value actually. So.. you're right about some trees, but not my one :) --Rednaxela 04:51, 30 August 2009 (UTC)
- Which reminds me, I actually wrote this case for another scenario, which might be a problem in your current solution, too: if all the nodes have the same value for the split dimension. --Voidious 03:19, 30 August 2009 (UTC)
Some major changes
Man, this tree keeps taking away from time to finish my melee gun, but anyways: I've done a major update now which gives the tree a bit nicer and more flexible of an API. In particular, you now construct it with things such as new KdTree.SqrElucid<T>(dimensions, sizeLimit)
or new KdTree.WeightedManhattan<T>(dimensions, sizeLimit)
instead of new KdTree<T>(dimensions, sizeLimit)
and modifying the distance function in your copy of the code (I'm looking at you Skilgannon :)). This also makes it easy to use the unweighted version which is faster by 5-10%. One interesting note I found, is that while Manhattan distance is less complicated to calculate, it makes it harder for the tree to eliminate branches and thus is twice as slow (Looking at you again Skilgannon, considering your recent concerns with DrussGT's speed). What do people think? I'm also considering making a stripped down 'lite' version with no weighting, only sqrElucid distance, some of the more agressive and less significant optimizations removed, no size limiting, and be nearly as fast. --Rednaxela 05:52, 2 September 2009 (UTC)
Hmm, so using Manhattan approximately doubles the number of branches that must be recursed? Is this dimension-dependent, ie., would the fact that I'm using 11 dimensions make this a factor of more than 2? I'd gladly switch to using Euclidean distance to get this speed increase, but from what I've found, my gun scores significantly better using Manhattan distance. The difference between the two is about 0.5 worth of TCRM score. I'm guessing that how my dimensions are defined somehow favors the use of Manhattan over Euclidean, but I don't currently have the time to tweak my segments into a form that works well for Euclidean instead =) Another point, for bots that only use one weighting scheme throughout the match, isn't it quicker to multiply the point by the weighting before storing it, and then afterwards treat the dimensions as unweighted? This is what I've been doing in DrussGT since way back when... it also means that I can apply non-linear weightings without any performance hit at runtime. --Skilgannon 08:50, 2 September 2009 (UTC)
- I didn't run an exact count of the number of branches that must be recursed, but since the time taken is doubled, I feel pretty sure the number of branches recursed is at least doubled. It being possibly dimension-dependant might indeed be the case but I don't think 11 dimensions would make it more than my test showed, since my test was with the 13-dimension data recorded from Diamond. About weighting: yes, of course it's quicker to multiply the point by weighting it before storing it. The weighted versions of the tree are only intended for bots which do dynamic weighting, like Diamond does (melee weighting vs 1v1 weighting). Static weighting is considered to be something that bots are responsible for doing when they create their dimensions in the first place. --Rednaxela 12:42, 2 September 2009 (UTC)
I haven't tested it yet, but I would like to say, fantastic job! This definitely gives "everyone what he/she wants". I only wonder, don't abstract classes make the tree that much slower (because of the overhead in having to decide which distance() function to use for example)? I'm not a java expert, so it might be that that is optimized at runtime. --Positive 14:29, 2 September 2009 (UTC)
- Thanks! I was unsure about that myself at first, but I tried it, gave a it a good performance test, and it performed exactly the same it seemed, at least in Java 6 (I don't know for certain but old enough VMs I bet wouldn't perform equally with them). In fact, the slight tweak of having the distance functions read 'weights' from the class instead of passing it in as an argument seemed to cause this revision to have a very slight improvement in performance. --Rednaxela 14:44, 2 September 2009 (UTC)
- Great, you've really done your homework. :) I just looked over the code, and one little thingie: It seems to me you could use a regular "int" for sizelimit instead of Integer. (You can check for sizeLimit==0 in instead of sizeLimit==null). Might give a small improvement. :P --Positive 14:58, 2 September 2009 (UTC)
- Hmm, true I could do that. That wouldn't have much of a performance impact though I'd think, considering how
== 0
and== null
would be the same speed and the numerical comparison when actually checking the value only happens once per insertion call. I might make the change anyway though, considering the way I'm not really fond of the wrapper classes like Integer to begin with. --Rednaxela 15:09, 2 September 2009 (UTC)
- Hmm, true I could do that. That wouldn't have much of a performance impact though I'd think, considering how
I know I shouldn't burden you with any more feature requests, but I'd really like the following extension to your tree (because it is a lot less elegant to do outside of the class): As an addition to nearestNeighbor, I'd like to give an argument like (double)acceptedLowest[],(double)acceptedHighest[], that prevents the tree from returning any matches with a dimension value outside of the specified range. (That feature would be awesome in a new sologun I'm trying to make for Portia!) --Positive 19:45, 2 September 2009 (UTC)
- Hmm... well... that would be easy enough to add, except for one issue: The proper/efficient way to implement it would require a third distance function: Longest distance from a point to the inside of a region. I think the subclasses that implement the distance calculations are rather bloated/repetitive as it is really, without having to add that as well. So yeah, feel free to modify it for your usage, but adding that particular feature to the main version just adds too much bulk I think. --Rednaxela 00:24, 3 September 2009 (UTC)
- Well... I gave it a good try but couldn't work the bugs out of it, nor were perliminary performance indications looking promising. --Rednaxela 15:37, 5 September 2009 (UTC)
Spelling
Isn't 'Elucidian' supposed to be 'Euclidean'? » Nat | Talk » 14:46, 5 September 2009 (UTC)
- Yep, you're right :) --Rednaxela 15:37, 5 September 2009 (UTC)
help with the Java
Hi Rednaxela, I'm interested in learning your kdTree and playing with the possible applications. I was wondering if you or someOne might point me in the direction of an open source bot that uses your Tree... robocode is my only experience programing , a proper example of it implemented would greatly help and make my learning curve alot more fun. Thx --Jlm0924 20:04, 17 September 2009 (UTC)
Skilgannon uses it in DrussGT, which is open source. --Voidious 22:06, 17 September 2009 (UTC)
- I wouldn't recomend RougeDC willow as a reference for usage of this kD-Tree, as it still used an old (and subtly buggy) version. DrussGT would be a good example indeed. When I get around to releasing Glacier that will also be an example. --Rednaxela 23:02, 17 September 2009 (UTC)
- Yeah, take a look at DrussGT. I use it in two different places, both the gun and the movement. In the gun it is used to find similar situations for finding an angle to shoot at. In the movement it is used to guess the bullet power the enemy will use to fire. Once you've extracted the source look at jk/mega/dgun/DrussGunDC.java and search for 'heapTree'. It should be fairly obvious where I create the tree, where I'm adding the data to the tree and where I retrieve the data from the tree based on how close it is to the current scan. Perhaps it would even be good to write a simple gun showing how the various trees are used, as a reference. --Skilgannon 09:08, 18 September 2009 (UTC)
- I think having a simple reference implementation is a really good idea. Though if by "various trees" you mean including those besides Rednaxela's, right now I'm not sure why anyone would choose anything else. =) I'm already debating optimizing mine or just switching to his. --Voidious 13:02, 18 September 2009 (UTC)
- I'm taking a look now... a couple of reference implementations in a simple bot would be a gold mine though ! DrussGt is overWhelming to me :) I get some sleep and take a better look :) Thx guys --Jlm0924 16:00, 18 September 2009 (UTC)
sequentialSorting
Hey Rednaxela, still an awesome tree, but perhaps you could add a little note to the tree about how the sequentialSorting works... I only just found out it sorts highest distance first. :P --Positive 10:36, 24 September 2009 (UTC)
Yeah, it might be good to make a note of that somewhere. It was done that way to both 1) Be consistant with how ABC (falsely) throught Simonton's one was working, and 2) That's the order that values can naturally be extracted from the heap)
- Okay, got it, no problem. I've done a little alteration to your code to turn the order around, all fixed. :) --Positive 16:34, 24 September 2009 (UTC)
Range Search
Can I ask that you add this as well. This is part of some standard KD-Tree implementations in other languages. You can probably read over how it works in my tree, but adding support for it shouldn't cause problems. It can be used in say, older style pattern matchers for getting matches to the current enemy state within a hyper-rectangular bounds (say in this case velocity and heading delta). --Chase 02:49, 3 March 2010 (UTC)
Sure, wouldn't be hard to add in my rewrite. It might be a while till I get around to that though, since I'm really anxious to try my (possibly completely novel?) ideas for making it faster and such. --Rednaxela 03:27, 3 March 2010 (UTC)
- Actually I had tested some of those on my tree the other day while trying to optimize it, the "choosing a better dimension if the current one is only one value" actually produced decent gains assuming you chose a good dimension. In my first one I chose the one with the biggest difference, but I didn't like it, even though it worked better than my "iterate through dimensions in order till you find one that has more then one value", my guess is the more values a dimension has, the better (but that is hard to track). None of the data I posted is from those, but it got around 0.3 to the listed maps 0.4. --Chase 03:55, 3 March 2010 (UTC)
- Are you talking about the dimension to split on? That's very different than the things I have in my plans section. As far as what dimension to split on and where to split in it, I already do "middle of the dimension with the widest variance" (which just today I noticed some research papers conjecturing that it is perhaps the optimal kd-tree splitting method). --Rednaxela 04:06, 3 March 2010 (UTC)
- Oh, alright, thats what you meant when you say that. Hehe, I guess I am more than a bit out of date. --Chase 04:40, 3 March 2010 (UTC)
- Yeah, what I meant with the "dimension-pruned" things, is that the calculation of distance between the search point and each point in a bucket, can be made to not repeat the part of the summation for dimensions where the node has a width of 0, instead calculating that part of the summation once for every point in the bucket. --Rednaxela 05:12, 3 March 2010 (UTC)
- Oh, alright, thats what you meant when you say that. Hehe, I guess I am more than a bit out of date. --Chase 04:40, 3 March 2010 (UTC)
- Are you talking about the dimension to split on? That's very different than the things I have in my plans section. As far as what dimension to split on and where to split in it, I already do "middle of the dimension with the widest variance" (which just today I noticed some research papers conjecturing that it is perhaps the optimal kd-tree splitting method). --Rednaxela 04:06, 3 March 2010 (UTC)
Rewrite progress
Well, I have my rewrite largely done now. So, to those who thought it would be neat to have a search iterator... I now have it! At no significant performance penalty in fact, though it does operate best when you give it a a 'max iteration length', but if you stop before that, you still gain some time be it never descends into nodes until they have a possibility be required for next(). The reason it's still good to provide a 'max iteration length' is so that can routinely prune it's list of points it has evaluated. In order to make this possible efficiently, I coded up a "Interval Heap" double-ended queue that is very fast. Results look encouraging so far, however I didn't see the gain I hopes to see from it's more flexible path selection code. See the following results:
RESULT << k-nearest neighbours search with Rednaxela's Bucket kd-tree (Unsorted Output) >> : Average searching time = 0.059 miliseconds : Average worst searching time = 0.899 miliseconds : Average adding time = 7.15 microseconds : Accuracy = 100% RESULT << k-nearest neighbours search with Red's "Next Gen" kd-tree (Sorted Output) >> : Average searching time = 0.061 miliseconds : Average worst searching time = 0.936 miliseconds : Average adding time = 7.47 microseconds : Accuracy = 100% RESULT << k-nearest neighbours search with Rednaxela's Bucket kd-tree (Sorted Output) >> : Average searching time = 0.061 miliseconds : Average worst searching time = 1.086 miliseconds : Average adding time = 7.15 microseconds : Accuracy = 100%
It's a pretty close race despite the code structure and details of the search algorithm being rather different. Sets a new record for fastest tree with sorted output anyway, and I should be able to improve the results further... --Rednaxela 07:25, 4 March 2010 (UTC)
- I take it the output from the iterator is sorted, so I assume you may have to search additional buckets to find the next nearest without completely exhausting the current one, how can you do that without much of a performance hit, do you just make the list internally and supply the values as needed and then grab additional batches as needed? --Chase 10:15, 4 March 2010 (UTC)
- Pretty much. It's procedure is mostly like follows:
- Could any nodes in the pendingPaths heap have points closer to the search point than the closest point in the evaluatedPoints heap? If yes, do the following. Loop it so long as this condition is true.
- Pop the smallest distance node out of pendingPaths, and descend it down the "first guess" path according to what splits lead closer to the search point
- During the descent, put the branches not taken into the pendingPaths heap, with the distance between that path's bounding box and the search point computed.
- At the bottom of the descent, iterate through all points, computing their distance to the search point
- Insert the point in the evaluatedPoints heap if either 1) The size of evaluatedPoints is less than the max number of points remaining to return, or 2) the distance is smaller than the largest distance in evaluatedPoints
- After each insertion, if the size of evaluatedPoints is greater than the max number of points remaining to return, remove the largest point from evaluatedPoints
- Pop the smallest distance node out of pendingPaths, and descend it down the "first guess" path according to what splits lead closer to the search point
- Pop the smallest distance result out of evaluatedPoints and return it.
- Could any nodes in the pendingPaths heap have points closer to the search point than the closest point in the evaluatedPoints heap? If yes, do the following. Loop it so long as this condition is true.
- Interestingly, even if I pass the iterator a 'max points to return' that is essentially unlimited, as would allow iterating the whole tree, it's merely 30% slower to get the nearest 40 points, still faster any tree other than Duyn's and my own. --Rednaxela 15:06, 4 March 2010 (UTC)
- Pretty much. It's procedure is mostly like follows:
Adding "replaceMax" and "replaceMin" methods to the interval heap helped performance a bit:
RESULT << k-nearest neighbours search with Rednaxela's Bucket kd-tree (Unsorted) >> : Average searching time = 0.061 miliseconds : Average worst searching time = 1.281 miliseconds : Average adding time = 7.23 microseconds : Accuracy = 100% RESULT << k-nearest neighbours search with Rednaxela's Bucket kd-tree (Sorted) >> : Average searching time = 0.062 miliseconds : Average worst searching time = 1.372 miliseconds : Average adding time = 7.23 microseconds : Accuracy = 100% RESULT << k-nearest neighbours search with Red's "Next Gen" kd-tree (Storted) >> : Average searching time = 0.059 miliseconds : Average worst searching time = 1.119 miliseconds : Average adding time = 7.7 microseconds : Accuracy = 100% BEST RESULT: - #1 Red's "Next Gen" kd-tree (Sorted) [0.0591] - #2 Rednaxela's Bucket kd-tree (Unsorted) [0.061] - #3 Rednaxela's Bucket kd-tree (Sorted) [0.0625]
--Rednaxela 20:06, 4 March 2010 (UTC)
Interestingly, the performance improvement is more obvious on my netbook:
RESULT << k-nearest neighbours search with Rednaxela's Bucket kd-tree (unsorted) >> : Average searching time = 0.301 miliseconds : Average worst searching time = 19.547 miliseconds : Average adding time = 14.88 microseconds : Accuracy = 100% RESULT << k-nearest neighbours search with Rednaxela's Bucket kd-tree (sorted) >> : Average searching time = 0.313 miliseconds : Average worst searching time = 19.032 miliseconds : Average adding time = 15.06 microseconds : Accuracy = 100% RESULT << k-nearest neighbours search with Red's "Next Gen" kd-tree (sorted) >> : Average searching time = 0.283 miliseconds : Average worst searching time = 18.02 miliseconds : Average adding time = 15.24 microseconds : Accuracy = 100% BEST RESULT: - #1 Red's "Next Gen" kd-tree (sorted) [0.2832] - #2 Rednaxela's Bucket kd-tree (unsorted) [0.3009] - #3 Rednaxela's Bucket kd-tree (sorted) [0.3134]
Note, both this and the above tests are done with 100 iterations, which seems to give a fair bit of accuracy. Also... it looks like the worst search times increase on my netbook by a much greater factor than the average does. I wonder what the cause of that is... --Rednaxela 20:40, 4 March 2010 (UTC)
Ouch! I just tested on a school computer, and the ratio of the average-worst-search-time to the average-search-time got over 4x worse, 19.4ms:0.05ms! I have no clue why... the only theories I have is that the computers with the slow worst times are on OpenJDK instead of normal JDK6. Whether it's the fault of javac or the jvm is unknown. I doubt it's related to CPU cache size because the netbook (Intel Atom N270) has a 512kb cache when the school computer (Intel Core2 6300) has 2048kb cache yet has a larger discrepancy. --Rednaxela 01:37, 5 March 2010 (UTC)
R-tree variants?
Has anyone taken a look at the R-tree variants like the R* tree? I'm currently thinking of trying to adapt my 3rd gen tree (currently unreleased) into an R-tree variant. That means that overlap between nodes will be allowed, it will cause smaller bounding boxes and also make it easy to make into a 3-ary or 4-ary tree instead of 2-ary tree. My current 3rd gen implementation happens to be coded in a way that would make conversion fairly easy I think... And my ideas for pending improvements (implicit subtree instead of bucket, and dimension pruning), should also be equally applicable to the overlap-allowing R-tree variants. The tighter bounding boxes at the cost of overlap, and wider branching, may well be worth it... --Rednaxela 19:28, 5 March 2010 (UTC)
Well, a quick hack to use an R-tree-like insertion rule (instead of inserting points on the 'correct' side of the split, it inserts points where it'll cause the volume of the child node to increase the least. This leads to nodes with less volume, but possible overlap (violates normal kd-tree rules). The performance change was negligible in either direction. Now I just need to replace KD-tree-style node splitting, with B-tree-style self balancing mechanics, which should give a good result I hope. After that, I plan to try some of the 'forced reinsertion' voodoo of the R*-tree. --Rednaxela 18:21, 6 March 2010 (UTC)
Well, I got a R-tree working, but unfortunately it's performance was was quite poor... Unsure if the issue is a bug or if that's just how it has to be. I tried adding 'forced reinsertion' of tree nodes in R*-tree style which seemed to help some but performance, while improved, was still poor. I might try forced reinsertion of data points as well, but I have doubts it would improve results to compete with my kd-bucket-tree. I'm also thinking of changing the node-selection algorithm from the standard "whatever increases volume the least" like R-trees normally do, into something that discourages overlap of nodes more, which may help. --Rednaxela 17:16, 10 March 2010 (UTC)
Haven't had much of a chance to work on this since the last post, but I did find that the current R-tree variant I have is hitting 25%ish percent of nodes, far too many. Going to indeed first try the forced reinsertion of points firstly to see if that helps. After that, I'll try something that discourages overlap more which... oh... just happens to be what the "[wikipedia:X-tree|X-tree]" does. It's encouraging to find that types of approaches that I think about going down, have already been found to be promising by others. --Rednaxela 18:44, 29 March 2010 (UTC)
Tried the reinsertion of leaf points as well, didn't help much. I'm now suspecting I have a bug in my implementation. The code is also getting a little messy. As such, I'm going to re-do the bulk of the R-tree code without basing it so much on the kD-Tree code this time. I need to both catch bugs and make the code cleaner before I feel comfortable trying out other changes like what an "X-Tree" implements. Progress is slow due to class getting busy. --Rednaxela 02:01, 7 April 2010 (UTC)
Compressing unused dimensions
So for an experiment, that was listed as 'Dimension-pruned distance calculations', I made a variant of my tree that compresses point data in a simple way that I expected to improve performance: In leaf nodes, the the 'points' array will omit axis whose value are the same for all points. This means some memory savings, plus it means that when adding all of those points, it calculates a 'base' distance from the the values in the 'unused' dimensions, and then calculates the additional distance for each individual point only on the dimensions necessary. It seemed that this could in theory gain some performance by skipping a couple of dimensions in each distance calculation for an individual point. Unfortunately it seems that the cost of for each node making a new 'SearchPoint' array with dimensions matching the active ones in the leaf node, outweighs the distance calculation time saved. I may be able to optimize this overhead out somewhat but I'm not so optimistic about this path anymore. The possible gains will be sure to be small with normal data sets so I'm not sure it's worth trying to squeeze the drop of performance out of it. Onto other approaches for now... --Rednaxela 17:38, 10 March 2010 (UTC)
Tree differences
I switched to your tree in my KNN classifier. Execution time for 1175 battles went from 855s to 582s, pretty sweet. (HOT takes 251, to give a baseline.) I was concerned at first that it hit 19 less shots out of 1,282,681 - not to say that isn't negligible, but out of concern that the results should be identical. But with this rounded off data, my guess is that it's probably just choosing some different points when two are identical, so I'm not gonna waste time investigating it. Anyway, good stuff. =) --Voidious 23:25, 15 March 2010 (UTC)
- Glad to see it's working well for you. Yeah, I wouldn't be surprised if that difference, one note is that if I remember correctly, the tree will (currently) prefer the older data when there are a bunch of duplicates. Just out of curiosity, which version of the tree are you using? My rewrite is slightly faster past the currently posted on the page here. --Rednaxela 06:58, 18 March 2010 (UTC)
- I'm using the old one (from this page). I didn't realize the new one was available - I have some plans for that iterator. =) Thanks. --Voidious 13:59, 18 March 2010 (UTC)
- Hey btw, in the new tree, Eclipse yells at me that your two @Override's in SquareEuclideanDistanceFunction and the 4 in NearestNeighborIterator are Errors, since those methods don't override a superclass. --Voidious 15:51, 20 March 2010 (UTC)
- In Java 6, @Override applies to interfaces as well. I've noticed Eclipse has bugs with this however and doesn't support it even when using Java 6. Anything except Eclipse likes it. Might change it I suppose though... --Rednaxela 16:12, 20 March 2010 (UTC)
3rd gen tree licence
Is you 3rd gen kd-tree found in the mercurial repo licensed under the same zlip license? I found no license notice in your third gen kd-tree source code. --Nat Pavasant 13:38, 17 April 2010 (UTC)
Hmm yeah, I forgot to deal with that. Sure, that same zlib license works. --Rednaxela 14:56, 17 April 2010 (UTC)
Translating... reimplementing... etc
In June/July I'm going to be starting my thesis, a Simultaneous Localisation And Mapping algorithm, and in order to run at a half decent speed I'm probably going to need to need some form of 3D tree to project and compare different robot poses with the current scan readings of the environment. Anyway, being in an academic environment, Matlab (embedded in Labview) is the preferred language that I would implement my algorithm in. I'm not sure if you've worked with it, but it has very fast array operations despite the actual code itself being quite slow. I was wondering what changes you would make to the general design of your Kd-tree if array operations (eg. y = sin(x)
where y and x are arrays) yielded a speedup of 50x or so versus accessing each component individually in a sort of for x_i in x: y_i = sin(x_i)
situation. Any major re-factoring? I'm going to need some sort of KNN implementation along with a cutoff threshold. Any thoughts or insights would be appreciated before I commit myself too deeply along one implementation =) Thanks --Skilgannon 18:08, 15 April 2011 (UTC)
Hmm... interesting question... I've worked with Matlab some, but have not yet had reason to use it very much, so it's not a language I'm particularly fluent in yet. As far as optimizing for that criteria...
- The simplest place to start would be translating simple things that iterate over axis, into array operations. It should be trivial to do so for the distance calculation and selection of which axis to split on. Then other things like loops over the entries in buckets could be turned into array operations.
- Probably the *biggest* factor in all of this is bucket size. With array operations being tremendously superior to normal ones, it would certainly be the case that much larger bucket sizes would be more optimal for a matlab implementation. Maybe 10x bucket size? It would really take some playing around to find what works best.
- Does matlab have native implementations of heaps? Size-limited heaps are rather good for the task of incrementally creating a list of "top n values", which I take advantage of in my tree implementation in a few places. An implementation of a heap in matlab code would not be worth it however. If there is a native heap implementation use it, otherwise it'll probably be faster to stick other sorted list strategies.
- In my code I have a good number of "fail fast" checks that I've experimentally found to improve performance. In Java/C/etc, they're well worth it because often one can quickly rule out a condition before doing a slower check such as one that requires a distance calculation. In Matlab many of these "fail fast" checks will probably have more cost than a distance calculation, so wouldn't be worth it.
- Pay close attention to general matlab performance advice like here
- Perhaps consider some other approaches like various variants of the R-Tree? I don't see the bucket kd-tree used too often in literature, but yet I could never get my r-tree variant attempts to perform anywhere as near as well as the bucket kd-tree approach. Nevertheless, I think such approaches definitely warrant further investigation.
Do keep in mind that the gains you'd see from a tree compared to the naive approach will be much smaller in a matlab implementation than a C or Java implementation, since the naive approach is simple and well suited to vectorization. It could be the case that a tree won't be worth it at all in Matlab, unless the size of the dataset if particularly huge.
If the performance of an optimized tree is really needed, I believe one can usually link C/C++ code to matlab code. Could this be considered as an option? I've seen such some things in academic context which are primarily written in matlab but use C/C++ code in some performance critical spots. I seems to me that a search tree really is the perfect sort of thing to put in C/C++ code, because it's design doesn't affect the output of the algorithm, just the speed. In fact, when the tree is not the focus of a thesis, I might argue that keeping the search tree in C makes it's application more clear, because someone looking at how the tree is applied may not necessarily be concerned with the nitty-gritty of the n-nearest-neighbors search beyond knowing "and here it does a fast n-nearest-neighbors search" since it's output will be the same whether it's naive or fast. That's just my judgement though, I'm sure many might disagree with me, haha.
Does that help? :) --Rednaxela 04:20, 16 April 2011 (UTC)
That's pretty much it exactly =) I'm estimating my initial 'warm up' scan will add around 1.4 million points, so a naive approach isn't really an option =). One other thing I was thinking about would be if I had a line, rather than a point, how I would store that in a Kd-tree? Put in a whole bunch of points to make up the locus? Reason being, I have several options to store where has known walls and where is known open space. One of them was using lines which essentially trace out my lasers fired from my range finder, with one end marked as 'solid' and the rest marked as 'open space'. As such, when I test a new point cloud in a certain position to see how well it fits, if the line (rather than just the endpoint) is in the top N closest to a point in the cloud it needs to be returned as well. Thanks for all your help =) --Skilgannon 06:25, 18 April 2011 (UTC)
Well, one doesn't need to use a whole bunch of points to do lines. There are two main ways that come to mind to mind to deal with non-point objects (lines, or anything else really):
- Consider a tree that explicitly tracks the bounding boxes of nodes, and uses them instead of split value during the search (mine does this for performance enhancement reasons). Place the lines/objects in the tree based on their center point (doesn't have to be their center necessarily, but seems like as good a choice as any). Then expand the tracked bounding boxes based on the bounding box of the line/object. This means your nodes will have overlapping bounding boxes, but it'll still work really.
- or, alternatively, place an entry for the object/line in every node that "fits". If a line spans over a split, insert a copy on each side. If you are tracking bounding boxes, crop them at the split so that they won't overlap. This approach leads do much duplication and more memory use, but may have better performance during the search due to the lack of overlapping nodes. On the other hand, this approach also requires extra handling to make sure you don't get duplicates in your search results.
Actually, those two approaches are inspired by what I've seen of R-Tree variants. It's common to use them for storing non-point objects for spatial search. The R+ Tree essentially takes the duplication approach I mentioned, and other variants of the R-Tree take the approach of putting just into the one node that fits best. As a note to compare, the kd-tree is to the plain binary search tree, as the r-tree is to the b-tree. If your lines start introducing overlap in your nodes particularly often, I'd really take a good look into r-tree variants because the literature for them is done with overlap in mind (since r-trees inherently tend to have nodes with overlap anyway). --Rednaxela 12:24, 18 April 2011 (UTC)
- [View source↑]
- [History↑]
You cannot post new threads to this discussion page because it has been protected from new threads, or you do not currently have permission to edit.
Contents
Thread title | Replies | Last modified |
---|---|---|
Interval Heap vs MinHeap | 6 | 19:54, 20 July 2013 |
removeOld() not working? Or locationCount not decrementing after removeOld()? | 20 | 18:52, 1 May 2013 |
Null Pointer Exception - 3rd Gen Tree | 5 | 15:30, 26 March 2013 |
Unit tests for KdTree | 8 | 14:02, 29 May 2012 |
I'm curious, why did you go with the MinHeap instead of the IntervalHeap for your path ordering? Surely IntervalHeap can be kept pruned and small, resulting in less overhead?
I hadn't actually thought to try that. The IntervalHeap is something I added specifically when I was thinking about implementing to search iterator, rather than optimizing the search steps themselves.
I just tried it right now. Turns out that the more complex procedure to maintain the IntervalHeap's ordering adds more cost than is saved by keeping the path ordering queue pruned, at least for this size of tree in the 'standard' data set.
I tried two seperate approaches of using the IntervalHeap to keep that queue pruned:
- Remove any paths which are now known to be too distant, whenever we finish processing a leaf node (aka, always keep it pruned as much as possible)
- When we're adding a new path to the queue, if the furthest path in the queue is too distant, replace it to avoid growing the heap (aka, don't actively prune it, merely don't grow it if we can avoid it)
Both approaches performed worse overall than the unpruned MinHeap. Even if I do a run with JIT warmup allowed to compensate for the IntervalHeap's more complex code, it still performs slightly worse on average. When the JIT warmup is allowed however, the "worst case" time appears to improve though, just not the average.
How about keeping a flag in your leaf node processing so that you only try pruning if you added a new item?
Just tried that. The pruning is still not paying off compared to the extra overhead of the IntervalHeap.
And how about if you just use the IntervalHeap, but don't try any pruning? I'm curious whether the pruning is slow, or the different heap.
I was thinking it was the different heap... but actually turns out the heaps are approximately the same performance, with the difference being the pruning:
- #1 Rednaxela's kd-tree (3rd gen, Interval Heap) [0.0290] - #2 Rednaxela's kd-tree (3rd gen) [0.0290] - #3 Rednaxela's kd-tree (3rd gen, Interval Heap, Prune When Points Added) [0.0293] - #4 Rednaxela's kd-tree (3rd gen, Interval Heap, Avoid Growing Heap When Possible) [0.0294] - #5 Skilgannon's Cache-hit KDTree [0.0296]
I've added this KD-tree to my bot as I do a ground-up rewrite of it. I'm not even to the point of extracting k.n.n. data from the tree yet, I was just in the process of testing dropping data into the tree, and noticed something odd. The tree isn't pruning when locationCount exceeds sizeLimit. Or if it is, then it's not being reflected in the value of locationCount.
I've set sizeLimit in the tree instantiation to a mere 35 as a test, and by the end of a 35 round match against Walls, the tree reports containing nearly 2000 nodes. After reading through the code, I don't see a lower bound on pruning the tree other than bucketSize, which is only 24 anyway. I've added a debug print message, and removeOld() is definitely being called, but is returning before the "weird" comment at the end of that method....
I'm sure I'm missing something... but what? Should I presume this is working as intended, despite the reported size() growing without bound?
Okay. A bit of testing with grabbing knn's via nearestNeighbor() shows that even if I request more knn points than. sizeLimit, the function only returns <= sizeLimit points. Excellent! So, is it problematic that locationCount (and thus the output of size() )is going to head without limit for Integer.MAX_VALUE and never gets decremented to reflect the current actual size of the tree? The former isn't much of a concern, RC matches are never that long.. but the latter?
I found the error. In method RemoveOld, the loop condition of the decrementing loop should be
cursor != null // OK
instead of
cursor.parent != null // WRONG
This prevented the count in the root node of being decremented.
Whole loop (corrected):
do { cursor.locationCount--; cursor = cursor.parent; } while (cursor != null);
Also RemoveOld does not adapt minLimit and maxLimit. I don't know what the impact of this would be in practice.
Hey, thanks for finding that bug.
By the way, just so you know, the version with the code on the page is not the version I actively use anymore. The "up to date" version is here under the java package "ags.utils.dataStructures.trees.thirdGenKD". This version is a re-write what is cleaner (IMO anyway), slightly faster, though with a slightly different feature set. It's missing that "remove old elements beyond a size limit" feature because nobody I knew of was using it at the time, but it has some additional features like an iterator to allow one to iterate over the nearest points in sorted order, and if you stop early it saves a notable amount of cpu. The advantage of this iterator is that it can be significantly faster in situations where the number of points you need is uncertain until looking at the closest few points (i.e. if you're filtering out some of the points after the k-nn search based on some other criteria).
Hey Rednaxela,
I'm switching over to your 3rd gen tree, and had a question re: your statement here:
"... it has some additional features like an iterator to allow one to iterate over the nearest points in sorted order, and if you stop early it saves a notable amount of cpu."
How early are we talking about? For instance, if I grab an iterator for 200 maxPointsReturned, iterate over the first 150 of them and decide I'm done, is that still in the "saves a notable amount of cpu" territory?
I'd have to benchmark it to be sure, and it depends on the distribution of data of course, but I'd estimate stopping at 150 out of 200 points would shave off something like 10-15% of the search time compared to just getting 200 points.
IIRC it depends on the structure of your data. From what I understand, it slowly expands the hypersphere of contained points in bursts (grabbing further and further tree branches), sorting them in order of relevancy as it goes. It depends on whether you don't have to do an extra expansion to get the extra points. An iterator for 200, then getting 150, will be slower than getting an iterator for 150, but chances are it will be faster than getting an iterator for 200 and using all 200.
Pretty much yeah, though it does avoid the full effort of sorting them by using a min-max heap that tosses the most distant points off and keeps the closest point accessible in constant time. The search algorithm is exactly the same as searching for all 200 (it needs to remember the 200 closest points it's found so far, and know what the furthest and closest ones of that set are), except that it pauses the search when when it is able to determine that no unchecked branch could have anything closer than the closest point not yet returned by the iterator.
Excellent info from you both, thanks!
One more question: the DistanceFunction file defines an Interface, but features no comments to describe what the two member methods are supposed to do.
distance() is utterly obvious... especially when reading your EuclideanDistanceFunction implementation. But distanceToRect()... I think I know what it requires, but I don't want to screw it up when I write my own DistanceFunction.
What precisely is distanceToRect defined as?
It's the minimum distance from that point you are testing to the hyper-rectangle defined by the min and max co-ordinates on each dimension.
So considering dimension x: if the point val is less than the min, the distance is (min - val), if it is between min and max the value is 0 (it is inside the rectangle), if it is more than max the distance is (val - max).
If you are doing Euclidean, square each distance then add them together. I do Manhattan, so just use the absolute value.
Yep, what Skilgannon said. Sorry I forgot to put comments in that interface.
If your'e wondering, this is used to compare the search point to the bounding box associated with each branch of the tree, and allows efficient skipping of irrelevant branches.
Excellent. I was guessing it was that or related to that from reading your euclidian distance implementation.
When I write a WeightedSquareEuclidieanDistanceFunction and/or WeightedManhattanDistanceFunction, would you like me to commit them to your bitbucket hg repo? I'm happy to help contribute! :)
Sure, or if posted on the wiki or in a bot I can upload it to that some time. Thanks :)
DeBroglie rev0026 is up. Only change from rev0025 is your 3rd gen tree. Should perform pretty close to what it was doing before, though some differences are to be expected since your 3rd gen tree doesn't drop points. We'll see, though I have to get going to the doctor at the moment.
I tested with both the SqrEuclidean and Manhattan versions of my Weighted trees. Both seemed to work fine in several test battles with some bots I had sitting around. I ended up making a WeightedDistanceFunction class to be a superclass of both the WeightedManhattanDistanceFunction and WeightedSquareEuclideanDistanceFunction.. to duplicate less of the code involved in weighting.
The weighted DF should failover gracefully if given weights that mismatch the number of tree dimensions. Only thing I didn't implement was doing a Math.abs() on weights, since someone out there might invent a DistanceFunction that utilizes negative weights.
If the code on my bitbucket fork meets your approval, I can toss you a pull request. :)
EDIT: Made a new bitbucket with all the work in a single commit, and decided to make WeightedDistanceFunction abstract.
Alright. I've got them written on a fork of your bitbucket repo. Once I test them in DeBroglie to be sure they work, I'll drop a pull request to you. :)
!!!! That's a feature that I've really been wanting for an idea I had for DeBroglie's gun! Excellent! You ought to edit the main page to mention that the most up-to-date version of the code is available off-wiki. Thanks for the heads-up Rednaxela!
Seems pretty great - is the 3rd gen tree threadsafe? The earlier iterations do not appear so.
Hiya,
Getting an occasional null pointer exception. I've not worked out why its happening, and looking at the source I can't quite see how it could happen. Call stack looks like
at KDTree.KdNode.addLeafPoint(KdNode.java:70) at KDTree.KdNode.splitLeafNode(KdNode.java:173) at KDTree.KdNode.addLeafPoint(KdNode.java:79) at KDTree.KdNode.addPoint(KdNode.java:63) at KDTree.KdTree.addPoint(KdTree.java:1)
Any ideas?
Seems that the "points" array is getting set to 'null' in a newly created child node. Assuming you haven't modified the code of the tree, and you're not doing any multithreading things, I can only see one possible cause: *All* of the data points are being put in the same child node, causing the newly created child node to itself be split while the upper level's splitLeafNode is still running. This implies that calculateSplit() is returning true when it shouldn't.
Could you add code to line 169 to print the list of coordinates stored in the node being split, or use a debugger to obtain similar information? My suspicion is that the coordinates you're putting into the tree are very unusual in some manner, possibly involving NaN values.
Sorry for the wall of numbers. Here are the values inside the splitLeafNode function:
Points array: [ [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.27476300750221805, 0.0, 0.5, 0.5, 0.05944248946787964, 0.05944248946787964], [0.25781116652220715, 0.08692892561404503, 0.49442777954156447, 0.525842295924762, 0.06786100210077804, 0.14692687699385826], [0.2578177856998261, 0.08372619661301722, 0.49463837126786636, 0.5064605739811905, 0.06785023783184158, 0.1469334424631162], [0.2578214298492213, 0.08292549845220197, 0.4946909579122118, 0.5016151434952976, 0.06784441533138576, 0.14693385451422164], [0.25782433041558794, 0.08272532316252756, 0.49470410169693013, 0.5004037858738244, 0.06783982767738157, 0.14693272660841677], [0.25782704507786713, 0.0826752795385066, 0.4947073884210326, 0.500100946468456, 0.06783554880964597, 0.14693121360417843], [0.25782971324440696, 0.0826627688903635, 0.4947082111092713, 0.5000252366171141, 0.0678313472047577, 0.14692960432136848], [0.2578323697665857, 0.08265964148995872, 0.4947084178030773, 0.5000063091542785, 0.06782716496882069, 0.14692797097446295], [0.25783502335720493, 0.08265885990173155, 0.49470847049921196, 0.5000015772885695, 0.06782298762433847, 0.14692633161848062], [0.2578376761944629, 0.0826586647665601, 0.49470848469597223, 0.5000003943221424, 0.06781881155096657, 0.14692469076738635], [0.25784032882290964, 0.08265861624464657, 0.49470848926786565, 0.5000000985805355, 0.06781463584332002, 0.14692304954976892], [0.2578429813786835, 0.0826586043760398, 0.49470849143351225, 0.5000000246451339, 0.06781046027498529, 0.14692140824779062], [0.2578456338958197, 0.08265860167075158, 0.4947084929975659, 0.5000000061612835, 0.06780628478936644, 0.1469197669319714], [0.2578482863828274, 0.08265860125628488, 0.49470849441118986, 0.5000000015403209, 0.06780210937228809, 0.1469181256199622], [0.2578509388418348, 0.08265860141451545, 0.49470849578717485, 0.5000000003850803, 0.06779793402021453, 0.14691648431616464], [0.2578535912733745, 0.08265860171591212, 0.49470849715371845, 0.5000000000962701, 0.06779375873223772, 0.14691484302170008], [0.2578562436775801, 0.08265860205309222, 0.4947084985178703, 0.5000000000240675, 0.0677895835081634, 0.146913201736816], [0.257858896054486, 0.08265860239921004, 0.49470849988139254, 0.5000000000060169, 0.06778540834789157, 0.14691156046162138], [0.2578615484041011, 0.0826586027475541, 0.49470850124472593, 0.5000000000015042, 0.06778123325144109, 0.1469099191960998], [0.25786420072642846, 0.08265860309644661, 0.4947085026079805, 0.500000000000376, 0.06777705821881806, 0.1469082779402436], [0.2578668530214694, 0.08265860344546808, 0.4947085039711838, 0.500000000000094, 0.06777288324998178, 0.14690663669408965], [0.25786950528922475, 0.08265860379451359, 0.4947085053343428, 0.5000000000000234, 0.06776870834495623, 0.1469049954576125], [0.25787215752969567, 0.08265860414355704, 0.49470850669745914, 0.5000000000000058, 0.06776453350371553, 0.14690335423083412], [0.2578748097428828, 0.08265860449259181, 0.4947085080605333, 0.5000000000000014, 0.06776035872627743, 0.1469017130137357], [0.25787746192878697, 0.08265860484161626, 0.4947085094235653, 0.5000000000000003, 0.0677561840126425, 0.1469000718063138], [0.2578801140874092, 0.08265860519062998, 0.4947085107865553, 0.5000000000000001, 0.06775200936277642, 0.14689843060859753], null ] bucket capacity is 50 Size is currently 48 Split dimension is 5 Split value is 0.10318817199105064
Yeah, that null there is not a problem, though it does look like I accidentally made the nodes split one entry too early. That wouldn't cause any issue besides a tiny amount of wasted memory though.
Later today I'll see if I can replicate the issue with that list of points.
Hello Rednaxela,
I just converted your KD-tree Java code to C#. Do you have unit tests for KD-Tree? They would allow me to test if the conversion succeded.
I don't know if you could modified it to run the C# implementation, but K-NN algorithm benchmark has an accuracy checker with linear search, as with performance benchmark. You can give it a try.
Yeah, I didn't use any unit tests when writing the KD-Tree, just some functional tests. The most thorough of which is from the K-NN algorithm benchmark that Nat and I wrote. It repeatedly tests the KD-Trees with a very large set of random set of points against known-good results from a very very simple linear-time k-nn search.
I just published my C# implementation of a KD-Tree based on Rednaxela's java implementation here on robowiki.
Hi Rednaxela & Ojd,
I'd like to have remove functionality and added the method below to KdNode. It is based on Ojd's C# variant and adjusts. KdNode needs and extra field: parent. It seems to do what I want, but feel not confident about some points:
- do the bounds of parent nodes to be adjusted as well?
- how to add it nicely to the NearestNeighborIterator?
- what if you want to re-weight after n removals? (remove remainder and add again?)
Thanks, Arie
public boolean remove(double[] point, T value) { KdNode<T> cursor = this; while (!cursor.isLeaf()) { if (point[cursor.splitDimension] > cursor.splitValue) { cursor = cursor.left; } else { cursor = cursor.right; } } for (int i = 0; i < cursor.size; i++) { if (cursor.data[i].equals(value)) { double[] cursorPoint = cursor.points[i]; // remove this points cursor.data = ArrayUtil.remove(cursor.data, i); cursor.points = ArrayUtil.remove(cursor.points, i); // fix tree size all the way up to the root KdNode<T> parent = cursor.parent; do { parent.size--; parent = cursor.parent; } while (parent != null); // adjust _minBound and _maxBound after removing an item. if(cursor.points.length > 0){ for (int j = 0; j < dimensions; j++) { if (Double.isNaN(cursorPoint[j])) { if (!Double.isNaN(minBound[j]) || !Double.isNaN(maxBound[j])) { singlePoint = false; } minBound[j] = Double.NaN; maxBound[j] = Double.NaN; } else if (minBound[j] > cursorPoint[j]) { minBound[j] = cursorPoint[i]; singlePoint = false; } else if (maxBound[j] < cursorPoint[i]) { maxBound[j] = cursorPoint[i]; singlePoint = false; } } } else{ // last point at this cursor was removed. minBound = null; maxBound = null; } return true; } } return false; }
To answer your questions:
- Actually, technically you don't need to adjust ANY bounds at all. It won't affect the correctness of the result. The only consequence is that the searches may be slightly slower than optimal. If you are adjusting the bounds though, you might as well adjust the parent ones as well, because that's where a large chunk of the (smallish) performance impact would be.
- On one hand adjusting the bounds makes the removal slower, but on the other hand it makes the search a little faster. To really know if it's worth adjusting the bounds, would probably require benchmarking with some code that's making realistic use of the removal.
- To add it nicely to the iterator, I think you'd need to have the iterator keep track of the last point it returned, so it can call a remove on that.
- If I understand what you mean... You should note that the structure of the tree is such that you can't just remove a leaf node, because every node must either be a leaf node or have two child nodes. To prune back the tree after removing items, you'd need to implement a "merge siblings" algorithm. This algorithm would need to first check that the combined size of both would fit comfortably in one node, and if that's the case basically perform the reverse of what "splitLeafNode" does. It'll also need to take into account the possibility that one sibling may not be a leaf node while the other is.
Some other notes:
- You don't appear to be changing the "size" of the node itself, just the parents, which would be a problem. I also don't know that "ArrayUtil.remove" is what you want there.
- Note that the "data" and "points" arrays have extra capacity beyond what's used and that "size" (or "_count" in the case of Ojd's C# variant) tracks the number of elements actually used.
- (I also can't seem to find any documentation of a "ArrayUtil.remove" in either Java or C# so I'm a bit confused)
- The code to adjust the bounds there appears to be incorrect. It looks like what's happening in that code is extending the bounds to contain the removed point... and the bounds already contain that anyway.
Thanks for your quick reply!
Good to know that the bounds don't affect the correctness. I agree, then it is a performance trade-off to adjust them or not.
You only really need to perform "merge siblings" when ALL points from the node are removed, isn't it?
Indeed, the size of the node itself must be adjusted as well:
KdNode<T> parent = cursor.parent; do { parent.size--; parent = cursor.parent; } while (parent != null);
Also ArrayUtil.remove is my own method:
/** * Remove the element from the array at location i. * @param as original array. * @param i index indicating the element to be removed. * @return new array with length (as.length-1) if a is contained in as, or else a copy of the original array. */ public static int[] remove(int[] as, int i){ int n = as.length; if(i < 0 || i >= n) { return Arrays.copyOf(as, n); } int[] copy = new int[n-1]; System.arraycopy(as,0,copy,0,i); System.arraycopy(as,i+1,copy,i,n-1-i); return copy; } public static int[] remove(int[] as, int i){ int n = as.length; if(i < 0 || i >= n) { return copyOf(as, n); } int[] copy = new int[n-1]; System.arraycopy(as,0,copy,0,i); System.arraycopy(as,i+1,copy,i,n-1-i); return copy; }
Removal is O(n), but you cant do much better. There is also a variant for double[] and double[][].
The bounds indeed really need to determine the min and max by iterating over all points.
Back to the keyboard...
Well, to be clear the bounds affect the correctness if they are *too small*, but the bounds being too big (as is the case after removing) has no impact.
Really, you don't have to "merge siblings" when all points are removed from a node even, and it could (for sake of tree balance) be worth merging sibling nodes before they're completely empty (i.e. if both nodes had half their points removed)
With regards to the removing from the array, actually you can do it in O(1) time due to the fact that 1) unused array elements are expected, and 2) the array of points in a leaf node are unordered. You just need to overwrite the removed element with the last element, and then remove the last element:
points[i] = points[size-1];
data[i] = data[size-1];
points[size-1] = null;
data[size-1] = null;
size--;
(Oh, and when removing elements, be careful with how you're looping over them at the same time. You need to decrement "i" when the remove occurs, because what points[i] refers to changes when you remove the element)
If you wanted to take a look at Gilgalad, I think I have this working correctly in my tree. However, the feature is never used since it is faster to just have one large bucket unless the size limit is greater than about 10000 or so, and if I used that many points I would probably just use them all.