?????
It's a bit off topic, but I'm curious, in what way is the normalized euclidean distance search incompatible? I have a "DistanceFunction" interface that allows other distance metrics to be implemented. It should even be just fine with the scaling of different dimensions changing over time (it just makes the tree splits slightly less optimal, but not a big deal generally).
Is it fine to call setWeights() before getting a nearest neighbors list, then using setWeights() again to return to the original weighting?
I was hoping to implement an anti-surfer gun with custom dimension weights.. but just to use those weights at getKNN time.
Yep, that should work just fine.
(Note, the newer version of the tree doesn't have the "setWeights()" thing but it would be trivial to include inside a "DistanceFunction" implementation.)
Thanks for the info! Replacing the wiki version of your tree for the bitbucket one is one of the very next things I'm doing. I need that iterator now, because I'm occasionally culling too many neighbors via displacement vector, leaving me with too-sparse data. The ability to fetch more will be invaluable!
A faster way of doing this would probably be to have two different trees, one with weights of one type and another with weights of another type. This will ensure that your tree splits are more optimal, and you can use completely different dimensions as well without a slowdown.
Interesting idea! If I can spare the cycles, that would be interesting to do. Are there segmentations that are useful against adaptive movements but utterly superfluous against non-adaptive movements? (other than data age?) Vice versa?
I'm unsure if that would be faster really. Since these trees are generated on-the-fly without rebalancing, it's not like they're hugely optimal in the first place. I doubt the overhead of changing the weighting would be much compared to how non-optimal it is in the first place. Additionally two trees could be worse from the perspective of on-CPU caches. IMO it's worth trying both ways and benchmarking.
If balancing the tree means better performance, would the end of a round (after death or victory) be a good time to balance the tree, or would it even make a difference in terms of performance during the match? I suppose it all depends on how hard one's bot is pushing up against the edge of the time slice it's allowed.
The end of the round probably would be a good time yes, though I've never really tried it. The re-balancing would take longer each round though, so it may make sense to only do that for the first few rounds of battle. As the number of rounds increases there would be diminishing returns to constantly re-balancing anyway.
The first few rounds makes sense.. Relatively quickly your dimension balancing should conform to the movement pattern of the enemy, and later rounds probably wouldn't shift the balance dramatically.
The distance dimension of a bot that heavily prefers a certain orbit distances will get properly split after just a few re-balancing operations, making further balancing on that dimension a comparative waste of time I would imagine.
I´m using normalized Euclidean distance (weight = 1/stddev) with fast sampled deviation, which do change over time (k-NN for lazy people who don´t fine tune their bots).
I had the impression the changing weights were not being accounted for. My bot´s scoring dramatically decreased with the kd-tree. I suspect it is the cached coordinate limits inside each node. Or maybe I screwed up something while trying to use the library.
Changing weights is certainly intended to be accounted for, though I hadn't tested it as much as other features. I know it couldn't be the coordinate limits inside each node, because they are copying the min/max values from the points which don't change with weighting. Do you remember which version of the tree were you trying this with?