The hitScans list and distance/danger weighting accumulation in getDanger() is very interesting. It looks like this is to avoid the sparse data problem caused by heavy segmentation which is exacerbated in the case of surfing by very few data points.
And I right in assuming this implementation is to achieve a similar effect to the kd-tree you would use in a mega bot?
In any case this gives me an idea, which like most of my ideas will probably come to nothing, but you can now feel guilty for making me waste another couple of weeks on robocode ;-)
I can't think of an easy/small way of doing KNN with this style of implementation, so I guess my distance weighting algorithm had better be a good one.
Yes, in fact I would say that a KD-Tree is an implementation/optimisation of this, and not the other way around.
Unfortunately, this doesn't work in a gun because a gun has an order of magnitude more data and it ends up being too slow. Hence me only using it in movement, and the gun being PM. But go for it. I wanted to try this in Micro as a gun, but I could never get it to fit.
Yes, that was my thought. A lot of minibots do GF/wave for gun and surfing, usually sharing a lot of the logic.
The obvious issue, as you point out, is that the gun has heaps more data than the surfing (probably 2 orders of magnitude), usually a wave every tick, so without the fast searching allowed by something like a kd-tree you end up with a slow full iteration. Additionally the fact that your data is far less sparse means you don't get empty bins with high segmentation, just not necessarily the best/closest match.
I have a silly idea that may possibly speed up the search, so I am going to do a quick and dirty test by plugging the mechanism into Cunobelin to see if I can get it to work at all. Then if that succeeds I'll try to build it into a gun, but I won't know for sure until I try the full implementation and see if it is fast enough to avoid lots of skipped turns.
Before I spend too much time on it, perhaps you could give me some idea of whether it is worth the effort. I.e. when you shifted your druss gun from traditionally segmented GF to using a kd-tree, was there a noticeable improvement. I am assuming even in guns it must help as: (a) several bots are now using kd in guns and (b) some other bots like Pugilist achieve a similar effect by having separate "fast" and "slow" segmentations.
Yes it should help in guns, but I think it would increase the hit rate by at most 1 percent vs a good VCS gun (Targeting_Challenge_RM/Results), which should translate to less than 1 APS if your movement is good (I think, not sure on that, but it's my general impression). Also I think I can prove (ie. I didn't bother to write it down and double check) that the benefit is incidental, and if you perfectly configured a VCS GF gun, it would be better. The catch is that setting up a KNN gun is a lot easier than setting up a VCS gun perfectly.
As AW says, movement is always more important than guns. If you can dodge every enemy bullet it will get you a higher score than if you can hit them 100%. What strikes me as strange is that there are only 5 bots above Cotillion in the minirumble, despite having double the available codesize.
That's an interesting comment - I actually think VCS guns are much easier to configure, which is why it took a long time (years?) for top KNN guns to catch up to top VCS guns even after a lot of us were playing with them. I agree that the top guns of each style are not too different at this point, though I think DrussGT's and Diamond's guns are pretty far ahead of the rest and are both KNN.
Maybe I just had a better idea of what I was doing when I made my KNN gun, but I really think there's more to it than that. In KNN, you just decide what attributes are important, and then, if you want to, try to guess the importance. In VCS you decide which attributes to use in each set of segmentation, then how fine that segmentation should be, and the conditions for using that segmentation.
A quick outline of my proof that a perfect VCS dominates (or should dominate, in that it has identical or better data) KNN. Assuming infinite memory and CPU resources, you create every possible set of segmentations for VCS given certain attributes. When aiming, select the histogram that is centered on the current data point with the size of each dimension's bin so that it includes all points that would be included in the KNN search. It contains at least all points from the KNN search (it may contain more if there are multiple points exactly the same distance from the query point) Obviously, this assumes a range search with a hyper rectangle, but a more complicated algorithm could do the same thing for a hypersphere.
In my opinion KNN is actually easier to get good results with but much more difficult to get spectacular results with. My current KNN gun in Nene is pretty much bare bones (being from Mint). But I had started to hit a tweaking wall with it despite only having 'good' performance.
I strongly doubt that you hit a tweaking wall with your attributes. Try:
making sure all of them are calculated correctly
working on your segmentation for wall proximity
perhaps tweaking how your bandwidth and size of k
The difference between VCS and KNN is huge, if you have the time and space to tune it. Seriously, even a double-buffered VCS is nothing compared to KNN. It takes many buffers of varying depth, granularity etc to get close to what KNN can achieve, and even then it isn't as good. See targeting challenges, I believe the highest scoring VCS is Dookious.
I know Hydra and Waveserpent use a combination VCS/KNN gun, which segments on velocity and acceleration but does KNN on everything else. That way it doesn't need a tree, but still runs at reasonable speed because of the reduced data size.
Though I would note that the difference between VCS and KNN is significantly reduced when you use interpolated VCS like I did in SaphireEdge/Midboss, later adopted by WaveSerpent starting in version 2.0
True. But speed-wise, that is about as slow as KNN, because you have to smooth across all of the dimensions. Of course, the slow part is the adding data, not so much retrieval, unlike KNN.
Actually, I'd say it's somewhat slow on both adding data and retrieval. It's checking adjacent bins in all dimensions for adding and retrieval, leading to <math>2^n</math> entries checked for each operation, where n is the number of dimensions. If anything, retrieval is a bit slower, because for each entry you need to process all guessfactor bins, whereas for adding data you only need to process guessfactor bins near the hit.
The intention was to save codesize by being able to use the same wave and stats functions for both gun and surfing. I have tried my algorithm in Cunobelin as proof of concept and remarkably (after fixing one div by zero bug I created) it worked and got 50% scores against the original.
However it neither saved on codesize nor execution speed, so the idea is probably a no starter as without that change I would have too direct a copy of cunobelin's movement and skipped turns due to data volumes in the gun. Turns out modern processors do very fast FP and the implementation of Integer.bitCount() has too many steps.
I have another idea about weighting which might work, but the performance problem will remain. I will eventually try it, probably along with culling old data to mitigate volume, but it is a lower priority now that my "good" idea turned out not to help.
For your amusement here is the idea:
Replace the EnemyWaveDC.scan array of 5 (actually 4, the other is a special case) doubles with a single integer where that segmentation data is encoded using individual bits as tally marks (i.e. 0b1 = 1, 0b11 = 2, 0b111 = 3 etc using int encoded = (1 << value) - 1;). The values are then assembled with | and <<. Note that this is not the same as simple adding as the positional information (i.e. which segment the data belongs to) is retained, as it must be for the XOR below to work.
The point of this is then the loop over those 4 doubles in the distance calculation in checkDanger is no longer required and the manhattan distance can be calculated with Integer.bitCount(surfWave.scan ^ scan);
I had hoped that using integers rather than doubles and removing the loop would provide a speed increase, but unfortunately FP is very fast and bitCount does a lot (20?) operations, so no gain :(
That's pretty cool, reminds me of what I am doing in Cotillion's gun. Although, I think there would be problems with some bits being worth more than others unless you used some variation on Gray codes in your XOR. Perhaps doing the XOR, then disassembling the result and adding them back together as a single integer representing distance would be faster than simply using bitCount? Although, even then I think there might be issues with the distance metric not being correct.
One thing to be careful of, don't test a gun against surfers if you want to know how good it is. In fact, if you gun is good at hitting surfers, chances are it will be pretty miserable at hitting regular random movement. Rather test against RaikoMicro, and check what CunobelinDC got in the rumble as a comparison.
Edit: I misread that, it shouldn't have problems with different bits being worth different amounts, although you're limited to 8 values per attribute, if you want to stick 4 in an
int. Gray codes should still work, and give you much finer granularity.
Yes, it does work just fine, and you weight the importance of the dimensions just as you would with doubles, just having some slight issue with granularity when only a small number is available. I had 11 for latv, which is fine, but for the weighting I wanted only needed 2 or 3 for distance, but that would have been a granularity of 500 or 300 which is a little too coarse so I compromised on 4 bits.
The gray codes do not work for the simplistic XOR/bitcount mechanic and computing gray->binary looks expensive, so even if I could solve the math it probably wouldn't help.
If it's not too much codesize, maybe this would be faster?
Not sure if it would be faster, but in any case codesize will be an issue, so adding any but the most trivial function is probably not feasible.
The only other idea which would be much faster is a lookup table, but that would be 4GB, and if you partitioned to byte or short to get the size sensible you get more codesize and slower performance.
As others have said, VCS and KNN have similar performance in their most advanced forms, but it's tough to compare what a low code size version of each would look like. But it is worth considering the performance of Komarious vs CunobelinDC (aka Toorkild) guns in the TCRM. Komari's gun is way stronger, but we can see how much that matters in the rumble. :-)