Great and very detailed article!

Return to Thread:Talk:Binary Search PIF/Great and very detailed article!.

Algorithmic optimisation will pretty much always outperform low-level optimisations such as arranging memory access to avoid cache misses.

You really shouldn't need to worry about cache misses in Java!

Optimising from O(n) to O(log n) will give a big performance benefit!

Wolfman (talk) 20:01, 3 November 2017

Yes, I agree with that. It just happens that in a scenario where you get on average one scan every 4-5 ticks, and the average BFT is 50 or less, even the theoretical improvement becomes negligible. But I'm a guy who likes to have the worst case situations nicely covered :P
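As a rough, back-of-the-envelope illustration of that scenario (the figures are the ones quoted above; counting one step per scan for the linear pass is an assumption, not a measurement):

<math>K \approx \frac{50}{4.5} \approx 11, \qquad \lceil \log_2 K \rceil = 4</math>

So the binary search saves on the order of 7 steps per simulation, which is why the gain is negligible at this scale.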

The interesting question for me is: "does this make my bot run faster?"

I don't have that answer. I only know that this helps me avoid skipping turns in odd worst-case situations.

Rsalesc (talk) 21:14, 3 November 2017

Well, I think the worst case is not about BFT but about the entire round time. BFT is too small to make you skip a turn, but a bug most bot authors make can stretch the worst case to a whole round.

The catch is: how do you handle data from different rounds?

Xor (talk) 00:46, 4 November 2017

Yeah, it is. My gun was pretty slow in melee. I don't know if I got what you mean, though. Can you clarify?

Rsalesc (talk) 01:10, 4 November 2017

E.g. you store the scans of the next round right after those of the first round, and when the scans of the first round aren't enough to get a hit, you continue into the scans of the next round, going from time = 0 up to time = movie start time + BFT.

If you store time as global time, this only gives an inaccurate result, which KDE may filter out. But if you store round time, it causes the data of the entire round to be iterated.

Xor (talk) 04:09, 4 November 2017

When the data isn't enough I just stop. If I'm binary searching, I guarantee that its domain is entirely inside a single round. I don't even consider the scans of the next round; I discard that situation. Then I keep picking matches from the tree with an iterator.
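The round-boundary guard discussed here could look roughly like the sketch below. It is only an illustration under assumptions: the parallel arrays `round` and `time` and both method names are hypothetical, and scans are assumed to be stored in recording order with round-local time. The unguarded variant shows how the replay wanders into the next round once round time resets to 0.

```java
/** Hypothetical sketch: replaying scans without crossing a round boundary. */
final class RoundBoundedReplay {
    /**
     * Counts the scans replayed from index start while time stays within
     * bft ticks of the start scan. The round is never checked (the bug).
     */
    static int unguardedLength(long[] time, int start, long bft) {
        long endTime = time[start] + bft;
        int i = start;
        // Round time resets to 0 in the next round, so this keeps going.
        while (i + 1 < time.length && time[i + 1] <= endTime) {
            i++;
        }
        return i - start + 1;
    }

    /** Same replay, but it stops at the last scan of the starting round. */
    static int guardedLength(int[] round, long[] time, int start, long bft) {
        long endTime = time[start] + bft;
        int i = start;
        while (i + 1 < time.length
                && round[i + 1] == round[start]
                && time[i + 1] <= endTime) {
            i++;
        }
        return i - start + 1;
    }
}
```

With rounds `{0, 0, 0, 1, 1, 1, 1}` and times `{10, 11, 12, 0, 1, 2, 3}`, starting at index 1 with a BFT of 5, the guarded replay stops after 2 scans while the unguarded one runs through all 6 remaining scans.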

Rsalesc (talk) 04:45, 4 November 2017

Of course it is not only about the BFT; we still have a k-d tree and the other components. But when we are talking about milliseconds, it helps a lot.

Rsalesc (talk) 01:19, 4 November 2017

The k-d tree is very fast compared to the cost of simulation, imo.

Xor (talk) 04:10, 4 November 2017

Not as fast as the presented algo for sure.

Rsalesc (talk) 04:34, 4 November 2017

Well, that’s only true for large enough n... And for small n, as in our case, the constant factor dominates.

Btw, memory access is WAAAY more expensive than basic calculations, so for small n, optimized memory access often beats textbook algorithms that don’t walk contiguous memory in order.

Xor (talk) 00:43, 4 November 2017

1. I agree with that. A benchmark is needed here, and I can even provide more than one implementation of this algorithm. I'll try to do this when I'm home. Any idea how I should benchmark this? Real-world melee data or randomly generated?

2. The theoretical speed-up grows as the size of our movie increases. So yes, in 1v1 I'm almost sure it is faster in practice, but I can't say the same about melee. Notice, though, that the number of iterations is on the order of <math>\log K</math>, where K is the number of inconsistent scans between 0 and BFT. It just happens that the worst case is when you have BFT scans. You do not need to store the interpolated scans in this array. You can just do a single interpolation after the algo is done; if you are interpolating linearly, finding the impact point is very simple. So in terms of iterations, the difference is still good. But yeah, the difference becomes less and less noticeable as our movie gets sparser, and even less so if we are not careful about cache misses.

3. The data is stored in objects, and I do that in my code as well, but I would say it is OK to store the needed information in contiguous arrays too. I just find it ugly.

Feel free to provide any other insight about this, and even to post your implementations if you find them useful! :)
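Points 2 and 3 can be sketched together: a sparse movie kept in parallel primitive arrays, a lookup over the K stored scans by time, and one linear interpolation at the end. This is a minimal sketch under assumptions, not the article's impact-point search: the class `ScanLog`, its fields, and `positionAt` are hypothetical names, and scan times are assumed to be strictly increasing within one round.

```java
import java.util.Arrays;

/** Hypothetical sketch: a sparse movie kept in parallel primitive arrays. */
final class ScanLog {
    private long[] time = new long[16];
    private double[] x = new double[16];
    private double[] y = new double[16];
    private int size;

    /** Appends a scan; times are assumed to be strictly increasing. */
    void add(long t, double px, double py) {
        if (size > 0 && t <= time[size - 1]) {
            throw new IllegalArgumentException("times must be strictly increasing");
        }
        if (size == time.length) {
            int capacity = size * 2;
            time = Arrays.copyOf(time, capacity);
            x = Arrays.copyOf(x, capacity);
            y = Arrays.copyOf(y, capacity);
        }
        time[size] = t;
        x[size] = px;
        y[size] = py;
        size++;
    }

    int size() {
        return size;
    }

    /** Position at time t, linearly interpolated between the bracketing scans. */
    double[] positionAt(long t) {
        if (size == 0 || t < time[0] || t > time[size - 1]) {
            throw new IllegalArgumentException("t is outside the recorded range");
        }
        // O(log K) lookup over the K stored scans.
        int idx = Arrays.binarySearch(time, 0, size, t);
        if (idx >= 0) {
            return new double[] {x[idx], y[idx]};
        }
        int hi = -idx - 1; // first scan later than t
        int lo = hi - 1;   // last scan earlier than t
        double f = (double) (t - time[lo]) / (time[hi] - time[lo]);
        return new double[] {x[lo] + f * (x[hi] - x[lo]), y[lo] + f * (y[hi] - y[lo])};
    }
}
```

Only real scans are stored; the in-between ticks are never materialised, and the interpolation happens once per query. Whether the array layout actually reduces cache misses on a given JVM is something to measure rather than assume.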

Rsalesc (talk) 21:04, 3 November 2017

Well, benchmarking with real battles often gives you a pretty high margin of error if not done properly... Anyway, running 100 seasons against RaikoMicro and looking at the total time should say something about the overall performance. Using percentage run time, e.g. PIF time / total run time of your bot, may be even better.

Btw, 1000+ highly optimized iterations in the worst case shouldn’t cause skipped turns, but if you don’t use contiguous memory and access it in order, several thousand cache misses in one turn are enough to kill you, imo.
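The percentage idea can be sketched with a small accumulator. This is a hypothetical helper, not part of Robocode: the class `TimeShare` and its methods are made-up names, and the caller is assumed to measure both durations itself, for example with `System.nanoTime()` around the PIF code and around the whole turn.

```java
/** Hypothetical sketch: share of run time spent in one section of a bot. */
final class TimeShare {
    private long sectionNanos;
    private long totalNanos;

    /** Adds one turn's measurements, both in nanoseconds. */
    void record(long sectionElapsed, long turnElapsed) {
        sectionNanos += sectionElapsed;
        totalNanos += turnElapsed;
    }

    /** Fraction of the accumulated time spent in the section, 0 if nothing was recorded. */
    double share() {
        return totalNanos == 0 ? 0.0 : (double) sectionNanos / totalNanos;
    }
}
```

Accumulating over many rounds before reading `share()` should smooth out some of the JIT and garbage-collection noise that single-turn timings pick up.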

Xor (talk) 01:22, 4 November 2017