Great and very detailed article!

Thanks for your work! This may be one of the most concise write-ups about PIF, and it could be a step toward popularizing melee battles, like the well-known GF tutorial and WS tutorial.

Anyway, two questions:

1. Is this algorithm faster in the real world than simply looping through all the scans up to BFT? Both looping through all of them and branching toward the closer one perform well only when all the scans within BFT are already in cache. When those scans do not fit in cache, you get several cache misses that slow the whole process down, unless memory is accessed in a fairly predictable manner. This is only a guess, so a benchmark is still needed.

2. If I understood this algorithm correctly, it requires the scans at times s + 0, s + 1, ..., s + bft to all be stored contiguously in an array. But in melee, if we store only actual scans, looping through all the required scans already has a very small constant factor. If we instead store every scan plus the interpolated scans, the data may be harder for the CPU to access, e.g. more cache misses because of the increased size. So again, compared to the version that stores only the sparse, actual scans, is this algorithm still faster in the real world?

Btw, in the pseudocode I can see that the data is actually stored in objects that are accessible through an array of pointers, so each step may incur a cache miss, whereas a loop through contiguous memory may incur none, although it performs more reads from CPU cache.

Xor (talk)18:59, 3 November 2017
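
The comparison in question 2 can be sketched as follows. This is a minimal illustration, assuming only real scans are stored (no interpolated entries) in a primitive array sorted by time. `SparseScanLog`, `linearLastAtOrBefore` and `binaryLastAtOrBefore` are hypothetical names, not taken from the article. Both methods return the index of the last scan at or before a target time such as s + bft; which one is faster in practice still has to be measured.

```java
import java.util.Arrays;

/** Hypothetical log of real scans only; times are strictly increasing and non-empty. */
final class SparseScanLog {
    private final long[] times; // contiguous primitive array, no per-scan objects

    SparseScanLog(long[] sortedTimes) {
        this.times = sortedTimes.clone();
    }

    /** Linear walk forward from index `from`: O(k) for k scans inside the window. */
    int linearLastAtOrBefore(int from, long target) {
        int i = from;
        while (i + 1 < times.length && times[i + 1] <= target) {
            i++;
        }
        // -1 signals that even the starting scan is later than the target.
        return times[i] <= target ? i : -1;
    }

    /** Binary search over the whole log: O(log n) comparisons. */
    int binaryLastAtOrBefore(long target) {
        int pos = Arrays.binarySearch(times, target);
        // A negative result encodes the insertion point as -(insertionPoint) - 1,
        // so the last element at or before the target sits at insertionPoint - 1.
        return pos >= 0 ? pos : -pos - 2;
    }
}
```

With only a handful of scans inside one BFT window, the linear walk touches adjacent memory and may well win; the binary search only pays off as the window grows.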

Algorithmic optimisation will pretty much always outperform low-level optimisations such as tuning memory access to avoid cache misses.

You really shouldn't need to worry about cache misses in Java!

Optimising from O(n) to O(log n) will give a big performance benefit!

Wolfman (talk)20:01, 3 November 2017

Yes, I agree with that. It just happens that in a scenario where you get on average one scan every 4-5 ticks, and the average BFT is 50 or less, even the theoretical improvement becomes negligible. But I'm a guy who likes to have the worst-case situations nicely covered :P

The interesting question for me is: "does this make my bot run faster?"

I do not have the answer; I only know that this helps me avoid skipping turns because of odd worst-case situations.

Rsalesc (talk)21:14, 3 November 2017
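
The figures quoted above can be turned into rough step counts. This is a back-of-the-envelope sketch, not a benchmark; `StepEstimate` and its methods are hypothetical helpers, and the value of 4.5 ticks per scan is simply the midpoint of "every 4-5 ticks".

```java
/** Rough step counts for one BFT window; ignores constant factors and caching. */
final class StepEstimate {
    /** Scans visited by a linear walk over a window of `bft` ticks. */
    static int linearSteps(int bft, double ticksPerScan) {
        return (int) Math.ceil(bft / ticksPerScan);
    }

    /** Worst-case probes of a binary search over n >= 1 candidates: floor(log2 n) + 1. */
    static int binarySteps(int n) {
        return 32 - Integer.numberOfLeadingZeros(n);
    }
}
```

For a BFT of 50 and one scan every 4.5 ticks, `StepEstimate.linearSteps(50, 4.5)` gives 12 and `StepEstimate.binarySteps(50)` gives 6, so the theoretical gap is a factor of about two before any constant factors are considered.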

Well, I think the worst case is not about BFT but about the entire round time. BFT is too small to make you skip a turn, but a bug that most bot authors make can make the worst case round-long.

The catch is: how do you handle data from different rounds?

Xor (talk)00:46, 4 November 2017

Yeah, it is, my gun was pretty slow in melee. Idk if I got what you mean. Can you clarify?

Rsalesc (talk)01:10, 4 November 2017

E.g. you store the data of the next round right after the first round, and when the scans of the first round aren't enough to get a hit, you continue searching through the scans of the next round, from time = 0 up to time = movie start time + bft.

If you store time as global time, this only produces an inaccurate result, which may be filtered out by KDE. But if you store round time, it causes the data of the entire round to be iterated.

Xor (talk)04:09, 4 November 2017

When the data isn't enough I just stop. If I'm binary searching, I guarantee that its domain is entirely inside a single round. I don't even consider the scans of the next round; I discard that situation. Then I keep picking matches from the tree with an iterator.

Rsalesc (talk)04:45, 4 November 2017
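
One way to keep the search domain inside a single round, as described above, is sketched below. It assumes every scan records its round index next to its round-local time; `RoundBoundedLog`, `lastIndexInRound` and `hasEnoughData` are hypothetical names, not code from either bot.

```java
/** Hypothetical scan log that records the round of every scan. */
final class RoundBoundedLog {
    private final int[] round; // round index of each scan, non-decreasing
    private final long[] time; // round-local time of each scan

    RoundBoundedLog(int[] round, long[] time) {
        this.round = round.clone();
        this.time = time.clone();
    }

    /** Index of the last scan that belongs to the same round as `start`. */
    int lastIndexInRound(int start) {
        int lo = start;
        int hi = round.length - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1; // upper middle so the range always shrinks
            if (round[mid] == round[start]) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    /** True if the round has data up to time[start] + bft; otherwise the match is discarded. */
    boolean hasEnoughData(int start, long bft) {
        return time[lastIndexInRound(start)] >= time[start] + bft;
    }
}
```

A caller would check `hasEnoughData` before playing a match forward and skip the match when it returns false, so no search ever crosses into the next round's scans.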

Of course it is not only about the BFT time; we still have a kd-tree and the other components, but when we are talking about milliseconds it helps a lot.

Rsalesc (talk)01:19, 4 November 2017

A kd-tree is very fast compared to the cost of simulation, imo.

Xor (talk)04:10, 4 November 2017

Not as fast as the presented algo for sure.

Rsalesc (talk)04:34, 4 November 2017

Well, that’s only true for large enough n... And for small n, such as in our case, the constant factor is dominant.

Btw, memory access is WAAAY more expensive than basic calculations, so for small n the gain from optimized memory access often outperforms paper algorithms that don’t access contiguous memory in order.

Xor (talk)00:43, 4 November 2017

Well, using real battles to benchmark often gives you a pretty high margin of error if not done properly... Anyway, running 100 seasons against RaikoMicro and looking at the total time should say something about overall performance. Using percentage run time, e.g. PIF time / total run time of your bot, may be even better.

Btw, 1000+ highly optimized iterations in the worst case shouldn’t cause skipped turns, but if you don’t use contiguous memory and access it in order, several thousand cache misses in one turn are enough to kill you, imo.

Xor (talk)01:22, 4 November 2017
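
The "PIF time / total run time" ratio suggested above could be collected with something like the sketch below. `SectionTimer` is a hypothetical helper, not part of Robocode; it assumes the bot measures each whole turn itself and wraps its PIF code in `timeSection`.

```java
/** Hypothetical accumulator for the share of run time spent in one section. */
final class SectionTimer {
    private long sectionNanos;
    private long totalNanos;

    /** Runs `section` (e.g. the PIF code) and adds its duration to the section total. */
    void timeSection(Runnable section) {
        long t0 = System.nanoTime();
        section.run();
        addSection(System.nanoTime() - t0);
    }

    /** Adds an already measured section duration. */
    void addSection(long nanos) {
        sectionNanos += nanos;
    }

    /** Adds the duration of a whole turn, measured by the caller. */
    void addTurn(long nanos) {
        totalNanos += nanos;
    }

    /** Fraction of total run time spent in the section, or 0 if nothing was recorded. */
    double share() {
        return totalNanos == 0 ? 0.0 : (double) sectionNanos / totalNanos;
    }
}
```

Summing over many battles and reporting `share()` at the end should be less sensitive to machine load than raw wall-clock totals, although the ratio still varies from run to run.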