Great and very detailed article!

Thanks for your work! This may be one of the briefest material about PIF, and could be one step toward popularizing melee battles, like the well-known GF tutorial & WS tutorial. 


Anyway, two questions: 

1. Is this algorithm faster than simply looping through all the scans before bft, in real world? Since either looping through all of them or branching for the closer need the entire memory of scans in bft to be loaded in cache to perform very good, and in cases that the entire BFT scans is not suited in cache, there will be badly several cache misses which slows the entire process down, if you don’t access memory in a pretty predictable manner. Anyway, this is only guess, so benchmark is still needed. 

2. If I understood this algorithm correctly, it requires scans at time s + 0, s + 1, ... , s + bft to be all stored in array, being continuous. But in melee if we only store actual scans, looping through all the required scans is already getting a very small constant factor. If we instead store every scans and interpolated scans, it maybe harder for the CPU to access that data — e.g. cache misses because the increased size. So again, comparing to inconsistent scan version, is this algorithm still faster in real world? 


Btw, in the pseudo code I could see that the data is actually stored in objects, which is accessable from an array of pointers — then one step of it may yield a cache miss — meanwhile the loop through contiguous memory may yield no cache miss, although with more read operations from CPU cache.