Interval Heap vs MinHeap
The highlighted comment was created in this revision.
I'm curious, why did you go with the MinHeap instead of the IntervalHeap for your path ordering? Surely IntervalHeap can be kept pruned and small, resulting in less overhead?
I hadn't actually thought to try that. The IntervalHeap is something I added specifically when I was thinking about implementing to search iterator, rather than optimizing the search steps themselves.
I just tried it right now. Turns out that the more complex procedure to maintain the IntervalHeap's ordering adds more cost than is saved by keeping the path ordering queue pruned, at least for this size of tree in the 'standard' data set.
I tried two seperate approaches of using the IntervalHeap to keep that queue pruned:
- Remove any paths which are now known to be too distant, whenever we finish processing a leaf node (aka, always keep it pruned as much as possible)
- When we're adding a new path to the queue, if the furthest path in the queue is too distant, replace it to avoid growing the heap (aka, don't actively prune it, merely don't grow it if we can avoid it)
Both approaches performed worse overall than the unpruned MinHeap. Even if I do a run with JIT warmup allowed to compensate for the IntervalHeap's more complex code, it still performs slightly worse on average. When the JIT warmup is allowed however, the "worst case" time appears to improve though, just not the average.