Effectiveness
← Thread:Talk:Variable bandwidth/Effectiveness/reply (2)
I tried something like this for gun, but it didn't really work out. I ended up getting much better results using a square kernel. From Wikipedia, if you assume that your distribution is Gaussian then the optimal bandwidth would be:
- <math>h = \left(\frac{4\hat{\sigma}^5}{3n}\right)^{\frac{1}{5}} \approx 1.06 \hat{\sigma} n^{-1/5}</math>, where <math>\hat{\sigma}</math> is the standard deviation of the samples and <math>n</math> is the number of samples.
Perhaps this would work well for movement, where there is much less data to work with. It might also be necessary to add some sanity checks to the calculated h value in case there is only 1 or 2 samples, etc.
Of course, I'm fairly sure our distributions are not at all Gaussian, or even uni-modal, so this formula might not be relevant at all.
I've considered something like this before but, when you have less than 20 samples or so, your estimate of standard deviation itself is going to have a large amount of uncertainty. I suspect one would need to determine the "typical" standard deviation for most bots, use that as the initial value, and slowly transition to a value calculated from the data.
Regarding the distribution not being gaussian, indeed it wouldn't be... but I think that formula may still be somewhat roughly applicable if we apply a correction factor for how the uncertainty gets smaller near the maximum escape angle, perhaps modeled off of the behavior of the binomial distribution near the edges.
You do not have permission to edit this page, for the following reasons:
You can view and copy the source of this page.
Return to Thread:Talk:Variable bandwidth/Effectiveness/reply (7).
Hmm... at least with sufficient data I'd agree that applying bandwidth estimation separately to different clusters would make sense. The cases where the appropriate bandwidth changes most significantly would be when there's limited data though (each additional data point doesn't change uncertainty as much once there are already many data points).