Reason behind using Manhattan distance
That was quite a while ago :-) But I know I tested a lot of different distance functions, including exotic things like multiplicative and log-based, and Manhattan worked best. I'm fairly sure I used Euclidean with a sqrt on the squared distance.
Having a gun that is different from what people expect is helpful, since the tuning they do doesn't affect you as much. That is my guess as to why Manhattan worked best for me.
Just had a thought about DrussGT's hundreds of random VCS bins and Manhattan distance:
Suppose we have an infinite number of random VCS buffers (random bin sizes and dimensions, weighted equally, no decay). Then increasing the distance by 1 in one dimension decreases by "1" the number of buffers that still contain that data point (its data weight).
If the distance increases by 1 in dimension A and also by 1 in dimension B, the data weight decreases by 1 + 1 = 2, which is exactly how Manhattan distance behaves.
If we use Manhattan distance together with kNN and decrease the weight linearly with distance, it should yield results similar to random VCS.
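To make the argument above concrete, here is a rough simulation sketch. It is not DrussGT's code: the names `make_buffers` and `shared_weight`, the two dimensions, the bin widths in [0.5, 1.0] and the step of 0.02 are all assumptions picked for illustration. It builds many random buffers, counts how many still put two points in the same bin, and compares the weight lost by moving in A and B together against the sum of the losses from moving in each alone.

```python
import math
import random

def make_buffers(n_buffers, n_dims, rng):
    """Build random buffers; each segments a random subset of the dimensions
    with a random bin width and a random bin offset (all hypothetical choices)."""
    buffers = []
    for _ in range(n_buffers):
        dims = rng.sample(range(n_dims), rng.randint(1, n_dims))
        segments = []
        for d in dims:
            width = rng.uniform(0.5, 1.0)
            offset = rng.uniform(0.0, width)
            segments.append((d, width, offset))
        buffers.append(segments)
    return buffers

def shared_weight(a, b, buffers):
    """Fraction of buffers in which points a and b land in the same bin."""
    shared = 0
    for segments in buffers:
        if all(math.floor((a[d] + off) / w) == math.floor((b[d] + off) / w)
               for d, w, off in segments):
            shared += 1
    return shared / len(buffers)

rng = random.Random(1)  # fixed seed so the run is repeatable
buffers = make_buffers(100_000, 2, rng)

origin = (0.5, 0.5)
step = 0.02  # a small distance increment

# Weight lost when moving in A only, in B only, and in both at once.
loss_a = 1.0 - shared_weight(origin, (origin[0] + step, origin[1]), buffers)
loss_b = 1.0 - shared_weight(origin, (origin[0], origin[1] + step), buffers)
loss_ab = 1.0 - shared_weight(origin, (origin[0] + step, origin[1] + step), buffers)

print("loss A only:", loss_a)
print("loss B only:", loss_b)
print("loss A and B:", loss_ab, "vs sum of single losses:", loss_a + loss_b)
```

Under these assumptions the combined loss should come out close to the sum of the two single-dimension losses, which is the Manhattan-like behaviour described above. The match is only approximate: a buffer that segments on both A and B can only drop the data point once, so the additivity holds best for small distances.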
However, once a rolling average (decay) is used, things get quite different...