Thread:Talk:DrussGT/Understanding DrussGT/Reason behind using Manhattan distance/reply (8)
Latest revision as of 17:15, 28 August 2018
I think it is due to noise rejection. For me, it comes down to the ratio between how a small change across many dimensions is weighted compared to a big change in a single dimension, as you demonstrated above. You can also think of it as the difference between L1 and L2 distance and how each would affect a minimization problem. L1 rejects large noise, and is the most robust you can get while still keeping the search space convex. L2, taken as a squared-error cost, has a gradient that grows with the distance, so dimensions with more error are effectively weighted higher, and by more than just in proportion to the amount of error.
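To make that ratio concrete, here is a minimal sketch in Python. The four-dimensional points, the function names, and the error values are made up for illustration; they are not DrussGT's actual attributes, weights, or code. It compares a small change in every dimension against one big change in a single dimension, and then shows how the per-dimension gradient of an absolute-error (L1) cost differs from that of a squared-error (L2) cost.

```python
import math

def manhattan(a, b):
    """L1 distance: sum of absolute per-dimension differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """L2 distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def l1_grad(err):
    """Gradient of |err| with respect to err: constant magnitude."""
    return 0.0 if err == 0 else math.copysign(1.0, err)

def l2_grad(err):
    """Gradient of err**2 with respect to err: grows with the error."""
    return 2.0 * err

# Hypothetical 4-dimensional points (illustrative values only).
origin = [0.0, 0.0, 0.0, 0.0]
spread = [0.25, 0.25, 0.25, 0.25]  # small change in every dimension
spike = [1.0, 0.0, 0.0, 0.0]       # big change in one dimension

m_spread = manhattan(origin, spread)
m_spike = manhattan(origin, spike)
e_spread = euclidean(origin, spread)
e_spike = euclidean(origin, spike)

print("Manhattan:", m_spread, m_spike)  # both points are equally far
print("Euclidean:", e_spread, e_spike)  # the single spike is twice as far
print("L1 gradient at 0.1 and 10:", l1_grad(0.1), l1_grad(10.0))
print("L2 gradient at 0.1 and 10:", l2_grad(0.1), l2_grad(10.0))
```

With these numbers, Manhattan distance treats the two points as equally far away, while Euclidean distance puts the single-dimension spike twice as far as the spread-out change. In other words, under L2 one noisy dimension can dominate the comparison, which is the effect I mean by L1 being better at rejecting noise.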