Reason behind using Manhattan distance

Suppose there are 3 data points:

1 reference data point:

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

And 2 data points in the database:

[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] (Euclidean distance = 3.87, Squared Euclidean distance = 15, Manhattan distance = 15)

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4] (Euclidean distance = 4, Squared Euclidean distance = 16, Manhattan distance = 4)

If noise changes a single 0 into a 4, that one dimension adds 16 to the squared Euclidean distance but only 4 to the Manhattan distance, a 4x larger effect. As a result, Euclidean distance picks the first point as the nearest, while Manhattan distance picks the second.
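The arithmetic above can be checked with a minimal sketch. The function and variable names (`manhattan`, `squared_euclidean`, `euclidean`, `all_ones`, `one_outlier`) are made up for illustration and are not taken from DrussGT's code; the data points are the ones listed above.

```python
import math

def manhattan(a, b):
    # Sum of absolute per-dimension differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def squared_euclidean(a, b):
    # Sum of squared per-dimension differences.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(squared_euclidean(a, b))

reference = [0] * 15
# Hypothetical labels for the two database points from the example.
candidates = {
    "all_ones": [1] * 15,           # slightly off in every dimension
    "one_outlier": [0] * 14 + [4],  # exact match except one noisy dimension
}

for name, point in candidates.items():
    print(name,
          round(euclidean(reference, point), 2),
          squared_euclidean(reference, point),
          manhattan(reference, point))

# Nearest neighbour under each metric.
nearest_euclidean = min(candidates, key=lambda k: euclidean(reference, candidates[k]))
nearest_manhattan = min(candidates, key=lambda k: manhattan(reference, candidates[k]))
print(nearest_euclidean, nearest_manhattan)
```

Taking the square root does not change which point is nearest, so squared Euclidean distance picks the same neighbour as Euclidean distance.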

MN (talk) 17:51, 28 August 2018

This is a good demonstration! Euclidean distance is sensitive to outliers and prefers a point that is moderately off in every dimension over a good point with noise in a few dimensions.

Xor (talk) 03:03, 29 August 2018
 

Shouldn't you be adding that +1 to the x value before squaring?

Euclidean distance = sqrt( (x+1)^2 )

Manhattan distance = | x+1 |

MN (talk) 18:10, 28 August 2018

My case is noise in another dimension ;)

However, if the noise is added to the main dimension, it will be

sqrt((1 + x)^2 + 1)

vs

|1 + x| + 1

and if we put the two curves together (shifted so that they intersect at x=0):

http://robowiki.net/w/images/5/5a/C3BD3E15-EEB6-4F63-826F-7C1F5E54A78E.gif

Euclidean looks terrible with large noise in one dimension, and Manhattan looks robust.
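For readers who cannot load the image, here is a rough sketch that tabulates the two expressions above. It assumes that "shifted" means subtracting each curve's value at x=0 so that both pass through zero there; that is a guess at how the plot was made, and the function names are made up for illustration.

```python
import math

def euclidean_curve(x):
    # sqrt((1 + x)^2 + 1), as written in the post above.
    return math.sqrt((1 + x) ** 2 + 1)

def manhattan_curve(x):
    # |1 + x| + 1, as written in the post above.
    return abs(1 + x) + 1

def shifted(curve, x):
    # Assumption: shift vertically so the curve is 0 at x = 0.
    return curve(x) - curve(0)

for x in [-4, -2, -1, 0, 1, 2, 4, 8]:
    e = shifted(euclidean_curve, x)
    m = shifted(manhattan_curve, x)
    print(f"x={x:>3}  euclidean={e:7.3f}  manhattan={m:7.3f}")
```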

Xor (talk) 02:53, 29 August 2018