Mahalanobis Distance

Jump to navigation Jump to search

I tried. Found 2 problems in using mahalanobis distance instead of euclidean distance.

First problem is finding a covariance matrix, which is the mahalanobis equivalent of a weight vector. I tried using sampled covariance matrixes, but they are not always valid. You need a positive semi-definite covariance matrix to use it in mahalanobis distance calculations. There are methods for finding a nearest positive semi-definite matrix from a sampled matrix, but the methods are too slow to be calculated online.

Second problem is optimizing KNN search. Methods like kd-trees only work with euclidean distance. I don´t know of any data structure designed for mahalanobis distances and which supports online learning at the same time.

The first problem can be addressed with genetic tuning to find a valid covariance matrix offline, but I didn´t try it yet. In theory it can be stronger than euclidean distance search due to higher amount of weights to tune.

The second problem is not a serious problem if the data set is not too big. I managed to make a bot work using only real waves (no tick waves), a sampled covariance matrix and simple linear KNN search without skipping turns.

The problem is the matrix was sometimes valid, sometimes not. It fallbacked to euclidean distance (using only the main diagonal) whenever the matrix became invalid. The bot performance decreased and I never uploaded it to the rumble. But again, genetic tuning with an offline covariance matrix may fix it.

MN21:53, 19 October 2012

You do not have permission to edit this page, for the following reasons:

  • The action you have requested is limited to users in the group: Users.
  • You must confirm your email address before editing pages. Please set and validate your email address through your user preferences.

You can view and copy the source of this page.

Return to Thread:User talk:AW/KNN/Mahalanobis Distance/reply (7).