Difference between revisions of "User:Skilgannon/Free Code"

Revision as of 07:32, 9 April 2010

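This is a neat method that I made up. It takes an array of 'indexes' between 0 and max and returns a double between 0 and 1 describing how 'clustered' the array is: 1 if all the values are the same, approaching 0 as more and more values are spread perfectly evenly. (With the + 1 offsets used in the first version below, the even-spread limit is actually (2*max + 1)/((max + 1)*(max + 1)), which is close to 0 only when max is large.) Note that this is very different from a standard deviation calculation. There can be as many 'dense' points on the graph as you want, and the method won't try to measure them all against one mean. Instead, it sums the squares of the gaps between neighbouring sorted values, including the wrap-around gap from the largest value back to the smallest, and relies on the fact that (d + 1)*(d + 1) is always greater than (d + 1) for any d > 0, so a few large gaps add up to more than many small ones.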

 
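       // Assumes indexes is non-empty and every value lies between 0 and max.
       public static double clustering(float[] indexes, float max){
      
         // Sort a copy so the caller's array is left untouched.
         float[] sorted = new float[indexes.length];
         System.arraycopy(indexes,0,sorted,0,indexes.length);
         java.util.Arrays.sort(sorted);
         
         // Wrap-around gap from the largest value back to the smallest, plus 1.
         double clustering = sorted[0] + max 
            - sorted[sorted.length - 1] + 1;
         clustering *= clustering;
         
         // Add the square of (gap + 1) for each pair of sorted neighbours.
         for(int i = 1; i < sorted.length; i++){
            double diff = sorted[i] - sorted[i-1] + 1;
            clustering += diff*diff;
         }      
         // Normalise so that an array of identical values returns exactly 1.
         return (clustering - sorted.length + 1)/((max + 1)*(max + 1));
      	
      }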
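
As a rough sanity check of the method above, the sketch below copies it into a hypothetical ClusteringDemo class and runs it on three made-up arrays with max = 10. The class name and the sample values are assumptions for illustration only; the body of clustering() is the same calculation as on this page.

```java
import java.util.Arrays;

// Hypothetical demo class; only the clustering() calculation comes from the page.
final class ClusteringDemo {

    static double clustering(float[] indexes, float max) {
        // Sort a copy so the caller's array is left untouched.
        float[] sorted = indexes.clone();
        Arrays.sort(sorted);

        // Wrap-around gap from the largest value back to the smallest, plus 1.
        double clustering = sorted[0] + max - sorted[sorted.length - 1] + 1;
        clustering *= clustering;

        // Add the square of (gap + 1) for each pair of sorted neighbours.
        for (int i = 1; i < sorted.length; i++) {
            double diff = sorted[i] - sorted[i - 1] + 1;
            clustering += diff * diff;
        }
        return (clustering - sorted.length + 1) / ((max + 1) * (max + 1));
    }

    public static void main(String[] args) {
        float max = 10f;
        float[] same = {5f, 5f, 5f, 5f};        // one dense point
        float[] twoGroups = {2f, 2f, 8f, 8f};   // two dense points
        float[] even = {0f, 2.5f, 5f, 7.5f};    // evenly spread

        System.out.println("same:      " + clustering(same, max));
        System.out.println("twoGroups: " + clustering(twoGroups, max));
        System.out.println("even:      " + clustering(even, max));
    }
}
```

With these inputs the identical values score exactly 1, the two tight groups score 73/121 (about 0.60), and the evenly spread values score 46/121 (about 0.38), so the ordering matches the description even though four evenly spread values are still far from 0.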

Alternatively, if you know your step size is always greater than 1 (e.g. if you are using the indexes for logging to a set of bins), you can use the following code, which should be slightly quicker:

 
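       // Assumes indexes is non-empty and every value lies between 0 and max.
       public static double clustering(float[] indexes, float max){
         // Sort a copy so the caller's array is left untouched.
         float[] sorted = new float[indexes.length];
         System.arraycopy(indexes,0,sorted,0,indexes.length);
         java.util.Arrays.sort(sorted);
         
         // Wrap-around gap from the largest value back to the smallest.
         double clustering = sorted[0] + max 
            - sorted[sorted.length - 1];
         clustering *= clustering;
         
         // Add the squared gap for each pair of sorted neighbours.
         for(int i = 1; i < sorted.length; i++){
            double diff = sorted[i] - sorted[i-1];
            clustering += diff*diff;
         }      
         // Identical values leave only the wrap-around gap, giving max*max/(max*max) = 1.
         return clustering/(max*max);
      	
      }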
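
For the binned case, here is a small sketch of how the quicker variant behaves. It assumes that max is passed as the number of bins (12 here, with bin indexes 0 to 11); that convention, the BinClusteringDemo class name, and the sample data are all assumptions made for illustration, not something this page specifies.

```java
import java.util.Arrays;

// Hypothetical demo class for the variant without the + 1 offsets.
final class BinClusteringDemo {

    static double clustering(float[] indexes, float max) {
        // Sort a copy so the caller's array is left untouched.
        float[] sorted = indexes.clone();
        Arrays.sort(sorted);

        // Wrap-around gap from the largest value back to the smallest.
        double clustering = sorted[0] + max - sorted[sorted.length - 1];
        clustering *= clustering;

        // Add the squared gap for each pair of sorted neighbours.
        for (int i = 1; i < sorted.length; i++) {
            double diff = sorted[i] - sorted[i - 1];
            clustering += diff * diff;
        }
        return clustering / (max * max);
    }

    public static void main(String[] args) {
        float bins = 12f; // assumption: max is the bin count, indexes run 0..11

        float[] oneBin = {4f, 4f, 4f};                 // every hit in one bin
        float[] twoClusters = {1f, 1f, 2f, 7f, 7f, 8f}; // hits around bins 1 and 7
        float[] everyThird = {0f, 3f, 6f, 9f};         // 4 evenly spaced hits
        float[] everySecond = {0f, 2f, 4f, 6f, 8f, 10f}; // 6 evenly spaced hits

        System.out.println("oneBin:      " + clustering(oneBin, bins));
        System.out.println("twoClusters: " + clustering(twoClusters, bins));
        System.out.println("everyThird:  " + clustering(everyThird, bins));
        System.out.println("everySecond: " + clustering(everySecond, bins));
    }
}
```

Under that assumption, n evenly spaced hits score 1/n (0.25 for four hits, 1/6 for six), so this variant does head towards 0 as evenly spread values are added, while all hits in one bin still score 1.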
