User talk:Rednaxela/FastTrig
Sound good! But, will it skipped turns at first tick of the first round? » Nat | Talk » 10:41, 3 March 2009 (UTC)
It shouldn't. It will always be as fast or faster than standard trig functions at least so long as you do the following:
- 'inline' the index calculations like was said and used in the example testfor
- Run init() BEFORE the battle begins (i.e. in static initialization code of your bot class)
An example of that static initialization code is like this:
public class SomeBot extends AdvancedRobot { static { FastTrig.init(); } public void run() { // Not in here } {
The initialization code doesn't take long (0.0017 seconds on my computer at 2880 divisions) but just to be sure it doesn't hurt to place the initialization outside of where the bot is timed --Rednaxela 16:27, 3 March 2009 (UTC)
Contents
My take
Nice work. From what I remember Simonton did something similar. I'm not sure if he had any luck though. One thing I noticed, I'm not sure if it works properly for negative angles, you will be rounding the wrong way. Also, the most useful trig function would be atan2. --Skilgannon 20:43, 3 March 2009 (UTC)
Ahh yes, sine doesn't work right at all yet for negative numbers. Cosine does work correctly except for rounding. I'll fix this now. :) --Rednaxela 21:51, 3 March 2009 (UTC)
Bugs
I was going to change the User:Rednaxela/FastTrig to fix the bug that Skilgannon said, and another one that happens with large DIVISION numbers and saw your edit, I haven't tested the fixed version, but I think it is still buggy. I think it should be:
public static final double sin(double value) { return sineTable[(int)(((value/K + 0.5) % DIVISIONS + DIVISIONS)%DIVISIONS)]; } public static final double cos(double value) { return cosineTable[(int)(((value/K + 0.5) % DIVISIONS + DIVISIONS)%DIVISIONS)]; }
I'm testing it with 62832 DIVISIONS, and much bigger loops than yours and is much faster. Here is the test I used:
package ags.util; public class FastTrigTest { public static void main(String[] args) { FastTrig.init(); double v=0; long ms; int A = 10000; int B = 1000; double absolute = 0; for ( int i = 0; i < A; ++i ) for ( int j = 0; j < B; ++j ) { double angle = Math.random() * Math.PI * 200 - Math.PI * 100; absolute = Math.max(absolute, Math.abs(Math.sin(angle) - FastTrig.sin(angle))); absolute = Math.max(absolute, Math.abs(Math.cos(angle) - FastTrig.cos(angle))); } System.out.printf("Wrose error: %.12f\n", absolute); v=0; ms = -System.nanoTime(); for ( int i = 0; i < A; ++i ) for ( int j = 0; j < B; ++j ) v += FastTrig.sin(i * j * Math.PI - Math.PI / 3); ms += System.nanoTime(); System.out.printf("FastTrig time: %.2f seconds\n", ms / 1E9); v = 0; ms = -System.nanoTime(); for ( int i = 0; i < A; ++i ) for ( int j = 0; j < B; ++j ) v += Math.sin(i * j * Math.PI - Math.PI / 3); ms += System.nanoTime(); System.out.printf("Math time: %.2f seconds\n", ms/1E9); } }
It also checks the error, so it was easy to the bugs. But anyway, including this will be a priority for me now :). Good work. --zyx 22:15, 3 March 2009 (UTC)
Thanks! Well, my +40*DIVISIONS one DOES work unless the angle is extremely negative, to a magnitude you wouldn't generally expect to see, but your double-modulus solution is more robust. Now making an updated version that includes that increased robustness, and atan(), and maybe atan2(). Will update the page when that's ready. --Rednaxela 00:05, 4 March 2009 (UTC)
Also discovered speed was further increased by using value*K instead of value/K and adjusting the constant to compensate, and also that the cosine table is redundant. Using the sine table for cosine with a shifted index doesn't affect speed but decreases init time and memory use. --Rednaxela 00:12, 4 March 2009 (UTC)
Hi, I have some problem. I do create new function:
public static final double tan(double value) { return FastTrig.sin(value) / FastTrig.cos(value); }
... then I add following line to test ...
absolute = Math.max(absolute, Math.abs(Math.tan(angle) - FastTrig.tan(angle)));
... then, I run, result in: Worst error: Infinity! What wrong? » Nat | Talk » 03:47, 5 March 2009 (UTC)
Well Nat, at some angles cosine is equal to 0. Due to this you can't simply divide those. You should create a seperate table for tangent and use that and that would be faster anyway. --Rednaxela 04:11, 5 March 2009 (UTC)
Ahh, I forgot. I try your new way, too, but it still have a much of error, I don't understand why! » Nat | Talk » 04:38, 5 March 2009 (UTC)
What about the fact that tangent gives undefined at certain values? The error-checking code isn't designed the cope with that because (Undefined - Undefined) isn't equal to 0. --Rednaxela 04:49, 5 March 2009 (UTC)
Really? because I make all with high diff (over than 1) out, and all is number! Note that some calculator (like Google) return 1.6.....E16 for tan(PI / 2), and my machine return that, too! I don't know because trig function are java native method so it is platform dependent. » Nat | Talk » 05:06, 5 March 2009 (UTC)
Ahh. At PI / 2 it is undefined in actual math, HOWEVER it seems that beause computer can't quite perfetly represent PI/2 in floating point binary, it rounds to a number just SLIGHTLY smaller, which makes the result of the function simply become a very-large-number. Which come to think of it reminds me of why the error would be so high for tangent: Tangent by nature includes some very large values which mean error is naturally similarly larger. Really, "absolute error" shouldn't be what we measure, instead "angle error" is more relevant in reality with the kinds of geometry we'd be doing in robocode. --Rednaxela 05:16, 5 March 2009 (UTC)
OK, I understand. One more I just want to know, can we create an atan map? It seem that sin,cos and atan is most use trig function in robocode. » Nat | Talk » 06:17, 5 March 2009 (UTC)
I've pondered atan, but it's tricky because it doesn't repeat, and I'm not sure how much aproximation is permissiable near the edges --Rednaxela 06:53, 5 March 2009 (UTC)
Arc Cosine Implementation
I've made a modified version that also includes acos, because that's used in so many applications when the cos rule is used, eg non-iterative wall smoothing, non-iterative wall angle, and my linear play-it-forward function. Because the function is so steep at the edges I had to increase the number of bins substantially, but because it is non-repeating it executes extremely quickly. Just doing a linear sweep, I'm getting execution 80X faster than Math.acos(). My max error is around 0.001, and the average is somethingE-5.
public static final int acos_DIVISIONS = 131072;//uses 1 MB of memory public static final double acosK = DIVISIONS/2; public static final double[] acosTable = new double[acos_DIVISIONS + 1]; static { for(int i = 0; i < acos_DIVISIONS; i++){ acosTable[i] = Math.acos(i/acosK - 1); } } public static final double acos(double value){ return acosTable[(int)((value + 1)*acosK + 0.5)] ; }
--Skilgannon 21:22, 8 April 2009 (UTC) An improvement to acos:
public static final double acos(double value) { return acosTable[(int)(value*acosK + (acosK + 0.5))] ; }
This requires one less addition (and one less loading of a double literal) --Starrynte 01:10, 10 April 2009 (UTC)
Nice! I didn't see that! --Skilgannon 01:21, 10 April 2009 (UTC)
Ah! I just realized: shouldn't it be public static final double acosK = acos_DIVISIONS/2;
, and not public static final double acosK = DIVISIONS/2;
? Or is this some Performance Enhancing Bug :) --Starrynte 18:09, 10 April 2009 (UTC)
Arc Tangent Implementation
Here are some (hopefully) speed optimizations, mainly very minor ones, based on the code posted on main page. (However, tiny optimizations add up!)
public static final void init() { //for(int i=DIVISIONS-1;i>=0;i++) { (comparing i to 0 is faster but this is a very minor thing for (int i=0; i<DIVISIONS; i++) { sineTable[i] = Math.sin(i/K); //no need to assign "value" to a variable first } }
You may want to consider making DIVISIONS a power of two (maybe 8192), since then you can replace "% DIVISIONS" (modulo DIVISIONS) with "& (DIVISIONS - 1)" (bitwise AND with (DIVISIONS-1), which is the same thing as modulo DIVISIONS). Not sure if AND is necessarily faster, you may need to check.
Also in main(), "Wrost error" should be "Worst error"
Here is a atan2 method that I made based on what I read on the internet. The catch is the Math.sqrt, perhaps it would be possible to make a sqrt table. (Uses acos method posted by Skilgannon above.)
public static final double atan2(double x, double y){ //return asin(y / Math.sqrt(x*x + y*y)); //robocode angles are different, if i understand correctly you just need to swap usage of sines and cosines for robocode-style angles return acos(y / Math.sqrt(x*x + y*y)); }
Information from http://www.analyzemath.com/Calculators/Rect_Polar.html Please correct me if I'm wrong in how to use trig in robocode. --Starrynte 00:47, 9 April 2009 (UTC)
I don't really know, but it seem that we need to swap y with x, too. And that atan2 implementations, I think using native hava method is faster since it need to call sqrt, which is slow. » Nat | Talk » 01:02, 9 April 2009 (UTC)
I've done some tests, and the custom atan2 seems to be slightly faster, with a sacrifice in init time and memory (due to sqrt and acos tables):
FastTrig init time: 0.07575 seconds FastTrig.atan2(5000,5000) = 0.7852348917896085 Took 0.00003 seconds Math.atan2(5000,5000) = 0.7853981633974483 Took 0.00005 seconds
(The maximum values that can be used with this atan2 is (5000,5000) , since that's as far as I set the sqrt tables to go up to. Likewise, the maximum value for a sqrt is 50000000)
I will post code as soon as I confirm it is working
--Starrynte 16:59, 9 April 2009 (UTC)
public static final double atan2(double x, double y){ //return Math.asin(y / Math.sqrt(x*x + y*y)); //robocode angles are different, swap usage of sines and cosines return acos(y / Math.sqrt(x*x + y*y)) * (y>=0?1:-1); }
Basically same thing as above, except the sign thing (Math.signum(y) is slower than just (y>=0?1:-1)). It turns out that the FastTrig acos outweighs the time needed for Math.sqrt (where tables were slower or way too large for the accuracy needed). Compared it with using atan2 in robocode and it works. About 2 to 4 times faster than Math.atan2().
I estimate (haven't tested) the worst error is about 0.05 degrees, which, in the worst case (when you're shooting from a corner to the corner diagonally across it on a 5000 x 5000 field, a distance of about 7071 units) gives an error of 6 units. Compare that with a bot width of 36 units.
--Starrynte 17:45, 9 April 2009 (UTC)
Nice stuff Starrynte and Skilgannon! Hmm, the acos and atan2 error is slightly more I'd like considering how many divisions they use but still, not bad! I'll incorporate these things into the main version on the page some time soon. About that power-of-two optimization, I've considered that myself actually though one needs to keep in mind that "& (DIVISIONS - 1)" wouldn't behave the same as modulus for negative numbers --Rednaxela 18:44, 9 April 2009 (UTC)
- cos is an even function, so negative numbers should be fine for cos at least, because "& (DIVISIONS - 1)" is the equivalent of also taking the mod. --Skilgannon 01:21, 10 April 2009 (UTC)
- I blame my high school comp teacher for explaining that the negatives of 2s complement are just like the positives except with a negative bit at the front as 1 instead of 0. I took that to mean that the binary representation was exactly the same for x and -x except with that first bit 1 instead of 0 for -x. I guess not :-/ --Skilgannon 08:07, 10 April 2009 (UTC)
- OK. But couldn't you use it for the second modulus, when you know everything is going to be positive anyways? You just have to cast to int first. --Skilgannon 10:36, 10 April 2009 (UTC)
Here is another slightly more inaccurate version, but may be faster depending on the computer you're using (it's about 5% slower on my computer). It replaces the divide by square root with an inlined multiply by inverse square root (described here http://en.wikipedia.org/wiki/Fast_inverse_square_root)
public static final double atan2(double x, double y){ return acos(y * (1.5 - 0.5 * (x=x*x + y*y)* (x = Double.longBitsToDouble(0x5fe6ec85e7de30daL - (Double.doubleToLongBits(x)>>1)))*x)*x)*(y>=0?1:-1); }
--Starrynte 20:46, 9 April 2009 (UTC)
From atan2 arises atan (up to twice as fast), plus a (slight) optimization in both:
public static final double atan(double value){ //return (value>=0 ? acos(1 / value) : -acos(1 / value)); OK if you approximate "Math.sqrt(value*value + 1)" as "value" return (value>=0 ? acos(1 / Math.sqrt(value*value + 1)) : -acos(1 / Math.sqrt(value*value + 1))); } public static final double atan2(double x, double y){ return (y>=0 ? acos(y / Math.sqrt(x*x + y*y)) : -acos(y / Math.sqrt(x*x + y*y))); }
asin can also be calculated from atan, though not as fast as making a table for it:
Math.asin(x) = atan(x / Math.sqrt(1 - x*x));
Now I wonder if you could make a log table... --User:Starrynte 02:22, 10 April 2009 (UTC)
Just thinking, wouldn't it be:
public static final double atan2(double x, double y){ return (x>=0 ? acos(y / Math.sqrt(x*x + y*y)) : 2*Math.PI -acos(y / Math.sqrt(x*x + y*y))); }
because it is a 'negative' bearing if x is negative not y, and atan2 is meant to return between 0 and 2PI. --Skilgannon 10:10, 10 April 2009 (UTC)
FastTrig Re-create
Just re-create FastTrig class base on the old one and information above. Rename to fast math because square root too. Please correct it if I'm wrong.
public final class FastMath { public static final double PI = 3.1415926535897932384626433832795D; public static final double TWO_PI = 6.2831853071795864769252867665590D; public static final double HALF_PI = 1.5707963267948966192313216916398D; public static final double QUARTER_PI = 0.7853981633974483096156608458199D; public static final double THREE_OVER_TWO_PI = 4.7123889803846898576939650749193D; private static final int TRIG_DIVISIONS = 7200; private static final int TRIG_HIGH_DIVISIONS = 131072; private static final double K = TRIG_DIVISIONS / TWO_PI; private static final double ACOS_K = (TRIG_HIGH_DIVISIONS - 1)/ 2; private static final double TAN_K = TRIG_HIGH_DIVISIONS / PI; private static final double[] sineTable = new double[TRIG_DIVISIONS]; private static final double[] tanTable = new double[TRIG_HIGH_DIVISIONS]; private static final double[] acosTable = new double[TRIG_HIGH_DIVISIONS]; public static final void init() { for (int i = 0; i < TRIG_DIVISIONS; i++) { sineTable[i] = Math.sin(i/K); } for(int i = 0; i < TRIG_HIGH_DIVISIONS; i++){ tanTable[i] = Math.tan(i/TAN_K); acosTable[i] = Math.acos(i / ACOS_K - 1); } } public static final double sin(double value) { return sineTable[(int)(((value * K + 0.5) % TRIG_DIVISIONS + TRIG_DIVISIONS) % TRIG_DIVISIONS)]; } public static final double cos(double value) { return sineTable[(int)(((value * K + 0.5) % TRIG_DIVISIONS + 1.25 * TRIG_DIVISIONS) % TRIG_DIVISIONS)]; } public static final double tan(double value) { return tanTable[(int)(((value * TAN_K + 0.5) % TRIG_HIGH_DIVISIONS + TRIG_HIGH_DIVISIONS) % TRIG_HIGH_DIVISIONS)]; } public static final double asin(double value) { //return atan(x / Math.sqrt(1 - x*x)); return HALF_PI - acos(value); } public static final double acos(double value) { return acosTable[(int)(value*ACOS_K + (ACOS_K + 0.5))]; } public static final double atan(double value) { return (value >= 0 ? acos(1 / sqrt(value * value + 1)) : -acos(1 / sqrt(value * value + 1))); } public static final double atan2(double x, double y) { return (y >= 0 ? acos(y / sqrt(x*x + y*y)) : -acos(y / sqrt(x*x + y*y))); } public static final double sqrt(double x){ return Math.sqrt(x); //return x * (1.5d - 0.5*x* (x = Double.longBitsToDouble(0x5fe6ec85e7de30daL - (Double.doubleToLongBits(x)>>1) )) *x) * x; } }
I'm quite sure both acos and tan need a higher number of divisions, otherwise they get a lot of error due to the steepness of their functions. Your asin function looks right to me, although I cleaned it a bit by calling acos. I also changed the tan because it has a period of PI, not 2PI, so now it's more accurate, although yours did work. I think a lookup table with a single Newton's iteration might be faster and more accurate for sqrt, but that's just speculation. --Skilgannon 10:01, 10 April 2009 (UTC)
- Thanks. The reason for not calling acos is to redice overhead, although it is a little. I don't really know about sqrt, just copy over from above. Anyway, my code is not yet test in anyway. » Nat | Talk » 11:36, 10 April 2009 (UTC)
- Math.sqrt(value) is probably faster, since the divide is expensive. If you can find a quick way of doing Double.longBitsToDouble and Double.doubleToLongBits, then it may be worth considering doing it that way (which, by the way, I edited slightly)
By the way, expect a slightly faster sine and cosine soon *without* using any tables at all! (Once I find the source of robocode's normalRelativeAngle) --Starrynte 17:34, 10 April 2009 (UTC)
- OK, if you're fine with the worst error being 0.001090292603 instead of 0.000436313959, this version of sine is 20-30% faster AND uses no tables at all (cosine coming up soon)
public static final double sin(double value) { //normalRelativeAngle =) value=(value %= (2*Math.PI)) < 0.0D ? value < (-Math.PI) ? value + (2*Math.PI) : value : value >= Math.PI ? value - (2*Math.PI) : value; //approximate sine through a parabola (more details at http://lab.polygonal.de/2007/07/18/fast-and-accurate-sinecosine-approximation/) return 0.225 * ( ( value*= (4/Math.PI) * (1 - abs(value)/Math.PI) ) * abs(value) - value) + value; }
--Starrynte 18:00, 10 April 2009 (UTC)