CPU benchmark advice

Revision as of 10 September 2011 at 15:42.

Say, any of you Robocoders have a fast quad-core machine (like Core i5/i7 or comparable) and feel like advising me? I'm considering buying a Core i7 (2600k) quad-core box that would mainly (for now) be for Robocode. But I'm wondering how much of a speed increase this will offer me.

  • How long does a minimized 35-round battle of Diamond vs itself take? (Maybe run one then "Restart", if that helps JIT things up...) I'd need to know the Diamond version, Robocode version, and what CPU you've got to make full sense of that info.
  • How much of a speed hit do you take per battle when running 4-threaded RoboResearch? I.e., if a given battle takes 60 seconds when you run single-threaded, does it still take 60 seconds when you run 4 Robocodes, or how much of a hit does it take?

This would be a huge, geeky indulgence, so I'd love to get some idea what I'd be getting for my money if I actually pull the trigger. =) Thanks!

    Voidious 20:47, 9 September 2011

    I have an AMD Phenom II X4 @ 3.6 GHz. My AMD is considered slower than, say, a higher-end i7.

    You can see here for an ALU comparison. http://www.tomshardware.com/charts/desktop-cpu-charts-2010/ALU-Performance-SiSoftware-Sandra-2010-Pro-ALU,2408.html

    Mine is closest to the AMD Phenom II X4 975 Black Edition on this chart (Overclocked 965 to 3.6). Since Robocode is math heavy you can see the result each chip gets.

    For a real performance reference, see the amount of rumble I can perform in a given period (4 clients).


    On this chart, the 2600K gets over twice the score of my CPU. 114.30 vs 55.0.

      Chase-san 22:08, 9 September 2011
       

      Well, pretty sure I've done 100k battles in a month, so this tells me it's 2.3x as fast. I probably wouldn't shell out $700 for double the Robocode power, but I'm guessing it's much more of a multiplier than that. Also, I reckon performance could scale differently with simple bots (many of your rumble battles) vs high-end bots, which are surely much more memory-intensive, and thus perhaps not as much sped up by an increase in raw CPU power.

      So I'd still really love to know the time a certain battle takes and how close to linearly your Robocode power scales with # of cores...

        Voidious 22:19, 9 September 2011
         

        I have done about 164,359 battles so far, so about 41,090 per client. I'm about 10 days in, so multiply that by 3 for a full month: only about 123,270 per client. But that CPU is about twice mine in math, so estimate around 250,000 per client over a month, which is 2.5 times yours with a single client (even if you have more than 1 CPU, a client only uses 1 CPU's worth of CPU time). Times 4 for 4 clients equals about 1 million. That comes to about 10 times yours if you only run a single client, or 5 times if you run two.

        That's all things being equal. However, to answer your original question: I do not know the exact amount of time it takes, but it isn't very long. Also, as long as I stick to only 4 threads, the speed is equivalent to running only 1 thread, provided my computer is doing nothing else.

        Because of Intel's Hyper-Threading, you may be able to get away with 5 or 6 threads without much overall hit.

          Chase-san 22:36, 9 September 2011
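
The estimate in the post above can be retraced step by step. What follows is a minimal sketch that uses only figures quoted in the thread (164,359 battles, 4 clients, roughly 10 days, the chart scores 114.30 and 55.0, and Voidious's rough 100k battles per month). The variable names are invented for illustration, and the linear scaling with the ALU score is the post's own assumption, not an established fact.

```python
# Retrace the rumble-throughput estimate with numbers quoted in the thread.
# Assumption (taken from the post, unverified): throughput scales linearly
# with the ALU benchmark score.

total_battles = 164_359        # battles run so far across all clients
clients = 4                    # rumble clients running in parallel
days_elapsed = 10              # "about 10 days in"
alu_2600k = 114.30             # chart score quoted for the 2600K
alu_phenom = 55.0              # chart score quoted for the overclocked Phenom II
voidious_per_month = 100_000   # "pretty sure I've done 100k battles in a month"

per_client = total_battles / clients
per_client_month = per_client * (30 / days_elapsed)
alu_ratio = alu_2600k / alu_phenom
est_2600k_client_month = per_client_month * alu_ratio
est_2600k_four_clients = est_2600k_client_month * clients

print(f"per client so far:            {per_client:,.0f}")
print(f"per client per month:         {per_client_month:,.0f}")
print(f"ALU score ratio:              {alu_ratio:.2f}")
print(f"2600K, 1 client, per month:   {est_2600k_client_month:,.0f}")
print(f"2600K, 4 clients, per month:  {est_2600k_four_clients:,.0f}")
print(f"vs 100k/month, 1 client:      {est_2600k_client_month / voidious_per_month:.1f}x")
print(f"vs 100k/month, 4 clients:     {est_2600k_four_clients / voidious_per_month:.1f}x")
```

The unrounded results land close to the post's rounded "250,000 per client" and "about 1 million" figures, but they are only as good as the linear-scaling assumption.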

          Well, thx for the info. That assumes Robocode power scales linearly with benchmark scores, which is something I don't trust, or I wouldn't even be asking this. :-) And comparing RR client battle count is a very rough estimate, too. (Maybe I've done 150k? Don't remember, and who knows what bots or if that was full time...)

          If anyone wants to serve up some cold hard battle times and single vs 4-thread comparisons, I'd still much appreciate it!

            Voidious 23:13, 9 September 2011
             

            Just in the last week I got an AMD Phenom II X6 1090 at 3.2GHz here. Sure, it's slower per core than a high-end i7 like the 2600K, but on the other hand 1) the CPU is practically half the price of a 2600K, and 2) six cores rather than four is nothing to sneeze at for robocode purposes.

            Running Diamond 1.6.7 versus itself, 35 rounds:

            • 50.265s average (Trials: 49.735s, 43.292s, 51.542s, 53.741s, 50.764s, 52.277s, 52.839s, 47.930s)
            This is without GUI, and including robocode startup time (about 1-2 sec)

            Running Diamond 1.6.7 versus itself, 35 rounds in 2 separate robocode instances:

            • 29.305s per battle
            • 58.611s per instance (Trials: 58.016s, 61.019s, 56.540s, 59.753s, 55.980s, 62.418s, 57.420s, 57.741s)
            Robocode startup time increased to 4 seconds. This would not be a factor in a battle runner which runs multiple battles in the same JVM!

            Running Diamond 1.6.7 versus itself, 35 rounds in 4 separate robocode instances:

            • 16.294s per battle
            • 65.174s per instance (Trials: 65.207s, 66.350s, 67.326s, 66.458s, 62.473s, 62.683s, 65.189s, 65.709s)
            Note: robocode startup itself already seems highly parallel, because startup now took up to 8 seconds for one instance! As such, about 6 seconds of the increased time can be attributed to robocode startup.

            Running Diamond 1.6.7 versus itself, 35 rounds in 6 separate robocode instances:

            • 15.736s per battle
            • 94.417s per instance (Trials: 91.325s, 92.572s, 93.809s, 94.710s, 95.344s, 95.738s, 97.421s)
            The gains seem to flatten out about here. One note: because one instance of robocode on its own uses something like 115% of a core, I should reach a CPU limit at 5-ish instances, not 4-ish, so I suspect I'm hitting a memory bandwidth bottleneck.
              Rednaxela 23:15, 9 September 2011
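
To make the scaling in the post above easier to read, here is a small sketch that turns the reported per-instance times into seconds per battle, speedup over a single instance, and parallel efficiency. The four timings are the averages quoted in the post; the function name `scaling_table` is made up for this example, and the startup overhead mentioned in the post is not subtracted.

```python
# Average wall-clock seconds per instance, as reported for the Phenom II X6,
# keyed by the number of Robocode instances run in parallel.
per_instance = {1: 50.265, 2: 58.611, 4: 65.174, 6: 94.417}

def scaling_table(times):
    """Return rows of (instances, s_per_battle, speedup, efficiency)."""
    base = times[1]  # single-instance time is the reference point
    rows = []
    for n, t in sorted(times.items()):
        per_battle = t / n            # n battles complete in t seconds
        speedup = base / per_battle   # throughput relative to one instance
        rows.append((n, per_battle, speedup, speedup / n))
    return rows

for n, per_battle, speedup, eff in scaling_table(per_instance):
    print(f"{n} instance(s): {per_battle:6.3f} s/battle, "
          f"{speedup:4.2f}x throughput, {eff:4.0%} efficiency")
```

With these inputs the throughput gain is roughly 1.7x at 2 instances, 3.1x at 4, and 3.2x at 6, which is the flattening the post describes.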
               

              Oh yeah, that does remind me, I do have some pretty wicked memory in here. Tuned just so to get maximum speed out of it (which usually, in most things I do, affects the overall feel of speed my computer has more than pure CPU power).

              If I recall, my DDR3 is running at 1600 with timings 7-8-7-20.

                Chase-san 23:43, 9 September 2011

                Quick little note to compare, DDR3 running at 1600 here too, but with 9-9-9-24 timings. Anyway, at 4 threads I don't suspect I'm hitting memory bandwidth bottlenecks, whereas it looks like I may be at 6 threads.

                  Rednaxela 00:03, 10 September 2011
                   

                  The cores all share an L3 cache too... I wonder if it's worth the extra $100 to get 1600 RAM (and the mobo that supports it). I've been buying Macs for the last 5 years, I feel like such a noob examining this kinda thing again. =)

                    Voidious 02:40, 10 September 2011

                    Extra $100 for 1600MHz RAM? My ram only cost $50 for 8GB, and I didn't see notably cheaper in slower ram really. As far as motherboard, mine was a little fancier than some others, but it was only about $115. So... It shouldn't cost $100 extra for 1600MHz ram.

                      Rednaxela 02:47, 10 September 2011
                       

                      Yeah, I'm looking at barebones kits which default to a pretty cheapy motherboard, so most of that was to upgrade to a decent one. Prolly worth it anyway, and while it's not quite Apple-level gouging on memory itself, I guess it's universally true that I should buy/install my own. ;)

                        Voidious 03:51, 10 September 2011
                         
                         

                        That's great info Rednaxela, thanks! Btw, how are you timing the battles so precisely, and measuring JVM startup time? I wasn't expecting 3 decimal places. =)

                        I'm getting 79s / battle single-threaded, 42s / battle dual-threaded on my MacBook Pro (Core 2 Duo 2.8 GHz), just trusting the times output by RoboResearch. So it looks like you're almost 3x as fast, which is pretty darn close to the PassMark scores (6053 vs 2029). So maybe I can hope for 5x as fast with the 2600k after all, which would be fabulous!

                          Voidious 02:07, 10 September 2011
                           

                          For timing I'm just running the *nix command "time ./robocode.sh -nodisplay -battle battles/diamond.battle". For Robocode startup time (including JVM startup but not just that), I'm just roughly estimating by watching the command line output.

                          Here's some fun... I tried using my motherboard's "automatic overclocking" functionality where it autonomously tries to see how high it can clock things, and it decided it could get it up from 3.2GHz to 4.2GHz (+30%). Both Windows and Linux booted fine, so initially I thought it was stable, and it ran one robocode battle at a time fine, but as soon as I tried to run multiple in parallel, the JVM kept crashing and it became apparent that the +30% overclock was not stable despite OSs booting fine. Interesting thing was, the +30% overclocking seemed fine thermally even with the stock cooler, it just had other stability issues.

                          I'm now running a more modest +12.5% CPU overclock, and I got the Diamond versus Diamond runs down to 12.837 seconds-per-battle, running 6 in parallel. This is still with memory running at 1600MHz, so I guess what I was hitting before wasn't purely a memory bottleneck anyway. Also, huh, 22.5% increase from a 12.5% CPU overclock...

                            Rednaxela 03:01, 10 September 2011
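
The "22.5% from a 12.5% overclock" remark in the post above compares throughput, not elapsed time. A short worked sketch, using the two seconds-per-battle figures from the thread (15.736 at stock, 12.837 overclocked, both with 6 instances in parallel); the variable names are illustrative only.

```python
# Seconds per battle with 6 parallel instances, as reported in the thread.
before = 15.736   # stock 3.2 GHz
after = 12.837    # with the +12.5% CPU overclock

throughput_gain = before / after - 1   # extra battles per unit of time
time_saved = 1 - after / before        # reduction in time per battle
clock = 3.2 * 1.125                    # resulting clock speed in GHz

print(f"overclocked clock: {clock:.1f} GHz")
print(f"throughput gain:   {throughput_gain:.1%}")
print(f"time per battle:   {time_saved:.1%} lower")
```

The throughput gain comes out a little above 22%, matching the post, while the time per battle drops by a bit over 18%. Either way the improvement exceeds the clock increase, which is the surprising part the post points out.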
                             

                            I have an Intel 2600K (3.4GHz, 4.4GHz w/ Turbo Boost) and I did a quick benchmark for you. Using Diamond 1.6.8, no GUI, using PowerShell to measure.

                            1 Instance, 35 Rounds:

                            - 48.1 seconds Total
                            

                            2 Instances, 35 Rounds:

                            - 47.28 seconds Total
                            - 24.17 seconds per Battle
                            

                            4 Instances, 35 Rounds:

                            - 1:05 Minutes Total
                            - 15.2 seconds per Battle
                            

                            8 Instances, 35 Rounds:

                            - 1:31 Minutes Total
                            - 11.37 seconds per Battle
                            

                            Though I noticed that Powershell had a small delay between creating each instance, not sure why. I haven't had a look at RoboResearch, so maybe I'll have a look at that later.

                            Not sure if it's just my benchmark setup, but if Rednaxela could send over the benchmark setup, maybe I'll be able to test it in the same way.

                              Cuoq 09:25, 10 September 2011
                               

                              Cool, thanks Cuoq! I wouldn't have expected hyperthreads to help so much to scale beyond 4 threads. So it looks like up to 4x as much Robocode throughput as I have now is a pretty solid estimate. Now I just need to grapple with whether that's worth a few hundred bucks... =)

                              I re-ran some tests here with Diamond 1.6.7 on a fresh Robocode 1.7.3.2, using the time command like Rednaxela. I'm seeing about 57s/battle when I run one at a time and 39s/battle when I run 2 in parallel (just duplicating the command, adding &, taking the max elapsed). I forgot my RoboResearch dir had a cranked CPU constant.

                                Voidious 16:42, 10 September 2011
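
For anyone who wants to repeat the "run N copies at once and take the max elapsed" measurement without hand-duplicating shell commands, here is a minimal Python sketch. It is not what anyone in the thread actually used; `time_batch` is a hypothetical helper, and the demo runs a do-nothing stand-in command so that the sketch works on any machine. Like the `time` approach described earlier, it includes JVM and Robocode startup in the measurement.

```python
import subprocess
import sys
import time

def time_batch(cmd, instances):
    """Start `instances` copies of cmd together and wait for all of them.

    Returns (wall_seconds, seconds_per_battle), where wall_seconds is the
    time until the slowest copy exits, i.e. the "max elapsed".
    Assumes each copy of cmd runs exactly one battle.
    """
    start = time.perf_counter()
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for _ in range(instances)
    ]
    for proc in procs:
        proc.wait()
    wall = time.perf_counter() - start
    return wall, wall / instances

# Stand-in command so the sketch runs anywhere. For a real measurement,
# substitute the command quoted in the thread, e.g.
# ["./robocode.sh", "-nodisplay", "-battle", "battles/diamond.battle"]
demo_cmd = [sys.executable, "-c", "pass"]

for n in (1, 2):
    wall, per_battle = time_batch(demo_cmd, n)
    print(f"{n} instance(s): {wall:.3f} s total, {per_battle:.3f} s per battle")
```

Running several trials and averaging, as Rednaxela did, would smooth out the run-to-run variation visible in the posted numbers.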