Movement Challenge 2K6/Pre-Chat
Possible candidates:
- Curve Flattening Challenge 2K6
- CassiusClay 1.9.9.96bd
- Shadow 3.66
- FloodHT 0.9.2
- Anti-Pattern Matcher Challenge 2K6
- Che 1.2
- Wave Surfing Challenge 2K6
I noticed this info on The Spike Conundrum about DT 1.91:
The issue with the old DT here is that the stats were designed to cope with bullet flight times of up to 40 ticks (because in standard battles DT reduces bullet power to zero at a distance of 750 or more). With the larger space, and power 3 bullets this is easily exceeded and I end up firing guess factor 1.0 bullets at everything further away than 400 (because at 400 guess factor 1.0 does the job quite well).
I would gather that this means that any tank that tries to stay further than 440 pixels away is getting an unfair advantage in the MovementChallenge? (Dookious would be among them, as well as most high scorers I've tried in the CFC.) I had been thinking that perhaps a MovementChallenge2K6 was in order, and this kind of makes me think that even more so. Some of the TargetingChallenge results are out of date, but CassiusClay was the best GF gun there, I think; and who has the best PatternMatching gun at this point, Shadow? I may just start benchmarking my movement against them instead, in any case... -- Voidious
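The 440-pixel figure above presumably comes from Robocode's bullet speed rule (20 - 3 × power pixels per tick): a power 3 bullet moves 11 pixels per tick, so 40 ticks of flight cover 440 pixels. A minimal sketch of that arithmetic follows; the class and method names are hypothetical, and it ignores target movement and bot width.

```java
// Hypothetical helper, not taken from any bot discussed here.
final class BulletFlight {
    // Robocode bullet speed in pixels per tick for a given fire power.
    static double bulletSpeed(double power) {
        return 20.0 - 3.0 * power;
    }

    // Distance a bullet of the given power covers in the given number of ticks.
    static double reach(double power, int ticks) {
        return bulletSpeed(power) * ticks;
    }

    public static void main(String[] args) {
        // Power 3 bullet, 40 ticks of flight time.
        System.out.println(reach(3.0, 40)); // prints 440.0
    }
}
```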
Yes, that's a problem with the CFC. CassiusClay should be a very good replacement as reference bot. About Shadow as an APMC reference, maybe it's better to continue using PatternMatcherBot since my gun is not a "pure" pattern matcher. -- ABC
Ah, OK. I ran a couple seasons CFC-style against CassiusClay, and the score was a fair amount lower, as expected. (It was like 55% vs high-60%'s against DT 1.91.) Is there really no better PM gun than Ender/PatternMatcherBot? It's fine with me to keep him, but it would surprise me that nobody has created a better PM gun in such a long time. Also, even if Shadow is not a "pure" pattern matcher but is a very good non-GF gun, it seems like a good candidate for a MovementChallenge... But that's just my 2 cents. -- Voidious
I believe there are many PM guns around better than Ender's. But you are right, mine is probably the best non-GF gun in the rumble. I'm all for it, dodging CassiusClay + Shadow should be a tough challenge. :) -- ABC
Cool :) I guess it would be good to wait and hear what others have to say about it, but for now let's list CassiusClay and Shadow as the two reference bots. I'll make the MovementChallenge2K6 page, too. Also, ABC, I think we could use your input on the TargetingChallenge2K6 page. I have learned much from this Wiki in recent months, but I am not quite experienced enough yet to make 10 good suggestions for a new TargetingChallenge... And most of the top dogs of Robocode haven't posted much lately. -- Voidious
CC is probably the better choice, but what about using the latest DT? --wcsv
Hmm... DT does have a lower specialization index, which I think is a good thing to have in a reference bot. However, it looks like DT 3.02 saves data, which is not such a good thing for a reference bot. We'd probably have to specify that you delete the file before the match, unless Paul Evans is around to create a non-data-saving version... When I get a chance, I'll do some benchmarking vs each of them, as there may not be a big difference anyway. -- Voidious
DT doesn't save data (or use saved data) when in challenge/reference mode, 3.02 would make a good CFC. But I believe CC is the best GF gun around, together with Ascendant's. -- ABC
Well, I'd like to give DT 3.02 a try so as to compare it to my CassiusClay CFC scores. I would give my vote to whichever one is a tougher challenge. Also, I've been using Shadow v3.62 along with CassiusClay to benchmark my movement lately; Shadow is an absolute *beast* in a MovementChallenge! I was extremely pleased with a 33% score over 500 rounds yesterday :)
I'm currently playing around with Shadow's gun, version 3.61 is not my best gun. The problem is that I don't know which one of the "old" versions is the best... Anyway, I'll tweak it further and hope I have a good one ready before this challenge is started. -- ABC
Well, that's kind of a scary thought. Anyway, Shadow's gun is so killer against adaptive movement, I think it should be one of the guns for MovementChallenge2K6 whether it's another AntiPatternMatcherChallenge or just a nameless part of MC2K6. I will certainly be benchmarking against Shadow's gun, anyway. (I already am, actually.) -- Voidious
Shadow's gun is amazing, but IME benchmarking against Shadow is not so good for measuring your general rumble performance if you're using wave surfing... if you get a bot that performs really well against it, then that's a good sign you have a bug! :-) It's important to use a test set that reflects the 'average' Rumble bot IMHO. -- Jamougha
Hmm... I'm probably blissfully ignorant on this issue, but it seems like there has to be *some* value in trying to have good movement against an amazing gun! I'm not sure the MovementChallenge is really meant to reflect movement against an "average" rumble bot; I think most good WaveSurfers would get ridiculously high MC scores against a truly "average" targeting system. If you guys think that Shadow is a bad gun to benchmark against, though, we can just use a pure PatternMatcher bot, or even stick with the old PatternMatcherBot. I'll probably still test against Shadow until I've come to the same conclusion you did. :) -- Voidious
It certainly can be helpful to test against a bot like Shadow, but I usually mix in some more "average" guns as well. Although this may just be because I get sick of watching my bots get pummelled into the ground by Shadow all the time... --wcsv
Well, there is 'testing your performance' and there is 'going for the gold'. I would expect the published challenges to be a test against the best guns / movements, preferably representing a variety of approaches. i.e. best guess factor and pattern matcher guns, best anti guess factor and anti pattern matcher movements. You are certainly all right in wanting to test against the 'average' performers, but I agree with Voidious in terms of the challenges being a comparison to the best of the best. If you want to know about 'overall performance', we've already got that in the Rumble. -- Martin
The only problem is that in the Rumble it is difficult to determine what aspect of your bot (Gun or Movement) is boosting/hurting your ranking. The MC and TC are attempts to isolate and rank just one part of a bot. --wcsv
Well, GuessFactorTargeting is popular enough that we can agree that one bot should be a GF gun, right? I still haven't tried DT 3.02 myself, but I will be doing some comparisons with him and CC soon. I guess the main question is what the second tank should be? It seems to me that the ability to move well against Shadow would be valuable against a lot of other guns, but maybe I'm wrong. What other targeting system could we use as a good, all-around benchmark? SilverSurfer has a PM gun, and is open source... -- Voidious
Oh, and I wanted to mention that the basic WaveSurfingChallenge is a pretty good benchmark if you're not ready to handle the full MovementChallenge. (It was for me, anyway.) -- Voidious
It's been mentioned that a good score in the MovementChallenge doesn't guarantee strong results in the Rumble and bad scores don't guarantee poor results. Could we make the challenge more reflective of reality by adding more bots? I was thinking of a 10 bot challenge like the TC. Sure, it'd take longer to run; but we might catch issues we'd otherwise miss. --Corbos
I suppose that by bots you mean gun types? Then you could have something like: HOT-gun, LT-gun, CT-gun, SymbolicPatternMatcher, 'real' PatternMatcher, two GF-guns and some more obscure types. Then you have the guarantee that the outcome reflects reality. The WaveSurfingChallenge would become obsolete in that case, and its two reference bots could be re-used for this challenge. --GrubbmGait
I think that's a good idea. It seems the original MovementChallenge was completely geared towards the top dogs, but this would really help everyone. Another reference could be several simple targeters in a VirtualGun array, which I could put together pretty easily. -- Voidious
Ugluk's default gun assortment includes dead-on, linear projection, and circular projection, but you don't know which is hitting you without reading his file dump after the fight. If the guns are separated into individual bots you know your weakness to each, rather than your weakness to the strongest. I've actually got two sets of testing bots, all based on Ugluk's core code, but limited to specific features. The targeting bots use specific targeting methods and do not move. The movement bots use specific movement methods (usually paired with wall anti-gravity) and do not fire. Some targeters move and fire, but only because they are expecting their enemy's movement to directly relate to theirs (i.e. mirror, tango). My question, as the scope of this challenge broadens, is whether there is a desire/need for two challenges. I'll likely participate in the challenge on some level, but I am not in favor of handing my competitors free advice on how to kick Ugluk in the nuts. -- Martin / Ugluk
I think the advantage works both ways. If everyone's benchmarking against aspects of your tank, obviously they will improve on those aspects, but it also gives you test results you can use to improve your tank. I think somewhere around half of the guns should be of a higher level of complexity than simple targeting methods, whether they're PatternMatching, GuessFactor, VirtualGuns, or something else entirely. -- Voidious
I was also thinking that there is the option of enabling some of the reference bots to move, too; particularly the simple targeting methods. We could just use something like RandomMovementBot. It might give a much better estimate of how a bot's movement works in a real battle. Anyway, it's just a thought, I'm not really giving it my full support, I just wanted to mention it. -- Voidious
In the name of consistent (though artificial) movement isolation, I'd vote no movement from reference bots. My happy list of reference bots might include:
- HOT gun (WaveSurfingChallengeBotA)
- Linear gun (WaveSurfingChallengeBotB)
- Iterative Circular gun (Gruwel or a custom reference bot)
- Quick Targeting Virtual Gun Array (GrubbmGrb, Ugluk minus bot-specific targeting?)
- Expert Guess Factor gun (CassiusClay)
- Wicked Non-Guess Factor gun (Shadow)
- Dynamic Segment-er (Virus or Toad - both are a little slow but interesting)
- Neural gun (ScruchiPu)
- Tile Coding gun (Tigger)
- Additional Expert Guess Factor guns (Cyanide, DarkHallow, PowerHouse)
- An expert Pattern Matcher
We need to winnow it down to 10. Should we assign someone specific like the TargetingChallenge? --Corbos
No movement from reference bots is cool with me. Your list looks pretty good, and I have a few thoughts about it. First, the virtual gun array is something I could easily put together, and I actually threw one together for the TC2K6 benchmarks I ran. I like the dynamic segment-er idea, and they might not be so slow without movement; or, if they are, maybe they could be sped up a bit with little cost to their accuracy. (Toad isn't OpenSource, though.) I like the NeuralTargeting idea, too, although speed could be an issue there, as well.
If we're going to have a second GF gun, I think it would be cool to have one specifically tuned to beat WaveSurfers. I know Ascendant has one of two guns tuned for surfers, but it's not OpenSource, and I'm not sure if Mue is around to make us a MC version. Dookious 0.611 also has 2 anti-surfer guns. To my surprise, Dookious 0.611 significantly outscored both Shadow 3.64 and CassiusClay 1.9.9.96bd against my testbed of surfers; he scored best of the three on 7 of 11 surfers, and the final averages were: Dookious = 78.336, CC = 73.634, Shadow = 74.937. We could test some other guns, too, as I wouldn't be surprised to find that some of the other top-10 tanks are great against WaveSurfers; maybe PulsarMax or Cyanide.
Anyway, I'm not sure which targeter I'd vote off of your current list... The tile coding, while unique, may not end up much different than a regular GF gun; an expert pattern matcher and a superb non-GF may be very similar, but maybe not. As for putting someone in charge, I think that's a good idea. It really got me moving on the TC2K6, and I've actually really enjoyed taking a break from coding to look at various bots, and run lots of benchmarks. It's been a bit of a learning experience. Corbos or wcsv, would you be interested? I could do this one, too, if nobody else wants to, but I will probably take a few days off of the research to focus on Dookious first. :) -- Voidious
- I believe that Mue stated that Ascendant had a guess-factor gun and a pattern matcher gun, the latter being there for wave surfers. -- Martin
I'd be willing to throw the initial list together, run some tests, and upload the results. I'm making no guarantees in relation to Voidious's work on the TargetingChallenge2K6. ;) Objections? --Corbos
I certainly have no objection, that would be great. Let me know if there's anything I can do to help; I've still got a bunch of simple aimers from the early days of Dookious, and I can easily put them in a VG array of any combination, or make a tank using just one of them. I posted the one with HOT, Linear, Circular, and Linear Avg that I had already put together for some TC2K6 tests: Dookious 0.611sVG (http://www.dijitari.com/void/robocode/tc2k6/bots/voidious.simplevg.Dookious_0.611sVG.jar). Good luck! :) -- Voidious
Two comments:
- The proposed gun types are great if you want to know against what types of guns your movement has its strengths and weaknesses. But that's not to say it gives you information on what type of guns you need to work with. You might want to have a reference bot set that better reflects the guns used in the RoboRumble@Home.
- Shadow's gun is very, very much like a GuessFactor gun. Meaning it could probably be rewritten using GFs while maintaining most of its characteristics. It's a good gun, no doubt. And it probably should be included in the challenge. Just not because it's not GF.
-- PEZ
Shadow's gun is not like a GF gun! I use a log for data gathering (you use stats tables), I analyse it by doing some DynamicClustering (you just do a table lookup), and I use PM-like path retracing for shot decision (you use the statistical mode). I have a GF version of it (only the shot decision uses GFs), very much like Ali, iirc, but I never managed to make it as good as the PM-inspired one. Welcome back, btw ;) -- ABC
Thanks! Well, I know you're not using GFs and that your method is quite different if you look at the process. But if you look at what you achieve you could replace stuff with GFs. For instance that path retracing. You could make the history a list of arrays instead. Use the path as it is painted by the opponent in real time and then for each tick update each "alive" array with the applicable GF. Then instead of retracing you would do a table lookup and it would give you EXACTLY the same results. I think what I'm really saying is that GuessFactors aren't a good way to describe a gun. GFs can be used for many different things. CC's gun is not best described by it using GFs, but by its use of statistical tabular data in several dimensions. GFs are only used to make it simple. Yeah, I should probably update CC's page, but it seems I am the only one making this distinction (I've tried to make others see it before, if you remember) so it might only risk confusing people. If they insist on saying GFs == tabular statistics ... well, so be it. =) -- PEZ
You would have to change LOTS of wiki references... But yeah, I agree with you, I have made countless combinations of tab/log based GF/non-GF gun experiments. But, converting my gun to GFs like you describe does not produce the EXACT same results because my PM-like path retracing takes into account the distance of the enemy to compute the angle tolerance of a hit. It's kind of a "distance dependent bin size" thing. -- ABC
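To illustrate the "distance dependent bin size" point: the angular window that still hits a bot shrinks with distance, so a fixed-width GF bin stands for a different share of the target at different ranges. The sketch below is only an illustration of that idea, not ABC's or PEZ's code. It assumes the usual Robocode constants (top speed 8, 36-pixel bot, bullet speed 20 - 3 × power) and the common asin-based maximum escape angle; all class and method names are hypothetical.

```java
// Illustration only; not how Shadow or CassiusClay actually compute anything.
final class GuessFactorTolerance {
    static final double MAX_ROBOT_SPEED = 8.0;  // pixels per tick
    static final double HALF_BOT_WIDTH = 18.0;  // bots are 36 pixels wide

    static double bulletSpeed(double power) {
        return 20.0 - 3.0 * power;
    }

    // Common approximation of the largest bearing offset the target can reach.
    static double maxEscapeAngle(double power) {
        return Math.asin(MAX_ROBOT_SPEED / bulletSpeed(power));
    }

    // Bearing offset (radians) expressed as a guess factor in [-1, 1].
    // lateralDirection is +1 or -1, the target's orbit direction at fire time.
    static double guessFactor(double bearingOffset, double power, int lateralDirection) {
        return lateralDirection * bearingOffset / maxEscapeAngle(power);
    }

    // Half-width of the "still a hit" window, in guess factor units.
    static double hitToleranceInGF(double distance, double power) {
        return Math.atan(HALF_BOT_WIDTH / distance) / maxEscapeAngle(power);
    }

    public static void main(String[] args) {
        for (double d : new double[] {200, 400, 600}) {
            System.out.printf("distance %.0f: +/- %.3f GF%n", d, hitToleranceInGF(d, 3.0));
        }
    }
}
```

A table-based gun with fixed bins treats every visit as the same width, whereas a log-based gun can apply a tolerance like this one per shot, which is one way the two approaches could give different results.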
Findings and suggestions:
After comments from Martin and PEZ, I've decided against including simple targeting. We can include links to WaveSurfingChallenge bots A - C and leave them as an exercise outside the Challenge.
I'm not as hung up about representing every gun type. Ultimately, we're all trying to predict what happens next. Just because two bots use stats buffers doesn't mean they perform the same. I've aimed for interesting strengths and weaknesses versus different gun types.
It seems to make sense to score this Challenge as the inverse of the targeting challenge. That is: 100 - (reference_bot_bullet_score / rounds). It would relate the challenges, though scores would be lower.
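A minimal sketch of that inverse score, assuming "reference_bot_bullet_score" is the reference bot's total bullet damage over the whole match as reported by Robocode. The class name and the sample numbers are made up for illustration.

```java
// Hypothetical helper for the proposed inverse scoring.
final class InverseScore {
    // 100 - (reference bot bullet damage / rounds), as proposed above.
    static double score(double referenceBulletDamage, int rounds) {
        if (rounds <= 0) {
            throw new IllegalArgumentException("rounds must be positive");
        }
        return 100.0 - referenceBulletDamage / rounds;
    }

    public static void main(String[] args) {
        // Made-up match: the reference bot dealt 29000 bullet damage in 500 rounds.
        System.out.println(score(29000.0, 500));
    }
}
```

With this definition a challenger that is never hit scores 100, and one that loses all 100 energy to bullets every round scores 0.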
I picked 10 bots from roughly the top 25% of the Rumble - with an emphasis on the top 10%:
- CassiusClay 1.9.9.96bd, ranked #2 when selected, uses stats buffers - simply an amazing, tough gun.
- Shadow 3.66, #3, uses dynamic clustering - scary and mean.
- Virus 0.6.1, #14, uses dynamic segmentation in a pattern matching scheme - performs slightly worse than expected when removed from RaikoMX but represents a novel approach.
- SandboxDT 3.02, #16, stats buffers - intimidating. Probably the stronger part of DT when compared with its movement.
- Tigger 0.0.23, #25, tile coding - a problem bot for many high ranking bots. When I first recommended him for the TargetingChallenge2K6, I didn't realize his gun was so much stronger than his movement.
- Dookious 0.611, #32, stats buffers - this gun performs especially well against surfers.
- FloodHT 0.9.2, #35, stats buffers - out of development for a while but still hanging tough toward the top.
- GrubbmGait 1.1.3, #49, virtual gun array - an example of how a well-executed simple idea can be amazingly effective.
- Che 1.2, #93, pattern matcher - I had trouble getting other PM guns to perform well after a couple hundred rounds. I chose Che simply because I'm familiar with him.
- AHEB 0.6, #120, stats buffers - another problem bot for many in the top 25%. I like him. Anyone know more about him?
Results from tests against the 10 TargetingChallenge2K6 bots with inverse scoring:
500 Rounds | GrubbmGait | CassiusClay | Shadow | Dookious | Virus | Tigger | FloodHT | AHEB | SandboxDT 3.02 | Che | Total |
CassiusClay | 55.186 | 40.176 | 29.204 | 34.228 | 55.132 | 36.968 | 56.578 | 38.588 | 34.184 | 38.8 | 41.9044 |
Cyanide | 59.056 | 22.114 | 25.294 | 21.946 | 52.748 | 25.178 | 59.13 | 37.924 | 23.316 | 31.002 | 35.7708 |
Cigaret | 29.394 | 9.788 | 16.402 | 19.942 | 32.026 | 27.366 | 25.798 | 24.192 | 15.974 | 31.646 | 23.2528 |
Tigger | 34.792 | 18.868 | 12.16 | 16.922 | 25.994 | 14.142 | 32.588 | 16.592 | 11.12 | 19.296 | 20.2474 |
Chalk | 43.784 | 6.642 | 3.084 | 9.724 | 24.32 | 13.56 | 35.746 | 14.272 | 16.652 | 19.79 | 18.7574 |
FloodMini | 17.354 | 6.294 | 8.566 | 10.742 | 29.14 | 20.922 | 19.804 | 20.794 | 10.684 | 21.92 | 16.622 |
GrubbmGait | 16.328 | 9.044 | 10.828 | 12.624 | 19.094 | 21.076 | 9.682 | 18.546 | 8.778 | 18.274 | 14.4274 |
Random | 12.452 | 7.694 | 11.68 | 13.022 | 26.366 | 15.85 | 12.082 | 14.528 | 5.73 | 17.742 | 13.7146 |
DuelistMicro | 10.092 | 2.97 | 4.584 | 9.666 | 9.67 | 14.724 | 10.198 | 13.034 | 6.99 | 12.146 | 9.4074 |
Butterfly | 18.986 | 0.008 | 0.41 | 0.486 | 5.284 | 0.192 | 2.244 | 3.672 | 2.6 | 1.58 | 3.5462 |
I realize three bots are in both the Movement and Targeting Challenges. I hope this doesn't bother anyone. They work well for both.
Anyhow, try them out and let me know what you think: http://scatterbright.com/robots/MC2K6/MC2K6ReferenceBots.zip
Wow, thanks Corbos, it looks like you put a lot of thought and work into this. I like your list, and am honored to be a part of it. I'll check out all the reference bots sometime soon, but just one comment on the scoring: I think a major concern in using this style of scoring is that a contender could actually be rewarded for hitting the walls, thus lowering the opponent's bullet damage. It's not an issue for many top tanks, but it is still relevant for many tanks, even some top-notch ones. That's really the only problem I see with it. -- Voidious
Thanks Corbos for your effort. It is quite a 'heavy' list, but if you want to be the best, test against the best. I've seen a problem with Tigger MC, it refuses to fire its last (power 1.0) bullet and I have even seen an energy-countdown one time from far more than 1.0 energy left. Could you look into this? -- GrubbmGait
It should be as simple as commenting out this line (and the ending brace for it):
if (firePower < getEnergy() + 0.1 && getOthers() > 0) {
It's near the end of onScannedRobot. I don't see anything else codewise to prevent it from firing. If it did die at a much higher energy level, maybe it was stopped for slowing down for some reason? -- Voidious
Thanks for the heads-up. I fixed the problem and it should be included in the reference bot download. Thanks to both of you for participating. Voidious, the 'hitting wall' problem in scoring makes perfect sense. If (your_score / reference_bot_score) works for everyone, we can stick with that. I was a little worried about the 'heaviness' of the reference bots; but in the end, the 'gold standard' should help us all improve. Thanks again. --Corbos
I still see a problem with Tigger MC, but I think it will not influence the score, as every challenger will encounter it. Sometimes (10-20 rounds out of 500) Tigger does not start firing at all, and the inactivity timer will eventually kick in. This can quickly be checked by comparing the number of wins of Tigger against the number of losses of the challenger and vice versa, and you need a challenger that will never hit the wall (e.g. CC). The same effect, wins/losses not equal, can also happen when the challenger hits the wall occasionally when the reference bot is already disabled, but that is then just a flaw in its movement. -- GrubbmGait
I added a totalling row and standard deviations for both rows and columns. It looks like so:
500 Rounds | GrubbmGait | CassiusClay | Shadow | Dookious | Virus | Tigger | FloodHT | AHEB | SandboxDT 3.02 | Che | Total | Std dev | Total B |
CassiusClay | 55,2 | 40,2 | 29,2 | 34,2 | 55,1 | 37,0 | 56,6 | 38,6 | 34,2 | 38,8 | 41,9 | 10,0 | 34,7 |
Cyanide | 59,1 | 22,1 | 25,3 | 21,9 | 52,7 | 25,2 | 59,1 | 37,9 | 23,3 | 31,0 | 35,8 | 15,5 | 23,7 |
Cigaret | 29,4 | 9,8 | 16,4 | 19,9 | 32,0 | 27,4 | 25,8 | 24,2 | 16,0 | 31,6 | 23,3 | 7,5 | 13,1 |
Tigger | 34,8 | 18,9 | 12,2 | 16,9 | 26,0 | 14,1 | 32,6 | 16,6 | 11,1 | 19,3 | 20,2 | 8,2 | 15,55 |
Chalk | 43,8 | 6,6 | 3,1 | 9,7 | 24,3 | 13,6 | 35,7 | 14,3 | 16,7 | 19,8 | 18,8 | 12,8 | 4,85 |
FloodMini | 17,4 | 6,3 | 8,6 | 10,7 | 29,1 | 20,9 | 19,8 | 20,8 | 10,7 | 21,9 | 16,6 | 7,2 | 7,45 |
GrubbmGait | 16,3 | 9,0 | 10,8 | 12,6 | 19,1 | 21,1 | 9,7 | 18,5 | 8,8 | 18,3 | 14,4 | 4,7 | 9,9 |
Random | 12,5 | 7,7 | 11,7 | 13,0 | 26,4 | 15,9 | 12,1 | 14,5 | 5,7 | 17,7 | 13,7 | 5,7 | 9,7 |
DuelistMicro | 10,1 | 3,0 | 4,6 | 9,7 | 9,7 | 14,7 | 10,2 | 13,0 | 7,0 | 12,1 | 9,4 | 3,7 | 3,8 |
Butterfly | 19,0 | 0,0 | 0,4 | 0,5 | 5,3 | 0,2 | 2,2 | 3,7 | 2,6 | 1,6 | 3,5 | 5,7 | 0,2 |
Total | 29,7 | 12,4 | 12,2 | 14,9 | 28,0 | 19,0 | 26,4 | 20,2 | 13,6 | 21,2 | 19,8 | 12,3 | |
Std dev | 17,8 | 11,8 | 9,3 | 9,1 | 16,0 | 9,9 | 19,7 | 10,9 | 9,4 | 10,6 | 11,6 |
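For anyone wanting to reproduce the "Total" and "Std dev" cells, here is a small sketch. It assumes "Total" is the plain mean and "Std dev" is the sample (n - 1) standard deviation; that choice is a guess, but it appears to reproduce the 41,9 and 10,0 shown for the CassiusClay row. The class name is hypothetical and the input is that row from the unrounded table further up.

```java
// Hypothetical helper; the n - 1 form is an assumption about how the table was built.
final class RowStats {
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Sample standard deviation (divides by n - 1).
    static double sampleStdDev(double[] xs) {
        if (xs.length < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        double m = mean(xs);
        double squares = 0.0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return Math.sqrt(squares / (xs.length - 1));
    }

    public static void main(String[] args) {
        // CassiusClay row of the unrounded 500-round table.
        double[] cassiusClay = {55.186, 40.176, 29.204, 34.228, 55.132,
                                36.968, 56.578, 38.588, 34.184, 38.8};
        System.out.printf("mean %.1f, std dev %.1f%n",
                mean(cassiusClay), sampleStdDev(cassiusClay));
    }
}
```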
I think we have too many curve flattening challenge type of bots in the set. FloodHT isn't much of a challenge really. And DT can't do anything that CC or Shadow can't. I suggest we mimic the original MovementChallenge and build this challenge from a few key challenges. Maybe:
- CurveFlatteningChallenge2K6
- CassiusClay 1.9.9.96bd
- Shadow 3.67
- PatternMatcherChallenge2K6
- WaveSurfingChallenge
- SharkChallenge
- NeuralChallenge
- ScruchiPu
This would make for 7 bots which will take long enough to run really. The new CurveFlatteningChallenge2K6 will be the most important probably. If you can dodge both these guns then you have a killer movement until someone redefines targeting. When a true AntiSurferTargeting arrives then we can form an AntiSurferChallenge and add it to the movement challenge.
Just a suggestion. My main problem with the current set is the unnecessary inclusion of FHT and DT.
-- PEZ
Sub-challenges are good. They'd let people focus on what interests them. Two already exist. CurveFlatteningChallenge2K6 shouldn't be too difficult to set up. If someone has time to set up the others, we're ready. --Corbos
I can probably setup the sub-challenge pages tonight; I count three existing ones, WaveSurfingChallenge, CurveFlatteningChallenge, and AntiPatternMatcherChallenge. (I guess maybe WSC doesn't count?) Do we have a final bot list, though, in light of PEZ's comments about the reference bots? I'm not sure what I think... I will probably benchmark against a couple, like CC and Shadow, but a bigger set might give a more objective "final score" for posting on the site. -- Voidious
I've altered the TCCalc script slightly so that if you access it as:
It'll calculate MovementChallenge2K6 scores instead. For the dev version of CC it gets:
Season | Name | Author | Type | AHEB 0.6aMC | CassiusClay 1.9996bdMC | Che 1.2MC | Dookious 0.611MC | FloodHT 0.9.2MC | GrubbmGrb 1.1.3MC | SandboxDT 3.02MC | Shadow 3.66MC | Tigger 0.0.23MC | Virus 0.6.1MC | Score | Comment |
1: | CassiusClay dev | PEZ | WS | 39.58 | 43.68 | 38.50 | 33.60 | 57.72 | 53.65 | 37.33 | 29.37 | 37.86 | 57.86 | 42.91 | 500 rounds |
2: | CassiusClay 1.9.9.96ag* | PEZ | WS | 42.67 | 43.13 | 38.24 | 35.96 | 59.43 | 54.94 | 34.19 | 30.92 | 37.39 | 60.30 | 43.72 | |
Average: | CassiusClay 1.9.9.96ag* | PEZ | WS | 41.13 | 43.40 | 38.37 | 34.78 | 58.57 | 54.30 | 35.76 | 30.14 | 37.63 | 59.08 | 43.32 |
I hope it works and that I haven't broken anything else. Please test.
-- PEZ
Well, I've been sitting here using the TCCalc as I work on my gun, so I can say for sure that you didn't break TCCalc. Testing MCCalc with a RoboLeague XML, though, I notice that this is using the inverse bullet damage to calculate the score, right? I think we want to stick with the "challenger score / reference bot score", since hitting walls would actually help a challenger get a good score. Should be a simple fix; thanks for setting that up!-- Voidious
I think the inverse bullet damage score is very good since it mirrors the TC score. Anyone wanting to create a killer movement needs to make sure they don't hit the walls too much anyway. Let's reach a consensus on the issue and then I'll fix the script. -- PEZ
Hmm... I guess I agree with you, as I have previously thought that the existing MC scoring is kinda bizarre. Perhaps we could even come up with a code snippet for tracking your wall damage, making it easy to subtract it from your score? Either scoring is fine with me, really. -- Voidious
I don't know what to think about this scoring method. Sounds logical, but unfortunately Shadow still hits the walls sometimes and I wouldn't want to have its score inflated if I tweak it to hit them more often... -- ABC
But doesn't the old scoring method also let the score go up if you hit the walls, since hitting the walls denies the reference bot some bullet damage points? Inverse TC scores make the most sense, I think. It's good to have 100% as the limit, too. -- PEZ
I think this challenge takes too much time to run. It would be sweet if we could have fewer bots. -- PEZ
I think the old scoring doesn't reward hitting walls because you are likely to lose a round here and there if you hit the walls too much, and that more than makes up for a little bit less bullet damage in the scores. PEZ, don't you keep careful track of all energy consumption in CassiusClay? Could you put together a few on***Event methods that would keep track of total wall damage throughout a match, and print it each round? It would prevent us from running it for other people in bots like PulsarMax, but it would let us get exact scores in this scoring system. Edit: Actually, we could just put it in the reference bot, I guess! -- Voidious
Well, I don't mean to volunteer your time... I can take a crack at it if you want. But you do do that in CC, right? -- Voidious
Sorry, but I don't see how me printing damage taken (yes, I think maybe I keep track of that) will help us get exact scores? --PEZ
Ah, yeah, I guess it would be a major pain in the butt to factor that in. I was thinking it could just be added onto the bullet damage manually at the end, but I'm just not thinkin' straight tonight :-\ -- Voidious
Since the movement challenge is about good movement, I'd count damage done to yourself the same as bullet damage, so add 'total damage from wall / robot collisions' to opponent's bullet damage. Another approach would be to reduce the total damage potential by the same amount before finding the percentage. I doubt that it is a significant factor in any competitive movement. -- Martin
That's not easy to automate (like with the MCCalc), as it's not listed anywhere in the numbers reported after a Robocode match. -- Voidious
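Martin's two suggestions can be written out as follows, as a rough sketch only. It assumes each round offers 100 points of damage potential (the starting energy), and that the self-inflicted damage total comes from the bot's own bookkeeping, since it is not in the match report. The class and method names are hypothetical, and the second formula is one reading of "reduce the total damage potential".

```java
// Hypothetical sketch of the two adjustments suggested above.
final class AdjustedMovementScore {
    static final double ENERGY_PER_ROUND = 100.0; // assumed damage potential per round

    // Option 1: count wall / collision damage as if the reference bot had dealt it.
    static double countSelfDamageAsHits(double refBulletDamage, double selfDamage, int rounds) {
        return 100.0 - (refBulletDamage + selfDamage) / rounds;
    }

    // Option 2: shrink the damage potential by the self-inflicted damage
    // before turning bullet damage into a percentage.
    static double shrinkDamagePotential(double refBulletDamage, double selfDamage, int rounds) {
        double potential = ENERGY_PER_ROUND * rounds - selfDamage;
        return 100.0 * (1.0 - refBulletDamage / potential);
    }

    public static void main(String[] args) {
        // Made-up match: 29000 bullet damage taken, 500 self-inflicted, 500 rounds.
        System.out.println(countSelfDamageAsHits(29000.0, 500.0, 500));
        System.out.println(shrinkDamagePotential(29000.0, 500.0, 500));
    }
}
```

With zero self-inflicted damage both options reduce to the plain inverse score, 100 - (bullet damage / rounds).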
Regarding inverse scoring, if someone wants to spoof the challenge at the expense of the rumble, it seems silly. If Shadow hits the walls, is he exploiting the challenge? I don't think so. Inverse scoring seems intuitive. As for too many bots, we can solve that by creating sub-challenges. Toward PEZ's comment - let's winnow things down to a workable set. Cheers. --Corbos
If I were to choose 3, they'd be from CassiusClay, Shadow, and SandboxDT. (Who doesn't daydream of beating SandboxDT at some point?) If Dookious and Tigger represent unique challenges (I haven't really checked), they'd be my next 2. My priorities are difficulty, uniqueness, and developer presence. It's hard to narrow the scope without slighting anyone, but there's my picks, for what they're worth. -- Martin
The only thing really unique about Dookious's gun is that he's relatively good against WaveSurfers, but he's not better than Shadow against surfers, anyway. -- Voidious
How about CC, Shadow and Dookious 0.611 for the CurveFlatteningChallenge2K6? Sure, Tigger is cool, but what's the point without someone to explain the result? --Corbos
Works for me. -- Voidious
It seems unnecessary to have three guns there. What we need is Shadow and a more regular stat gun. Preferably an open source gun so that we can learn how it works and how it segments data. I suggest CC's gun for this since it is effective and fast and many people have had at least a quick look at how it works. These two guns seem to have different sets of problem bots, which is good too. I'm not too happy with the idea that people start tuning their movement against my gun of course. =) But with Shadow side by side in the challenge that is compensated. Dookious has a good gun, no doubt. But it's almost as slow as Shadow's. I say keep it to two guns and make sure Shadow's gun is one of them. It's slow as hell, yes. But it's awesome and it checks for failed flatness along unique dimensions. If you can flatten both Shadow's and CC's stats, then you have THE flattener. -- PEZ
Yeah, I'm not sure Dookious adds anything unique to the challenge. CassiusClay and Shadow are a darn tough benchmark as it is. -- Voidious
I struck the SharkChallenge as it is more a challenge on targeting than on movement. As already stated on that page, WaveSurfing would give approximately the same result as being a SittingDuck. -- GrubbmGait
I added a column, "Total B", to the end of the big table above, the one with the scores for the TC2K6 bots; it's the score against only Shadow and CassiusClay. Comparing it to the total across all 10, Chalk is the biggest anomaly, but Cigaret and FloodMini also break the descending sort order of the first total. -- Voidious
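The "Total B" column can be rebuilt from the rounded CassiusClay and Shadow columns of the table above; averaging the rounded figures, rather than the unrounded ones, is what appears to reproduce values such as 15,55 and 4,85. The sketch below recomputes it and flags every row that scores lower than the row beneath it, which is one way to read "breaks the descending sort order". Class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; data copied from the rounded table above.
final class TotalB {
    static final String[] NAMES = {"CassiusClay", "Cyanide", "Cigaret", "Tigger", "Chalk",
                                   "FloodMini", "GrubbmGait", "Random", "DuelistMicro", "Butterfly"};
    static final double[] VS_CASSIUS_CLAY = {40.2, 22.1, 9.8, 18.9, 6.6, 6.3, 9.0, 7.7, 3.0, 0.0};
    static final double[] VS_SHADOW = {29.2, 25.3, 16.4, 12.2, 3.1, 8.6, 10.8, 11.7, 4.6, 0.4};

    // Total B: mean of the scores against CassiusClay and Shadow only.
    static double[] totals() {
        double[] out = new double[NAMES.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = (VS_CASSIUS_CLAY[i] + VS_SHADOW[i]) / 2.0;
        }
        return out;
    }

    // Names of rows whose total is lower than the total of the row below them.
    static List<String> orderBreaks(String[] names, double[] totals) {
        List<String> breaks = new ArrayList<>();
        for (int i = 0; i + 1 < totals.length; i++) {
            if (totals[i] < totals[i + 1]) {
                breaks.add(names[i]);
            }
        }
        return breaks;
    }

    public static void main(String[] args) {
        System.out.println(orderBreaks(NAMES, totals()));
    }
}
```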
I'm not sure what the solution to this is, but I have noticed a bit of an anomaly with the MovementChallenge: in the MC, staying "far away" is generally a very desirable thing; but in a real match, it has pros and cons. Tweaking Dookious to stay much closer to enemies has a big effect on how badly he beats lower ranked bots (more bullet damage), and has little to no effect against more equal bots. In a real match, staying closer has some positive effects (better accuracy), while in an MC it really has none. Maybe we could make the MC on a smaller battlefield? Or we could "allow" custom (different than "battle mode") distancing in the MovementChallenge? I guess a bot with DynamicDistancing wouldn't fall victim to this problem, but I'm not going to implement that just to get "correct" MC scores :) Thoughts? -- Voidious
I think the best is to allow the challengers to do whatever they want with this. Running it on a smaller field would make sense only for the bots that prefer close fights. CassiusClay, for example, prefers closer fights against weaker guns because that is when it can control its surfing best. Otherwise it should fight from far away and trust its gun to be better than the opponent's. In general, that is. Up against Shadow and Ascendant, CC is toast regardless of fighting distance. -- PEZ
I started classes again this past week, and have been enjoying getting back to work on Dookious a bit, but I'm down to put some time into this again. I like the list we have at the top. I guess we need to decide on a PatternMatching bot to finish that list? How does Che match up vs the current PatternMatcherBot that we were using? It seems to have performed as well as some very good guns in the tests Corbos ran. Also, I think there's a bit of a void between the WaveSurfingChallenge and the CFC/APMC; something between beginner and expert targeting. Will the NeuralChallenge fill that void, maybe? Or could we combine a NeuralTargeting bot with like a VG array of simple targeters for a "Novice Challenge"? -- Voidious
How about we scrap the neural challenge, since only one or two bots use it in the rumble anyway? Let's pick Che for the PMC and let's throw FloodHT back in the CFC to fill some of the void. Creating a new bot for this just risks us introducing bugs and stuff that won't help us improve our movements. -- PEZ
You have my vote; both Che and FloodHT seem a good choice. How would the 'final MC score' be calculated? I think the CFC should have some more weight than the APMC and WSC; maybe the ratio 2:1:1 would be appropriate. -- GrubbmGait
Weighting is a tricky thing. So tricky that I'd say let's skip it and go for a straight average. Since it pays off in the RR@H to be able to dodge the simpler targeting techniques, it isn't too bad that this challenge rewards that too. -- PEZ
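For comparison, here is a minimal sketch of the two options discussed above: a 2:1:1 weighting (CFC:APMC:WSC) and a straight average. The function name and the scores are hypothetical placeholders, not real results and not part of any existing script.

```python
def weighted_score(cfc, apmc, wsc, weights=(2, 1, 1)):
    """Weighted mean of the three subchallenge scores.

    weights are given in the order (CFC, APMC, WSC).
    """
    w_cfc, w_apmc, w_wsc = weights
    total = w_cfc * cfc + w_apmc * apmc + w_wsc * wsc
    return total / (w_cfc + w_apmc + w_wsc)


# Placeholder subchallenge scores, chosen only for illustration.
cfc, apmc, wsc = 40.0, 30.0, 98.0

print(round(weighted_score(cfc, apmc, wsc), 2))             # 2:1:1 weighting
print(round(weighted_score(cfc, apmc, wsc, (1, 1, 1)), 2))  # straight average
```

With these placeholder numbers the 2:1:1 weighting pulls the result toward the CFC score, which is the effect the proposal is after.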
So the list at the top is correct now? And the MC will now include CFC, APMC, and WaveSurfingChallenge in the final score? I can set up the subpages tonight if that's OK with everyone. I take it you guys don't feel a "Novice Challenge" is necessary? :) I guess the WSC fills that spot for the most part. -- Voidious
Yeah. Let's settle for that list. -- PEZ
Bah, I've gotten carried away working on Dookious tonight, and I'm busy tomorrow night. But I'll set up the subpages Thursday or Friday if nobody else has. -- Voidious
I am the King of this Challenge! (Until the second entry is made) -- GrubbmGait
Thanks for setting up the sub-pages, Grubbm. And congrats on your top-ranked movement, that's quite impressive :-) -- Voidious
I put Dookious 0.72's scores up there. Again, I have opted to use custom distancing in MC mode in 0.72, since the battle distancing is far too close to give meaningful results in any kind of movement challenge. Some tanks stay that far away in battle, anyway, so I think it's still a fair measurement of Dooki's overall movement. -- Voidious
It might bite you dude. I use the challenge set-up for getting some kind of idea if a change has improved my movement or not. Then it is important that my bot does not behave in a special way in the challenge. Then again, that might not be too important since in real battle opponents tend to move. And in any case CC always tries to stay far away from the opponent, unless it is some simple targeter where CC does better a bit closer since there are fewer bullets in the air then.
About the challenge set-up. I think we should change it so that the main challenge displays all bot scores. We can make a javascript or something that can split the main challenge results up in subchallenge totals. For this all sub challenges should be 500 rounds I guess.
-- PEZ
Yeah, I may reconsider it at some point. But for a long time, Dookious really did stay that far away... Changing him to stay much closer was great for his RR rating, but it was pointless to compare my MC scores at the new distancing with the ones I'd gathered with the old distancing. Also, I'm not sure what you mean about the challenge display/set-up, but maybe I will in the morning =) (The "real" morning, g'night!) -- Voidious
About distancing. As your movement (surfing) gets closer to perfection you might notice that your MC scores against simple targeters, and I mean really simple, increase with closer distancing. For me a distance of 410 works.
The MC set-up. I want to see the detailed scores in the MovementChallenge2K6/Results table. Simple as that. =)
-- PEZ
You mean something like below? -- GrubbmGait
Bot Name | Author | Type | WSCA | WSCB | WSCC | TWSC | APMC | CFCA | CFCB | CFCC | TCFC | Overall Score |
CC 1.9.9.999 | PEZ | WS | 99.00 | 98.00 | 97.00 | 98.00 | 30.00 | 30.00 | 40.00 | 50.00 | 40.00 | 56.00 |
Yes, something like that. Especially that WSCC score. =) But the real bot names are clearer, I think. I will keep forgetting what bot CFCC is. -- PEZ
Links added for clarity (optional solution). -- Martin
Yes, the more links the merrier. -- PEZ
Hmm...
Bot Name | Author | Type | WSCBotA | WSCBotB | WSCBotC | WSC | APMC | Shadow | CassiusClay | FloodHT | CFC | Overall Score |
CC 1.9.9.999 | PEZ | WS | 99.00 | 98.00 | 97.00 | 98.00 | 30.00 | 30.00 | 40.00 | 50.00 | 40.00 | 56.00 |
What about that? I am trying size="-1" there, as it is a pretty wide table without it. -- Voidious
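A rough sketch of how a row in this layout could be produced from raw per-bot scores. This is not MCCalc; the function names are made up for illustration, and it assumes each subchallenge total is the plain mean of its reference-bot scores and that the overall score is the mean of the three subchallenge totals.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of scores."""
    return sum(values) / len(values)


def format_row(name, author, move_type, wsc, apmc, cfc):
    """Build one pipe-delimited results row.

    wsc and cfc are lists of three per-bot scores; apmc is a single score.
    Column order follows the sample header: WSC bots, WSC total, APMC,
    CFC bots (Shadow, CassiusClay, FloodHT), CFC total, overall.
    """
    wsc_total = mean(wsc)
    cfc_total = mean(cfc)
    overall = mean([wsc_total, apmc, cfc_total])
    cells = [*wsc, wsc_total, apmc, *cfc, cfc_total, overall]
    return " | ".join([name, author, move_type] + [f"{c:.2f}" for c in cells]) + " |"


# The illustrative (not real) scores from the sample row.
row = format_row("CC 1.9.9.999", "PEZ", "WS",
                 [99.0, 98.0, 97.0], 30.0, [30.0, 40.0, 50.0])
print(row)
```

Run on the illustrative scores, this reproduces the sample row, so the same numbers can be pasted straight into the results table.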
I have tweaked CC's (2eta) surfing segmentation to learn simpler guns faster. It seems I almost destroy the performance it has against good guns, though. Not good. But I just might try this version in the rumble just to see if it collects more or less rumble rating points this way. -- PEZ
I think the updated format is clear enough. But I suggest we wait with switching until we have some tools to make it really easy to post the results in that format. I think we should have
- a full-MC2K6 RoboLeague template
- a full zip of reference bots.
- a javascript to calculate the subchallenge scores and format the output for easy pasting
- a consensus that subchallenges are run over the same amount of rounds. 500 or 1000.
- I suggest 500 to make it less of a challenge for your patience to get a full result.
-- PEZ
500 is fine with me. They also need to be the same amount of rounds to work as a single RoboLeague template, of course. -- Voidious
Yes, that's why the consensus is needed. =) -- PEZ
500 is ok, if you want you can always run more seasons. -- GrubbmGait
You can just expect me to be "Captain Obvious" when I post first thing in the morning, OK? =) In fact, it might give us more clear cut results for the WaveSurfingChallenge2K6, as there will be larger point differences for tanks that take longer to learn the simple targeters. -- Voidious
You do run the WSC2K6 over 500 rounds right? It already demands that quite explicitly. =) -- PEZ
Ok, now you're just messing with me:
Run 1000 rounds against each of them.
? -- Voidious
Ummm, yeah... Oh well, I have been wanting to change that to 500 rounds for a while now, and I guess I tricked myself into thinking I had succeeded. =) Yes, with 500 rounds the WSC will discriminate better and ask you to learn a bit faster.
I think I know how to write something that helps us reformat the MCCalc results. Give me a try and then I'll let you know if I need help with that. Now, where's my Javascript book again. =)
-- PEZ
It gets quite messy when I try to add some javascript to reformat and stuff. Does someone else have an idea on how to do this as easily as possible for us? -- PEZ
It would be very easy to re-format it in Perl. We could put a checkbox on the MCCalc that says "format for MC2K6", maybe? -- Voidious
And feel free to e-mail MCCalc to me if you want me to do the formatting in Perl. I may not get to it tonight, as I will be out, but I should be able to do it tomorrow. -- Voidious
Are the scores against CC, Shadow, and FloodHT separate from the CFC and APMC? -- Alcatraz
A page with instructions to run MC2K6 would be great. -- Alcatraz
Those three bots are the CFC. The three subchallenges WSC, APMC and CFC are rated the same. The instructions for each subchallenge are on the subsequent pages. An overall instruction will be on the MovementChallenge2K6 root page as soon as a new version of MCCalc and a complete zip of the referencebots are available. -- GrubbmGait
Dudes, thanks to Void the MCCalc script now can do the reformatting of the output for this challenge. -- PEZ
I put together a .zip with all the reference bots and an XML template using 500 rounds: <A HREF="http://www.dijitari.com/void/robocode/mc2k6_reference_bots.zip">mc2k6_reference_bots.zip</A> (130 kb) -- Voidious
So it's agreed that we are changing the WaveSurfingChallenge2K6 to 500 rounds? We should mark all the current scores (like with an asterisk) to indicate that they were recorded using 1,000 round battles. -- Voidious
Hm, just looking at the bold values in the sub challenges, it seems to me that Shadow's overall score should be much closer to CassiusClay's. I get 58.79 as overall score for Shadow. Maybe there's a problem with that MCCalc script (if ABC used it to calculate the score)? --mue
Yikes, good catch - I left the final average as just the average of all the scores, instead of the average of the sub-challenge totals. I've e-mailed PEZ the corrected function for the script. -- Voidious
Isn't that giving your score against Che too much importance? I like the way it is now better, and not just because Shadow ends up in first place (CC's score would be 64.53). -- ABC
The idea, at least, is that movement against GuessFactor, PatternMatching, and simple targeters should be weighted equally. I guess I don't really have an opinion either way, personally. -- Voidious
Ok, I'll change it, no problem. -- ABC
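To make the difference between the two averages concrete, here is a small sketch using the illustrative (not real) numbers from the sample table row above. It is not the MCCalc code, only a demonstration of the two ways of averaging.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of scores."""
    return sum(values) / len(values)


# Illustrative per-bot scores, grouped by subchallenge.
wsc = [99.0, 98.0, 97.0]   # three simple targeters
apmc = [30.0]              # single pattern-matching reference bot
cfc = [30.0, 40.0, 50.0]   # Shadow, CassiusClay, FloodHT

# Average over all seven scores: the APMC bot counts as 1/7 of the total.
flat_average = mean(wsc + apmc + cfc)

# Average of the three subchallenge totals: the APMC bot counts as 1/3.
subtotal_average = mean([mean(wsc), mean(apmc), mean(cfc)])

print(f"{flat_average:.2f} {subtotal_average:.2f}")
```

Because the APMC has a single reference bot, its score counts for a third of the total under the second method but only a seventh under the first, which is the trade-off raised above.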
I've updated the script now with Void's latest changes. It should calculate the scores correctly now. Please test. -- Voidious 19:44, 10 June 2009 (UTC)
Works ok, but my best rumble movement isn't the best in this challenge though. I'll run some more seasons tonight just to be sure. -- ABC
Hey, nobody ever updated the WaveSurfingChallenge2K6 page to say 500 rounds! Who all has been running 500, and who running 1000? =) I know I've been running 500, and I know David has been running 1000, but I don't know about the others. Should we still change it at this point? -- Voidious
I seem to have deleted it in my cleanup, but if anyone is wondering... the only bots that had a # of rounds in their comment fields were
Krabby2 1.9g | Krabb | WS | 96.74 | 97.95 | 91.67 | 95.46 | 1 season 1000 rounds |
GrubbmGrb 1.2.1* | GrubbmGait | S&G | 89.69 | 95.88 | 93.77 | 93.11 | 5 seasons 500 rounds |