Anti-Surfer Challenge/Pre-Chat

From Robowiki
Revision as of 10:18, 3 August 2009 by Nat (talk | contribs) (→‎Voting: no changes at all)

Pre-Chat

Current Data & Links

Scores for Reference Bot Candidates

Challenger SHA66 SHA83 RDC HOR CHK PHX CYA DGT CUN WM HYD CC PMX PEAR DIA DKI KOM YP Avg Seasons
Dookious 1.572cMC 65.71 71.31 91.85* 86.60 71.91 68.66 79.81 68.12 90.26 85.33 88.65 75.21 83.57 72.58 64.64 52.53 90.56 78.01 76.09 50, * = 48
Komarious 1.842 55.48 66.81 83.00 74.33 58.03 59.47 72.77 51.99 76.10 69.76 65.84 56.31 73.78 74.42 55.97 53.87 75.79 68.00 66.21 50
DrussGT 1.3.8 66.12 74.92 88.93 83.59 66.78 71.68 77.38 71.91 87.94 83.47 80.63 70.34 78.88 80.00 65.14 58.35 86.32 73.78 75.90 50
Diamond 1.22MC 59.70 67.05 86.57* 74.59 71.25 68.02 81.37 66.52 77.52 75.40 73.25 74.40 73.20 80.97 56.25 58.01 85.17 70.12 71.34 50, * = 43
SaphireEdge AS07aMC 62.16 67.48 92.25 87.85 68.82 71.41 79.50 70.70 91.00 86.10 91.55 73.55 86.17 74.24 58.12 59.83 90.35 76.91 77.11 50
Gaff 1.42MC 70.77 73.51 93.50* 87.62 71.37 69.52 80.76 67.59 90.94 87.36 86.98 74.59 86.41 75.97 70.45 57.11 90.38 78.86 77.66 50, * = 5
Average: 63.32 70.18 88.06 82.43 68.03 68.13 78.60 66.14 85.63 81.24 81.15 70.73 80.34 76.36 61.76 56.62 86.43 74.28 74.41
CPU time: 1.48 1.22 3.85 1.58 1.96 2.75 1.51 1.84 1.0 4.49 1.66 1.51 1.78 1.84 1.78 4.19 1.08 2.89 2.13

Reference Bot Abbreviations
Abbr. Name Abbr. Name Abbr. Name Abbr. Name Abbr. Name
SHA66 Shadow 3.66d SHA83 Shadow 3.83c RDC RougeDC Classic HOR Horizon 1.03 CHK Chalk 2.5.Al
PHX Phoenix 1.025 CYA Cyanide 1.90 DGT DrussGT 1.3.8f CUN CunobelinDC 0.1 WM Wintermute 0.6
HYD Hydra 0.21 CC CassiusClay 2pi.08 PMX PulsarMax 0.8.9 PEAR Pear 0.62.1 DIA Diamond 1.22
DKI Dookious 1.573c KOM Komarious 1.88 YP YersiniaPestis 1.3.7

Download Links

Initial Discussion

While it's not that useful for increasing RoboRumble performance, a lot of us enjoy trying to improve our Anti-Surfer Targeting. It's a big part of claiming the PL throne. We have surfers in the TC2K6 and TC2K7, but this would be an opportunity to update and expand the test bed. My initial thoughts:

  • I actually think the TC2K7 surfers are all good candidates. I'd definitely vote to keep Shadow (classic, awesome DC-surfer) and CassiusClay (classic, awesome VCS-surfer). Hydra (related to WaveSerpent) is still the top DC surfer.
  • In the present RoboRumble, 6 of the top 10 and 11-12 of the top 20 (not sure about RougeDC) are using DC surfing, so we should have more of those.
  • How many reference bots? 10? 12? 15? I say start at 10 and if there are specific bots we really want, consider increasing it.
  • Classic TC rules OK?
    • 35 rounds or 500 rounds? I like 500, myself, but 35 seems logical.

We could nominate, vote, then maybe discuss and have follow-up votes for tweaking the final set, to make sure we get the desired mix.

Anyone else interested? =)

--Voidious 15:16, 10 July 2009 (UTC)

I prefer 35 rounds. 10 bots should be enough. Surfers are slow, and I don't want to let my computer run for 3 days to run 15 seasons of this challenge =) We should have 5 DC surfers and 5 VCS surfers. It would be good if we could have the not-yet-existing mixed surfer. I think we can ask Skilgannon to add DC to DrussGT. I think we should have Engineer as a NN surfer too. » Nat | Talk » 15:30, 10 July 2009 (UTC)

Engineer is interesting, but it's not open source, and I'm not sure it would be a good benchmark because it's so unique. I'd personally rather have the real DrussGT than a DC version, but Wintermute seems a fine candidate. While I generally aim for fast-executing reference bots, in this case I'd vote for just getting the right reference bots and accepting the CPU cost. Benchmarking Anti-Surfer Targeting takes time, that's life. =) --Voidious 15:46, 10 July 2009 (UTC)
I nominate both the VCS and DC DrussGT, not just DC DrussGT; it should yield an interesting yet slow wave surfing movement that should not be able to pair with any guns =) Actually I'm fine with the current-generation wave surfers, I just don't want to have 10 reference robots that are as slow as Cigaret. » Nat | Talk » 17:25, 10 July 2009 (UTC)

As a testbed I propose:

  • DC:
    • YersiniaPestis (very hard to hit, has multiple weighting schemes, flattener, also quite slow, I'm not sure about movement alone though)
    • Shadow (hard to hit, has flattener, runs fast)
    • Hydra (possibly vulnerable to antisurfer guns?)
    • Horizon or RougeDC (intermediate DC implementation)
    • CunobelinDC (very simple DC implementation)
  • VCS:

Perhaps cull out Hydra and Garm, but I think that covers things fairly well. Feedback? --Skilgannon 16:38, 10 July 2009 (UTC)

I prefer Horizon over RougeDC if we have Garm, otherwise I'd prefer RougeDC since I think we need at least one wave surfer with precise intersection, though I'm not sure if both Garm and RougeDC have it in movement. But perhaps we can have Wintermute instead of Garm, unless we don't want to have 3 robots from Skilgannon. I myself think Phoenix is slower than Dookious, but I only tested with the original version, not the movement-only one. I wonder if we can have a KDTree version of CunobelinDC? I think at least it will be faster, but I'm not sure if it will affect the score. » Nat | Talk » 17:25, 10 July 2009 (UTC)

(Edit conflict) Most of those bots I agree with 100%, though I still say we should vote in the end. Some responses:

  • I like RougeDC over Horizon because it's so much stronger in PL.
  • I always thought Komarious would be a perfect reference bot, but at this point I think she's just too weak. I like CunobelinDC since it's strong and probably a super-fast DC surfer, but my vote would be for that to be the weakest reference bot. Dooki's Main Gun way outperforms his Anti-Surfer Gun against Komarious.
  • I know Dookious is slow, and in the past I've always agreed that he therefore shouldn't be a reference bot. This time I'm not sure. The PL is dominated by 4-5 bots and he's one of them. I'll have to benchmark the speed, but I thought it was in the same ballpark as Shadow (~half as fast as CC).
  • We'd need a movement-only Phoenix, but I think David would provide that. (I may even have one already, I'll check.)
  • Diamond could also be worth considering, he's actually ahead of Hydra in PL (if barely =)).
  • Looks like Garm is not open source, dang. Maybe PulsarMax? I think he can do movement-only from the .properties file, we could repackage a TC version.
  • I personally think we should really push for the strongest bots possible in this. CPU-taxing or not, those are the movements you need to hit to have a strong Anti-Surfer gun. I'd love to have 9 of the strongest possible movements, then CunobelinDC as the only intermediate one. Just my 2 cents on that.

--Voidious 17:43, 10 July 2009 (UTC)

Please note that the current Robocode 1.7.3 isn't extracting the .properties file due to a bug. Just FYI.

I wonder if we can have both Dookious and Phoenix since we don't have Komarious now (I just found out how badly she does in PL =)). I fully agree with Voidious' $0.02. We can have YersiniaPestis, Shadow, RougeDC, Hydra? (I'd prefer Gauss here, but I'm not sure if its strength comes from its gun or its movement), and CunobelinDC for DC, and DrussGT, Dookious, Phoenix, CassiusClay and another bot for VCS. I'd say it would be PulsarMax though, because Ascendant and WaveSerpent are too easy to hit. I'd say we lack good VCS surfer bots right now ;-) » Nat | Talk » 18:44, 10 July 2009 (UTC)

I think this is turning into more of a PL-challenge. If we want anti-surfer then we need to include more intermediate/basic surfing bots, possibly even a BasicSurfer, but bots that cover all lengths of the surfing spectrum. If we want a PL challenge we should probably just take the top 10 bots from the PL (maybe excluding Ascendant because his score is due to gun, not movement?).--Skilgannon 22:49, 10 July 2009 (UTC)

Hmm... What you say makes sense, of course. But I almost don't think of a bot like Komarious as requiring "Anti-Surfer Targeting" to hit -- normal (ie, non-decaying) learning guns hit her just fine. But maybe I'm being narrow-minded, and a truly effective anti-surfer technique should work well against her, too. Interested to hear some others weigh in.

On a different note, holy crap, Dooki's movement sure is slow. Here's the times on my system in 500 rounds against Komarious's gun:

  • Dookious 1.541 - 11:55
  • CassiusClay 2pi - 4:22
  • Shadow 3.66d - 5:50

Maybe we should leave him out after all...

--Voidious 23:21, 10 July 2009 (UTC)

Some interesting bots proposed above. Personally I don't care so much about execution speed, as long as the reference set is limited to 10 bots and battles last only 35 rounds. Or include some more basic (faster) surfers and expand the group to 15? I'd suggest increasing the number of seasons to 30-50, since even at 15 there's at least 0.5 percent variability. I'd also like to see Dookious in the set as that's a top surfer. The others should be as up-to-date as possible, so Shadow 3.83(c?) rather than 3.66 for example. Looking forward to it! --Darkcanuck 02:33, 11 July 2009 (UTC)
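To put rough numbers on the seasons question, here is a minimal Python sketch. It assumes (this is an assumption, not something measured on this page) that seasons are independent and that the spread of the averaged score shrinks like 1/sqrt(seasons). The 0.5 figure at 15 seasons is taken from the comment above; the function name is purely illustrative.

```python
import math

def scaled_variability(known_spread, known_seasons, target_seasons):
    """Scale a score spread seen at one season count to another,
    assuming the spread of the average shrinks like 1/sqrt(seasons)."""
    return known_spread * math.sqrt(known_seasons / target_seasons)

# Starting point from the discussion: about 0.5 percent at 15 seasons.
for seasons in (15, 30, 50):
    spread = scaled_variability(0.5, 15, seasons)
    print(f"{seasons} seasons: roughly {spread:.2f} percent")
```

Under that assumption, going from 15 to 50 seasons roughly halves the noise, which fits the suggestion to run 30-50 seasons.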

Is there any difference between Shadow 3.83 and Shadow 3.83c? Anyway, I think we should have Gauss in as a second GT surfer. I still want Dookious in, it learns faster than Phoenix. I usually beat Phoenix and Shadow, but not Dookious, in the first round. And I think Dookious is probably the only surfer that uses non-firing waves.

Do you think we should disable the flattener for the reference bot? » Nat | Talk » 04:31, 11 July 2009 (UTC)

I think the flattener should be disabled, and that testing against flatteners should be a separate challenge or something. I say this because I consider the problem of hitting a flattener to be very different from the problem of hitting a normal surfer. --Rednaxela 17:58, 11 July 2009 (UTC)

While I agree it's a different problem (or sub-problem) in some respects, I'd vote to keep them on because I think it's an important part of hitting an advanced surfer. --Voidious 19:22, 11 July 2009 (UTC)

If we have it on, I'd suggest either always on, or always off, for any particular bot. Or perhaps one copy of the bot with it off and one copy with it always on. The reason I say this is that it will make the results more consistent. Otherwise, you can have the scenario where an improvement that allows you to hit a surfer better may make it decide to turn its flattener on and suddenly kill your score. While surfers ideally should always have the flattener on if it would help them, most have a bias towards disabling it in order to avoid lucky shots by weak bots triggering it. With the sudden jumps possible when bots enable flatteners at funny times, scores could become too difficult to interpret once your gun passes the boundary at which the enemy decides to use its flattener. --Rednaxela 20:41, 11 July 2009 (UTC)

I agree it will create that possibility, I just can't get past the fact that we'd basically be ignoring a real problem faced in trying to target surfers. Optimizing against flattener or non-flattener Dookious (or even both, separately) may not increase your accuracy against the real Dookious. There is also the fact that we could not do this for Shadow or PulsarMax. But I'm cool with whatever is decided, maybe we can vote on this, too, if there's a lot of disagreement. --Voidious 21:21, 11 July 2009 (UTC)

(edit conflict) I'm a big advocate of letting the bots run as they would normally. To get good targeting performance against a surfer, you really need to consider their whole behaviour, which may include flatteners and the decision points used for turning them on and off. --Darkcanuck 21:25, 11 July 2009 (UTC)

The problem with the bots running as-is is that it may cause people to over-optimize towards making the gun just inaccurate enough against it to not trigger a bot's flattener. That kind of optimization will only really help against that very specific testbed bot, which leads to a misleading situation that I consider a very, very bad thing. I really think the optimal solution would be, for every bot with a flattener, to include separate battles for 1) natural behavior, 2) flattener always off, and 3) flattener always on. Having all three numbers allows one to consider their whole behavior, while making the resulting scores more clearly indicate what's happening. Of course, this does increase how much time tests would take, but I'd consider it worthwhile. --Rednaxela 23:46, 11 July 2009 (UTC)

While I still think we should just use the natural versions, I do recognize that you're making some important points here, btw. Some thoughts in response:
  • If you're optimizing against one bot, you may indeed find your best score (for now) is to hit less and not trigger the flattener. But I don't see how you could do that against the whole test bed at once, so I don't think you could optimize your overall score that way with this challenge. (Even if you could, though, I still see that as the reality of targeting these bots.)
  • I quite agree, however, that one could gain valuable information from the flattener-only and non-flattener versions of reference bots. It would be cool to make them available for private testing even if we use the natural versions in the challenge.
  • Maybe we could even whip up some common code to implement in the open source bots that would log a few details on flattener use? Like when it was enabled/disabled, hit percentages for each, etc. It wouldn't be hard. Actually that seems like a cool idea for TC reference bots, in general, to implement a common logging API. After the TC runs, you could look at the .log for each bot to glean some info.
--Voidious 17:11, 12 July 2009 (UTC)
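A rough sketch of what such a common flattener log might record, for illustration only. The reference bots are Java, so a real version would be a Java class writing to the bot's data file; this Python version only shows the bookkeeping. Every name here (`FlattenerLog`, `set_flattener`, `record_enemy_shot`, `hit_rate`) is hypothetical and not taken from any existing bot.

```python
from dataclasses import dataclass, field

@dataclass
class FlattenerLog:
    """Hypothetical per-battle log of flattener use for a reference bot."""
    enabled: bool = False
    # (round number, new state) for every on/off switch
    transitions: list = field(default_factory=list)
    # flattener state -> [enemy hits, enemy shots] while in that state
    shots: dict = field(default_factory=lambda: {False: [0, 0], True: [0, 0]})

    def set_flattener(self, round_number, enabled):
        # Only record actual changes of state.
        if enabled != self.enabled:
            self.enabled = enabled
            self.transitions.append((round_number, enabled))

    def record_enemy_shot(self, hit):
        counts = self.shots[self.enabled]
        counts[0] += int(hit)
        counts[1] += 1

    def hit_rate(self, enabled):
        hits, fired = self.shots[enabled]
        return hits / fired if fired else None

# Made-up shot outcomes, just to exercise the log.
log = FlattenerLog()
for hit in [True, False, False, False, True]:
    log.record_enemy_shot(hit)
log.set_flattener(9, True)
for hit in [False, False, False, True]:
    log.record_enemy_shot(hit)
print(log.transitions, log.hit_rate(False), log.hit_rate(True))
```

Dumping the transitions and the two hit rates after each battle would show when the flattener switched on and how the gun did on either side of the switch.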

I believe this is the Anti-Surfer Challenge, not the Anti-Flattener Challenge. If we want to test the accuracy of the anti-surfer gun, we shouldn't leave the flattener on. While I agree that normal advanced surfers will trigger their flatteners at some point, I think it should then be called 'Anti-Adaptive Movement Challenge' rather than 'Anti-Surfer Challenge'. » Nat | Talk » 02:53, 12 July 2009 (UTC)

Since when is using a flattener not "Wave Surfing"? You're still surfing waves, just using a different formula for deciding what's a dangerous point on the waves: where you've gone before instead of (or in addition to) where you've been hit. --Voidious 03:34, 12 July 2009 (UTC)
I know, they still surf waves. But it requires a different targeting technique to hit them. I don't really think that a surfing flattener and a random-based flattener differ much. But the current generation of Anti-Surfer targeting is designed for hit-surfing, not visit-surfing, and most surfers start with hit-surfing, so I use the term Wave Surfing for just hit-surfing. » Nat | Talk » 04:09, 12 July 2009 (UTC)
The trouble is, if you can't hit a surfer with the flattener enabled, it's not going to help your PL score. I suggest just leaving the bots how they are, otherwise we'll probably never get around to doing the challenge =) --Skilgannon 10:32, 12 July 2009 (UTC)

Is it time to move onto actual voting? Should I do tests with each nominated bot (timing 35 rounds, scores against a few guns, etc.) before actual voting? Do we need to discuss flattener vs no-flattener some more first? (I think having flattener-only/no-flattener versions of reference bots available for private testing could be really useful, if anyone missed that conversation branch above.) I confirmed that PulsarMax supports challenge mode from the .properties file, btw.

So here's an updated list of bots we could vote on, including some new additions from me. If anyone wants to nominate any other bots, feel free to add them below. I'm not sure we're all in agreement about the range of difficulties we want, but I believe that will work itself out with the voting.

DC VCS
  • Leaving out Gauss and WaveSerpent because I think they are extremely similar to YersiniaPestis and Hydra, respectively.
  • I think Pear is very slow, but if we are short on VCS bots, I know it is open source and a pretty hard-to-hit movement.
  • YersiniaPestis is really slow over 500 rounds (nearly an hour on my system), but it slows down as rounds go on, so I'll have to test 35 rounds specifically.

--Voidious 14:37, 13 July 2009 (UTC)

Nice table =) I think we should have Gauss as a second GT surfer. Or are there any other GT surfers in the list? I think we should wait a day or two before we start voting. » Nat | Talk » 16:17, 13 July 2009 (UTC)

I don't think Gauss is Go-To, the Gauss/VersionHistory says True Surfing. SilverSurfer is the only other Go-To bot I know of. I'm OK with adding SS to the nominated bots if you want (though I probably won't vote for him). --Voidious 18:44, 13 July 2009 (UTC)

Wow, there's been a lot of activity here. About some of the things discussed here:

  • Gauss is very similar to YersiniaPestis, it has some new features but it performs worse anyway, I called them both True Surfing because at the end of every tick the movement comes to a decision of either going CW or CCW around the next wave and not turning directly to some point.
  • About the flattener issue, I think the most realistic results come from the bots being in their normal behavior. On the other hand YersiniaPestis' flattener is not an on/off flattener, it has some weight, forcing it to 0 can simulate a flattener always off, but the normal and always on are not separable.

--zyx 20:27, 13 July 2009 (UTC)

Oh, I'm confused. The GT Gauss is just in zyx's mind (please reconsider your user page, zyx). I don't think we need SilverSurfer, we need the newer surfers more =) For the flattener issue, I think the current solution is 'leave it alone', correct? » Nat | Talk » 12:26, 14 July 2009 (UTC)

I got a movement-only Phoenix from David - he called it 1.025TC because he couldn't guarantee it was from any particular version, but it's a recent Phoenix. I was next going to run some TC tests against the nominated bots, to see some scores and execution speed before voting, but my CPU time has been occupied with the User:Voidious/Robocode Version Tests instead. --Voidious 15:16, 21 July 2009 (UTC)

I believe I can help you run these tests, I just can't provide any reliable data on execution speed because my computer somehow does not have stable performance. All I need to know is which tests you want and which Robocode version to use. --Navajo 17:40, 21 July 2009 (UTC)

I hate to make the new guy run tests, I'm sure you have a lot of stuff to explore. =) But I'll post some of the TC-mode bots when I get a chance, so anyone can run battles if they want. I definitely recommend RoboResearch for running batch battles, if you haven't checked it out yet. --Voidious 23:37, 21 July 2009 (UTC)
I do have a lot of stuff to explore, but right now nothing that uses much CPU. I've finished my Robocode tests for now and I will probably not have any new tests to run until next month. So far I've only used RoboLeague for this; all my attempts to run RoboResearch have failed. I don't know how to use svn, my only computer with Linux is too slow, and I don't know how to run or compile Java on Windows. I only use Java for Robocode, for everything else I use C. --Navajo 00:05, 22 July 2009 (UTC)
Cool, much appreciated. I posted a .zip with .class files of RoboResearch + my Melee support at Talk:RoboResearch#Melee, here: void_roboresearch_melee_01.zip. That would save you the SVN and the compiling if you want to try it. --Voidious 14:40, 22 July 2009 (UTC)
This link seems to be broken, but I've downloaded the zip file you made available in the Getting started instructions section of that page, and for the first time I've gotten RoboResearch running. Thank you, this was very helpful; that feature of running 2 threads at the same time is great. Now I only need the TC-mode bots to help you run the tests. By the way, will it be 35 rounds? --Navajo 21:40, 22 July 2009 (UTC)
Yep, 35 rounds. 30-50 battles gives a decently accurate score for a given bot. Strange, that link works for me... dijitari.com resolves to 66.96.145.105 for me when I ping it, maybe it's a DNS issue? --Voidious 04:35, 23 July 2009 (UTC)

Pre-vote Testing

Ok, I put together a few of the challenge versions of the nominated bots. Sorry it's not all of them, but it's a start.

I know 3 of those guns are mine. :-P But DrussGT is the top DC gun; Dookious is close to the top VCS gun (SaphireEdge isn't available for download, right?); Diamond is a good DC gun and easy for me to whip up; and Komarious I already had, and it's a good general purpose but bad anti-surfer gun. I think DrussGT and Komarious are probably the most important to test with among those four. Any other thoughts on pre-voting tests you guys want to see?

--Voidious 04:35, 23 July 2009 (UTC)

Actually, since I don't have any Robocode Alpha tests lined up, I'll run some battles with DrussGT gun vs the above TC candidates overnight... --Voidious 05:19, 23 July 2009 (UTC)
Challenger Sha Chk Phx GT CC PMx Dia Dki YP Seasons
DrussGT 1.3.8 74.92 66.78 71.68 71.91 70.34 78.88 65.14 58.35 12.59 50.0

I must've screwed up the YersiniaPestis TC version, I'll check it out... (Might be time to organize this page with sections.) --Voidious 14:31, 23 July 2009 (UTC)

YersiniaPestis has a config file in its data directory. If the test property is set to mc it will be movement only, and if set to tc it is gun only with firepower 3. --zyx 14:55, 23 July 2009 (UTC)
Yeah, I set it and watched it working in "MC" mode. I guess Robocode didn't package the data dir? Shadow and PulsarMax I did by hand with zip/unzip... doh. I'll fix tomorrow if someone else doesn't, will be afk this evening. --Voidious 15:02, 23 July 2009 (UTC)
Is DrussGT not in the TC2K7? That's a nice score vs Shadow... --Darkcanuck 15:37, 23 July 2009 (UTC)
Challenger SHA CHK DIA PHX GT CC PMX DKI Seasons
Komarious 1.842 66.81 58.03 55.97 59.47 51.99 56.31 73.78 53.87 50

RoboResearch didn't even run YersiniaPestis. About the times, every bot took between 1:00 and 1:30 minutes to run each battle, except for Dookious and Phoenix. Dookious took between 3:00 and 3:30 minutes and Phoenix between 5:00 and 6:00 minutes. I'm going to run battles with Diamond now. By the way, does RoboResearch save those results anywhere? If I have to stop running a test, can I start it again later from where I stopped? --Navajo 20:12, 23 July 2009 (UTC)

Thanks much for running those battles! The results are stored in a SQL database, so yes, you can stop and start all you like. If you ever want to clear the database, use database_gui.sh (or in Windows, rename it .bat and replace : with ; within it, I think) and do "delete from robo_research.battles;" and "delete from robo_research.bots;" while the database is running (database_server.sh, which you may know already if you tried 2 threads). --Voidious 20:26, 23 July 2009 (UTC)
(edit conflict) I uploaded a YersiniaPestis movement-only version and put the link above in Voidious' list. Try that one and let me know if you still have problems; I know that some versions of Robocode are picky when the properties file has a different robot version than the one in the file name. --zyx 20:28, 23 July 2009 (UTC)

Voidious: To reply to your note about SaphireEdge, no, it's not currently uploaded anywhere, but I could package it up and upload it if you're interested :) Also, while RougeDC was originally a proposed challenge bot, I'd be against it being a challenge bot because I'm really not proud of its movement. Or then again... maybe its hilariously flawed design would give samples of surfer weakness for anti-surfer guns to take advantage of? Haha, I don't know. --Rednaxela 23:36, 23 July 2009 (UTC)

Rednaxela did you modify an old version of this page, or did you remove the posts and changed the link intentionally? --zyx 23:43, 23 July 2009 (UTC)
Stupid accident. Was looking at an old diff and hit edit. --Rednaxela 23:45, 23 July 2009 (UTC)
Challenger SHA CHK DIA YP PHX GT CC PMX DKI Seasons
Dookious 1.572cMC 71.31 71.91 64.64 78.01 68.66 68.12 75.21 83.57 52.53 50.0
Time 1:38 2:27 2:24 3:27 6:03 2:18 2:02 2:20 4:00

YersiniaPestis ran perfectly. Those time are from the battles of the last 2 seasons. --Navajo 12:51, 24 July 2009 (UTC)
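For anyone budgeting CPU time, here is a small Python sketch that turns the per-battle times in the table above into a total run time. It assumes one battle at a time (no second RoboResearch thread) and that these times are typical for every season; the helper names are made up for this example.

```python
def to_seconds(mmss):
    """Convert an 'M:SS' string into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# Per-battle times copied from the Dookious 1.572cMC table above.
BATTLE_TIMES = {
    "SHA": "1:38", "CHK": "2:27", "DIA": "2:24", "YP": "3:27", "PHX": "6:03",
    "GT": "2:18", "CC": "2:02", "PMX": "2:20", "DKI": "4:00",
}

def season_seconds(times):
    """Time for one season: one 35-round battle against each reference bot."""
    return sum(to_seconds(t) for t in times.values())

def total_hours(times, seasons):
    return season_seconds(times) * seasons / 3600

print(f"one season: {season_seconds(BATTLE_TIMES) / 60:.1f} minutes")
print(f"50 seasons: {total_hours(BATTLE_TIMES, 50):.1f} hours")
```

Dropping the slowest bot from the dictionary and rerunning shows how much a single slow surfer adds to a full 50-season run.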

Phoenix is slower than Dookious. =( Why don't we use Firebird instead? (just joking =)) Shadow's the fastest??? I can't believe it! » Nat | Talk » 13:33, 24 July 2009 (UTC)

Doesn't surprise me at all that Shadow is the fastest. Though I never timed it, Shadow had always just seemed very fast in my experience. --Rednaxela 13:20, 24 July 2009 (UTC)

Well, I always thought that Druss was faster than Shadow. » Nat | Talk » 13:35, 24 July 2009 (UTC)

Thanks again for the test results, Navajo! I'll try to put together TC versions of the other nominated bots today and run some more of the missing battles. Any other guns or tests you guys want to see?

@Rednaxela: Sure, I'd love to see its scores! But if you don't want SaphireEdge in the wild yet, I think Dooki's gun is an OK substitute for "super strong VCS gun". =) I figured having the very best guns plus a couple weaker guns (like Komarious) would let us see which bots are hardest to hit, and also which bots require some anti-surfer elements to hit better. For example, even if the top guns hit GT and CC well, Komarious' low scores indicate you need some anti-surfer element to hit them. But for PulsarMax, even Komarious hits him pretty well, so maybe he's not a good one.

--Voidious 15:52, 24 July 2009 (UTC)

Nah, I don't mind SaphireEdge being in the wild. Only reason it's not in the rumble is I don't have a movement that's decent enough in my own judgement. When I get home from work I'll upload SaphireEdge AS06 and start running some seasons with it against these reference bots. --Rednaxela 17:36, 24 July 2009 (UTC)

Ok, I made TC versions of the rest of the bots except RougeDC. Rednaxela, I'd respect your choice as to whether or not you want it considered, but I'd add that it may still be stronger than others that we are considering. Here's the new bot .jar's:

I feel like we should have that whole list of .jar's and a combined table of results somewhere, but I'm not sure where, so just leaving it for now.

--Voidious 19:43, 24 July 2009 (UTC)

Oh, and I'm running DrussGT vs the above bots and YersiniaPestis... --Voidious 19:51, 24 July 2009 (UTC)
Challenger SHA CHK DIA PHX GT CC PMX DKI Seasons
Diamond 1.22MC 67.05 71.25 56.25 68.02 66.52 74.40 73.20 58.01 50.0
Time 1:47 2:39 2:40 6:40 2:18 2:29 2:12 4:10

YersiniaPestis is not there because when I started to run it yesterday I still didn't have the jar. I probably won't have any new results this week, I'm finishing the development of a new part of my still unnamed bot and will spend the weekend debugging and running tests on it, but I think (or at least I hope) this won't take long. --Navajo 20:03, 24 July 2009 (UTC)

@Voidious: Hmm, well, including RougeDC may be interesting even if I'm not terribly proud of its movement. As I did note, its flaws may be decent examples of flaws for anti-surfer guns to take advantage of. I'll also package that up as a reference bot when I get home from work. So I estimate I'll have SaphireEdge and RougeDC packages uploaded for here in... 3 to 4 hours from now. :) Also... wow at some of those Shadow scores. Any idea why Dookious is scoring far better against Shadow in the above test than it did in Targeting Challenge 2K7/Fast Learning Results? --Rednaxela 21:11, 24 July 2009 (UTC)

Well, they are different versions of both Shadow and Dookious. I don't think much changed in Dooki's gun between those versions, but maybe Shadow changed a bit... (Definitely different Robocode versions, too, though I hope that's not it.) --Voidious 21:55, 24 July 2009 (UTC)
It's probably because of different Shadow versions. 3.83a should be slightly stronger movement wise. Unfortunately I don't remember exactly what movement version I used in that one, I'll have to make some experiments in the rumble one of these days... --ABC 22:56, 24 July 2009 (UTC)
In that case, perhaps we should use a stronger Shadow version like 3.66d (the TC2K7 one) in this challenge? Or maybe both even? --Rednaxela 23:32, 24 July 2009 (UTC)
Yep, agreed. I say we try 3.83a and 3.66d, it's not like it takes that long to test them. --Voidious 00:22, 25 July 2009 (UTC)

Some more results, I will integrate into top table later. I'll kick off Komarious against these bots next.

Challenger Hor Cya Cun WM Hyd Pear Kom YP Sea
DrussGT 1.3.8 83.59 77.38 87.94 83.47 80.63 80.00 86.32 73.78 50

--Voidious 00:22, 25 July 2009 (UTC)

More Komarious scores. Now running Dookious...

Challenger Hor Cya Cun WM Hyd Pear Kom YP Sea
Komarious 1.842 74.33 72.77 76.10 69.76 65.84 74.42 75.79 68.00 50

--Voidious 03:39, 25 July 2009 (UTC)

Alright, RougeDC ClassicTC and SaphireEdge AS07aMC are uploaded (see links above). As a warning, RougeDC's movement is EXTREMELY slow in challenge mode, because it always tries to surf ALL waves at once, and with enemies shooting power 3.0 bullets... ouchie... Should I make another version that limits to surfing two waves? Also, you may notice that I never posted the SaphireEdge AS07a results before. I'm not sure why but I found it in my RoboResearch data and it's better with anti-surfer than AS06, though at the cost of a slight score reduction against random movers. I'm going to run some SaphireEdge AS07a seasons overnight tonight, so we'll see how that compares. --Rednaxela 05:04, 25 July 2009 (UTC)

Hm, the seasons so far are showing the following: 1) RougeDC is the MOST predictable surfer of the whole lot here!, 2) compared to other sample guns, SaphireEdge seems to score poorly against the strong movements (Shadow, Dooki, etc.), but exceptionally well against the weaker ones (e.g. PulsarMax, Horizon) --Rednaxela 15:27, 25 July 2009 (UTC)

More Dookious scores, running Diamond now... I'll fill in the missing RougeDC scores after that.

Challenger Hor Cya Cun WM Hyd Pear Kom Sea
Dookious 1.573c 86.60 79.81 90.26 85.33 88.65 72.58 90.56 50

--Voidious 19:53, 25 July 2009 (UTC)

Here are some results for SaphireEdge AS07aMC. RougeDC was being a bit troublesome so I'm not up to 50 seasons with that yet. Interesting results I think.
(removed big table, since its data is shown above more efficiently)
--Rednaxela 06:59, 26 July 2009 (UTC)

Hm, interesting, out of everything on the table except RougeDC and the older Shadow (as only SaphireEdge has those results yet), Komarious gets an average score of 65.83, DrussGT 75.69, Dookious 76.73, and SaphireEdge 77.10. I find it kind of funny how similar the average scores of those top 3 guns are, yet how different the bots they're strong/weak against are. --Rednaxela 14:33, 26 July 2009 (UTC)

More Diamond scores. Still missing RDC and SHA66 for now...

Challenger Hor Cya Cun WM Hyd Pear Kom YP Sea
Diamond 1.22 74.59 81.37 77.52 75.40 73.25 80.97 85.17 70.12 50

--Voidious 18:22, 26 July 2009 (UTC)

Gaff 1.42MC scores now added for comparison. It sure does a nice job against Diamond. --Rednaxela 00:16, 31 July 2009 (UTC)

I've run some seasons using 3 versions of my own in-development bot, one normal version, one with anti surfer gun only and one with no anti-surfer gun at all. Here are the results:

Challenger SHA HOR CHK CUN WIN HYD DIA YP PHX CYA GT CC PMX PEA DKI KOM Seasons
1.109 72.09 87.80 72.46 89.84 87.10 87.20 64.85 78.65 71.75 80.83 69.72 73.30 81.29 81.79 61.90 87.39 50.0
1.110AS 67.97 89.62 73.00 93.11 87.28 89.05 67.77 79.09 65.75 81.33 68.14 76.57 84.65 78.23 59.02 89.00 50.0
1.110nAS 71.64 79.63 70.08 85.10 81.04 76.40 60.50 75.24 71.21 78.96 69.63 71.25 77.51 83.10 60.93 82.82 50.0

The only thing clear to me is that the fact that these bots enable their flatteners when their hit percentage is over a certain amount just makes it more difficult for us to know if there was any improvement in targeting. Of course, if you are thinking of improving PL score then you must account for this, but if the idea is to measure PL performance of guns, I don't really think we should have bots like Komarious and CunobelinDC in the challenge, for in the general rumble they are not exactly the greatest challenges one may find. --Navajo 03:19, 31 July 2009 (UTC)

Interesting results. So the averages are 78, 78.1, and 74.69, respectively - your AS gun comes out on top, barely ahead of the VG. I'd be interested to see how Dooki's guns do on their own, maybe I'll run them next. Why do you say it's clear that the flatteners cloud the results? I'm not surprised your non-AS gun does better against some bots, since some combinations of gun vs movement just match up differently than others... Also, wow, you'd have the top score against Dookious. Nice. =) --Voidious 03:51, 31 July 2009 (UTC)
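For anyone who wants to double-check those averages, here is a minimal Python sketch. It only copies the three rows of Navajo's table (with decimal commas written as points) and averages each row; the variable and function names are just illustrative.

```python
# Scores copied from Navajo's table above, in its column order,
# with decimal commas written as decimal points.
scores = {
    "1.109": [72.09, 87.80, 72.46, 89.84, 87.10, 87.20, 64.85, 78.65,
              71.75, 80.83, 69.72, 73.30, 81.29, 81.79, 61.90, 87.39],
    "1.110AS": [67.97, 89.62, 73.00, 93.11, 87.28, 89.05, 67.77, 79.09,
                65.75, 81.33, 68.14, 76.57, 84.65, 78.23, 59.02, 89.00],
    "1.110nAS": [71.64, 79.63, 70.08, 85.10, 81.04, 76.40, 60.50, 75.24,
                 71.21, 78.96, 69.63, 71.25, 77.51, 83.10, 60.93, 82.82],
}


def average(values):
    """Plain arithmetic mean over all reference bots, rounded to 2 places."""
    return round(sum(values) / len(values), 2)


averages = {version: average(row) for version, row in scores.items()}

for version, avg in averages.items():
    print(f"{version}: {avg:.2f}")
```

This should land on the 78, 78.1, and 74.69 quoted above, with every reference bot weighted equally.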

Very nice Navajo! I'm assuming you were testing against Shadow 3.83 and not 3.66? By the way, here are the averages for all bots tested on that subset of reference bots:

Challenger Avg
Komarious 1.842 65.83
Diamond 1.22MC 72.07
Navajo 1.110nAS 74.69
DrussGT 1.3.8 75.69
Dookious 1.572cMC 76.73
SaphireEdge AS07aMC 77.10
Navajo 1.109 78.00
Gaff 1.42MC 78.09
Navajo 1.110AS 78.10

I hope to see your bot in the rumble soon, Navajo :)

About bots for the Anti-Surfer Challenge, I believe that having at least one simpler reference bot like CunobelinDC or Komarious is very important, because this isn't about PL performance only, it's about strength against surfers. Scores against CunobelinDC, for instance, are fairly varied, and those sorts of variations are useful information. What we should try to weed out are 1) redundant bots, which show similar patterns of what guns they are strong/weak against (like Horizon/CunobelinDC or Chalk/CC), 2) VERY slow bots like RougeDC Classic, and 3) bots that don't represent the best we've seen from the bot (get rid of Shadow 3.83, in favor of Shadow 3.66). --Rednaxela 04:25, 31 July 2009 (UTC)

I don't think my bot will be in the rumble so soon. For this I would need to finish the movement and come up with a name (I can't release it as test1.xxx), and with my classes beginning soon I don't think this will happen this year. Anyway, I agree that we should try to weed out RougeDC Classic and change Shadow 3.83 for 3.66 (by the way, I used 3.83 in my tests), but I'm not sure about the others. I haven't analyzed it yet, but I agree with 1, we just need to identify these bots. Voidious, when running battles against Phoenix, the non-AS version does better than the AS version, but this is because with the non-AS version Phoenix rarely turns its flattener on, or does so too late. With the AS version, on the other hand, Phoenix turns its flattener on before round 10, and the TC score falls from 80 to 65. I do believe that this is probably what explains the score differences between these two versions against Dookious and Pear too, so that is why I believe that the flatteners may be clouding the results. --Navajo 05:00, 31 July 2009 (UTC)

Those are great scores! You should really release your bot, it doesn't have to be perfect and you can finish it up over the next week/month/year/decade. --Darkcanuck 15:13, 31 July 2009 (UTC)

Ah, I didn't realize you'd observed it specifically. Still, it's clearly a real problem for our guns and one that we've all basically just ignored until now. It may be possible to detect when a bot has started its flattener and adapt accordingly; figuring that out might really blow the roof off our scores in this challenge, and in the rumble!

Listing relative execution speed in that table would be useful, too. Navajo's the only one that's posted any of that info so far (thanks!). Maybe I'll run them all against Komarious (probably the fastest gun) and post the times. And I pretty much agree with all Rednaxela's points, except that I might still like both CC & Chalk because they're so strong.

So we can start voting soon? Or is there more to test / discuss? I would just like to have execution speed comparisons before making my votes...

--Voidious 14:43, 31 July 2009 (UTC)

Let's vote! I'm not so concerned about execution speed. Those average scores have me itching to test my latest version, and now that the heatwave is subsiding I can finally start running RoboResearch again... We're selecting 10 reference bots, correct? I've narrowed my choices down to 11 (out of 18) already. --Darkcanuck 15:13, 31 July 2009 (UTC)

Yes, I think 10. Should we have two votes, VCS and DC, and each person votes for 5 in each table? Alternatively, we could just vote on 10 bots, and/or say "vote for however many you like". --Voidious 16:15, 31 July 2009 (UTC)
Which bots use which? It might be simpler to allow voting for 10 -- if folks want to split it 5 DC + 5 VCS then that will come out in the results. I'll add a voting table above -- feel free to correct it if you think another format is more appropriate. --Darkcanuck 04:33, 1 August 2009 (UTC)

The execution speed comparisons would be nice. I'm ready to vote as soon as others are ready. Also, here's a little analysis that may be relevant. The following is how the different bots rank in average scores in the above table: DKI DIA SHA66 DGT CHK PHX SHA83 CC YP PEAR CYA PMX HYD WM HOR CUN KOM RDC. I feel it makes sense to choose from a variety of areas of that list. Also, the following pairs of bots have the exact same ranking orders against the test guns: CHK/CC, DIA/SHA66, HYD/YP, HOR/CUN, WM/RDC. Some of those being the same might be flukes, but I think it does indicate that the weaknesses for guns to take advantage of may overlap in those pairs. --Rednaxela 15:28, 31 July 2009 (UTC)
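The "same ranking order" comparison can be sketched roughly as follows. This is only an illustration in Python, using five of the columns from the reference table at the top of the page; the helper name `gun_ranking` is made up for this sketch, and RDC is left out because some of its scores are from incomplete runs.

```python
from itertools import combinations

# Test guns, in the row order of the reference table at the top of the page.
guns = ["Dookious", "Komarious", "DrussGT", "Diamond", "SaphireEdge", "Gaff"]

# Score of each gun (same order as `guns`) against a few reference bots.
scores = {
    "HOR": [86.60, 74.33, 83.59, 74.59, 87.85, 87.62],
    "CHK": [71.91, 58.03, 66.78, 71.25, 68.82, 71.37],
    "PHX": [68.66, 59.47, 71.68, 68.02, 71.41, 69.52],
    "CUN": [90.26, 76.10, 87.94, 77.52, 91.00, 90.94],
    "CC": [75.21, 56.31, 70.34, 74.40, 73.55, 74.59],
}


def gun_ranking(column):
    """Return the gun names ordered from best to worst score against one bot."""
    return tuple(gun for _, gun in sorted(zip(column, guns), reverse=True))


rankings = {bot: gun_ranking(column) for bot, column in scores.items()}

# Reference bots against which the guns finish in exactly the same order.
matching_pairs = [
    (a, b) for a, b in combinations(rankings, 2) if rankings[a] == rankings[b]
]
print(matching_pairs)
```

On this subset it should pick out HOR/CUN and CHK/CC and leave PHX unpaired. With only six guns an identical order can still be a coincidence, so it is a hint of overlap rather than proof.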

Interesting observations, and thanks for the nifty sortable table. =) While I'm quite proud of how well Diamond does, it is eerily similar to Shadow's scores. If I decided to pick only one, it would probably be Shadow, but I am not sure how I'd vote yet. I think Chalk / CC is probably a fluke, being DC vs VCS and fairly different algorithms. Another interesting observation is that among the DC surfers, Diamond and Komarious are the worst guns against every reference bot except Chalk, while the scores against VCS surfers are much more varied. --Voidious 16:15, 31 July 2009 (UTC)

Ok, I've run 5 seasons of Komarious against the reference bots and averaged the times. I then normalized them against the fastest time, which was CunobelinDC at 20.6 seconds on my system. So he's 1.0, 2.0 would be twice as much time, etc. It's worth noting that RougeDC also gets stopped by RoboResearch fairly regularly, though it's not the slowest bot... --Voidious 02:13, 1 August 2009 (UTC)
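The normalization itself is just a division by the fastest time. A minimal Python sketch, assuming a table of averaged seconds per season: only the 20.6 s for CunobelinDC comes from the post above, and the other two entries are hypothetical placeholders to show the arithmetic.

```python
# Averaged seconds per season against Komarious.
# Only CunobelinDC's 20.6 s is from the post above; BotA and BotB are
# hypothetical placeholders, not measured values.
raw_seconds = {
    "CunobelinDC": 20.6,
    "BotA (hypothetical)": 41.2,
    "BotB (hypothetical)": 30.9,
}

# The fastest bot becomes 1.0; everything else is a multiple of that.
fastest = min(raw_seconds.values())
normalized = {bot: round(t / fastest, 2) for bot, t in raw_seconds.items()}

for bot, factor in normalized.items():
    print(f"{bot}: {factor}")
```

So a bot that takes twice as long as CunobelinDC shows up as 2.0, as described.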

As a side note, the bots I voted for happen to add up to exactly 20.00 in normalized CPU :) --Rednaxela 06:44, 1 August 2009 (UTC)

Voting

User SHA66 SHA83 RDC HOR CHK PHX CYA DGT CUN WM HYD CC PMX PEAR DIA DKI KOM YP Total Votes
Darkcanuck X X X X X X X X X X 10
Rednaxela X X X X X X X X X X 10
Nat X X X X X X X X X X 10
Skilgannon X X X X X X X X X X 10
Voidious X X X X X X X X X X 10
Navajo X X X X X X X X X X 10
Zyx X X X X X X X X X X 10

I think we have nice ASCII graphics here =) » Nat | Talk » 07:03, 1 August 2009 (UTC)

I think we already have all the more active robocoders. Shall we wait for these?

» Nat | Talk » 17:37, 1 August 2009 (UTC)

I don't have an opinion on this, but thanks for thinking of me. --Positive 20:19, 1 August 2009 (UTC)
Same as Positive--CrazyBassoonist 21:18, 1 August 2009 (UTC)

Also zyx, he and ABC are the only ones who have talked on this page but haven't voted. I don't mind waiting a bit, but if you guys are eager we can cut it short... 7-8 of them are pretty much locked already. --Voidious 19:29, 1 August 2009 (UTC)

Ouch! I forgot Zyx! Currently there are 7 bots that have 6, 1 bot that has 5, 2 bots that have 4 and 2 bots that have 3 » Nat | Talk » 00:48, 2 August 2009 (UTC)
I count 6 with 6, 1 with 5, 1 with 4, and 2 with 3. =) --Voidious 01:41, 2 August 2009 (UTC)
Yeah, sorry. I was about to add the total score when I felt it wasn't complete yet, so I didn't save it. The next morning I posted that, trusting my (wrong) memory =( » Nat | Talk » 02:10, 2 August 2009 (UTC)

Sorry for the late voting. --zyx 16:04, 2 August 2009 (UTC)

Hmm, the current voting results aren't ambiguous: they give 10 bots without any tie for the 10th and 11th spots. The total normalized CPU time of the currently winning set is 21.24, very close to the total average. Assuming we're just waiting for ABC's vote, it looks like SHA66, CHK, DGT, CC, DIA, DKI and YP are completely locked in. Unless HOR gets another vote, the other bots will be sure to be PHX, CUN, and PEAR. If HOR gets another vote and at least one of PHX/CUN/PEAR does not, then there will be a tie to break (or we go with fewer than 10 bots?). --Rednaxela 16:25, 2 August 2009 (UTC)
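The CPU total can be re-added from the "CPU time" row of the reference table at the top of the page. A small Python sketch, assuming those normalized values and the ten bots named above; the variable names are only illustrative.

```python
# Normalized CPU time per reference bot, from the "CPU time" row of the
# reference table at the top of the page (CunobelinDC = 1.0).
cpu = {
    "SHA66": 1.48, "SHA83": 1.22, "RDC": 3.85, "HOR": 1.58, "CHK": 1.96,
    "PHX": 2.75, "CYA": 1.51, "DGT": 1.84, "CUN": 1.0, "WM": 4.49,
    "HYD": 1.66, "CC": 1.51, "PMX": 1.78, "PEAR": 1.84, "DIA": 1.78,
    "DKI": 4.19, "KOM": 1.08, "YP": 2.89,
}

# The seven locked-in bots plus PHX, CUN and PEAR.
winning_set = ["SHA66", "CHK", "DGT", "CC", "DIA", "DKI", "YP",
               "PHX", "CUN", "PEAR"]

winning_total = round(sum(cpu[bot] for bot in winning_set), 2)

# What ten bots of average speed would add up to, for comparison.
average_total = round(len(winning_set) * sum(cpu.values()) / len(cpu), 2)

print(winning_total, average_total)
```

The first number should match the 21.24 quoted above, and the second is what ten average-speed bots from the full list of 18 would cost.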

(edit conflict: basically repeating what Rednaxela said) Dunno if you guys want to wait for ABC, but I don't think he'll return shortly. If ABC votes, the only change that can happen is to have Horizon instead of one of Phoenix, CunDC or Pear. Current winners:

  1. Shadow 3.66d
  2. Chalk 2.5.Al
  3. DrussGT 1.3.8f
  4. CassiusClay 2pi.08
  5. Dookious 1.573c
  6. YersiniaPestis 1.3.7
  7. Diamond 1.22
  8. Phoenix 1.025
  9. CunobelinDC 0.1
  10. Pear 0.62.1

» Nat | Talk » 16:37, 2 August 2009 (UTC)

I don't believe there is any reason to hurry. ABC will probably vote by next week, so unless anyone is against it I think we should wait. --Navajo 19:04, 2 August 2009 (UTC)

Sorry for not voting, I'm on holidays for 2 weeks. I trust you guys will pick the best bots for this challenge. If there are indecisions, I vote for the fastest reference bots. --ABC 08:59, 3 August 2009 (UTC)

Definitely no changes, Phoenix is the 13th fastest. So we can finalize this vote now, can't we? » Nat | Talk » 09:18, 3 August 2009 (UTC)