Difference between revisions of "Anti-Surfer Challenge/Pre-Chat"
(comment on scores, SaphireEdge) |
- Well, I always think that Druss is faster than Shadow. » Nat | Talk » 13:35, 24 July 2009 (UTC)
Revision as of 16:52, 24 July 2009
Pre-Chat
While it's not that useful for increasing RoboRumble performance, a lot of us enjoy trying to improve our Anti-Surfer Targeting. It's a big part of claiming the PL throne. We have surfers in the TC2K6 and TC2K7, but this would be an opportunity to update and expand the test bed. My initial thoughts:
- I actually think the TC2K7 surfers are all good candidates. I'd definitely vote to keep Shadow (classic, awesome DC-surfer) and CassiusClay (classic, awesome VCS-surfer). Hydra (related to WaveSerpent) is still the top DC surfer.
- In the present RoboRumble, 6 of the top 10 and 11-12 of the top 20 (not sure about RougeDC) are using DC surfing, so we should have more of those.
- How many reference bots? 10? 12? 15? I say start at 10 and if there are specific bots we really want, consider increasing it.
- Classic TC rules OK?
- 35 rounds or 500 rounds? I like 500, myself, but 35 seems logical.
We could nominate, vote, then maybe discuss and have follow-up votes for tweaking the final set, to make sure we get the desired mix.
Anyone else interested? =)
--Voidious 15:16, 10 July 2009 (UTC)
I prefer 35 rounds. 10 bots should be enough. Surfers are slow, and I don't want to let my computer run for 3 days to run 15 seasons of this challenge =) We should have 5 DC surfers and 5 VCS surfers. It would be good if we could have the not-yet-existing mixed surfer. I think we can ask Skilgannon to add DC to DrussGT. I think we should have Engineer as an NN surfer too. » Nat | Talk » 15:30, 10 July 2009 (UTC)
- Engineer is interesting, but it's not open source, and I'm not sure it would be a good benchmark because it's so unique. I'd personally rather have the real DrussGT than a DC version, but Wintermute seems a fine candidate. While I generally aim for fast-executing reference bots, in this case I'd vote for just getting the right reference bots and accepting the CPU cost. Benchmarking Anti-Surfer Targeting takes time, that's life. =) --Voidious 15:46, 10 July 2009 (UTC)
- I nominate both the VCS and DC DrussGT, not just the DC DrussGT; it should yield an interesting yet slow wave surfing movement that could not be paired with any gun =) Actually I'm fine with the current-generation wave surfers; I just don't want to have 10 reference robots that are as slow as Cigaret. » Nat | Talk » 17:25, 10 July 2009 (UTC)
As a testbed I propose:
- DC:
- YersiniaPestis (very hard to hit, has multiple weighting schemes, flattener, also quite slow, I'm not sure about movement alone though)
- Shadow (hard to hit, has flattener, runs fast)
- Hydra (possibly vulnerable to antisurfer guns?)
- Horizon or RougeDC (intermediate DC implementation)
- CunobelinDC (very simple DC implementation)
- VCS:
- DrussGT (we need a goto bot, but they aren't very common - otherwise possibly Silversurfer due to execution speed reasons)
- Phoenix (sorry Voidious, but Dookious's movement is really slow =) )
- CassiusClay (classic)
- Garm --------?? does Garm support movement only?
- Komarious (essentially CunobelinDC's VCS counterpart)
Perhaps cull out Hydra and Garm, but I think that covers things fairly well. Feedback? --Skilgannon 16:38, 10 July 2009 (UTC)
- I prefer Horizon over RougeDC if we have Garm; otherwise I'd prefer RougeDC, since I think we need at least one wave surfer with precise intersection, though I'm not sure if both Garm and RougeDC have it in their movement. But perhaps we can have Wintermute instead of Garm, unless we don't want to have 3 robots from Skilgannon. I myself think Phoenix is slower than Dookious, but I only tested with the original version, not the movement-only one. I wonder if we can have a KDTree version of CunobelinDC? I think it will at least be faster, but I'm not sure if it will affect the score. » Nat | Talk » 17:25, 10 July 2009 (UTC)
(Edit conflict) Most of those bots I agree with 100%, though I still say we should vote in the end. Some responses:
- I like RougeDC over Horizon because it's so much stronger in PL.
- I always thought Komarious would be a perfect reference bot, but at this point I think she's just too weak. I like CunobelinDC since it's strong and probably a super-fast DC surfer, but my vote would be for that to be the weakest reference bot. Dooki's Main Gun way outperforms his Anti-Surfer Gun against Komarious.
- I know Dookious is slow, and in the past I've always agreed that he therefore shouldn't be a reference bot. This time I'm not sure. The PL is dominated by 4-5 bots and he's one of them. I'll have to benchmark the speed, but I thought it was in the same ballpark as Shadow (~half as fast as CC).
- We'd need a movement-only Phoenix, but I think David would provide that. (I may even have one already, I'll check.)
- Diamond could also be worth considering, he's actually ahead of Hydra in PL (if barely =)).
- Looks like Garm is not open source, dang. Maybe PulsarMax? I think he can do movement-only from the .properties file, we could repackage a TC version.
- I personally think we should really push for the strongest bots possible in this. CPU-taxing or not, those are the movements you need to hit to have a strong Anti-Surfer gun. I'd love to have 9 of the strongest possible movements, then CunobelinDC as the only intermediate one. Just my 2 cents on that.
--Voidious 17:43, 10 July 2009 (UTC)
Please note that the current Robocode 1.7.3 isn't extracting the .properties file due to a bug. Just FYI.
I wonder if we can have both Dookious and Phoenix, since we don't have Komarious now (I just figured out how badly she does in PL =)). I fully agree with Voidious' $0.02. We can have YersiniaPestis, Shadow, RougeDC, Hydra (I'd prefer Gauss here, but I'm not sure if it's from his gun or movement) and CunobelinDC for DC, and DrussGT, Dookious, Phoenix, CassiusClay and another bot for VCS. I'd say it would be PulsarMax though, because Ascendant and WaveSerpent are too easy to hit. I'd say we are lacking good VCS surfer bots right now ;-) » Nat | Talk » 18:44, 10 July 2009 (UTC)
I think this is turning into more of a PL-challenge. If we want anti-surfer then we need to include more intermediate/basic surfing bots, possibly even a BasicSurfer, but bots that cover all lengths of the surfing spectrum. If we want a PL challenge we should probably just take the top 10 bots from the PL (maybe excluding Ascendant because his score is due to gun, not movement?).--Skilgannon 22:49, 10 July 2009 (UTC)
Hmm... What you say makes sense, of course. But I almost don't think of a bot like Komarious as requiring "Anti-Surfer Targeting" to hit -- normal (ie, non-decaying) learning guns hit her just fine. But maybe I'm being narrow-minded, and a truly effective anti-surfer technique should work well against her, too. Interested to hear some others weigh in.
On a different note, holy crap, Dooki's movement sure is slow. Here's the times on my system in 500 rounds against Komarious's gun:
- Dookious 1.541 - 11:55
- CassiusClay 2pi - 4:22
- Shadow 3.66d - 5:50
Maybe we should leave him out after all...
--Voidious 23:21, 10 July 2009 (UTC)
Some interesting bots proposed above. Personally I don't care so much about execution speed, as long as the reference set is limited to 10 bots and battles last only 35 rounds. Or include some more basic (faster) surfers and expand the group to 15? I'd suggest increasing the number of seasons to 30-50, since even at 15 there's at least 0.5 percent variability. I'd also like to see Dookious in the set as that's a top surfer. The others should be as up-to-date as possible, so Shadow 3.83(c?) rather than 3.66 for example. Looking forward to it! --Darkcanuck 02:33, 11 July 2009 (UTC)
Is there any difference between Shadow 3.83 and Shadow 3.83c? Anyway, I think we should have Gauss in as a second GT surfer. I still want Dookious in; it learns faster than Phoenix. I usually beat Phoenix and Shadow, but not Dookious, in the first round. And I think Dookious is probably the only surfer that uses non-firing waves.
Do you think we should disable the flattener for the reference bots? » Nat | Talk » 04:31, 11 July 2009 (UTC)
I think the flattener should be disabled, and that testing against flatteners should be a separate challenge or something. I say this because I consider the problem of hitting a flattener to be very different from the problem of hitting a normal surfer. --Rednaxela 17:58, 11 July 2009 (UTC)
While I agree it's a different problem (or sub-problem) in some respects, I'd vote to keep them on because I think it's an important part of hitting an advanced surfer. --Voidious 19:22, 11 July 2009 (UTC)
If we have it on, I'd suggest either always on, or always off, for any particular bot. Or perhaps one copy of the bot with it off and one copy with it always on. The reason I say this is that it will make the results more consistent. Otherwise, you can have the scenario where an improvement that allows you to hit a surfer better may make it decide to turn its flattener on and suddenly kill your score. While surfers ideally should always have the flattener on if it would help them, most have a bias towards disabling it in order to avoid lucky shots by weak bots triggering it. With the sudden jumps possible when bots enable flatteners at funny times, it could make scores too difficult to interpret when your gun passes the boundary where the enemy decides to use its flattener on it. --Rednaxela 20:41, 11 July 2009 (UTC)
I agree it will create that possibility, I just can't get past the fact that we'd basically be ignoring a real problem faced in trying to target surfers. Optimizing against flattener or non-flattener Dookious (or even both, separately) may not increase your accuracy against the real Dookious. There is also the fact that we could not do this for Shadow or PulsarMax. But I'm cool with whatever is decided, maybe we can vote on this, too, if there's a lot of disagreement. --Voidious 21:21, 11 July 2009 (UTC)
(edit conflict) I'm a big advocate of letting the bots run as they would normally. To get good targeting performance against a surfer, you really need to consider their whole behaviour, which may include flatteners and the decision points used for turning them on and off. --Darkcanuck 21:25, 11 July 2009 (UTC)
The problem with the bots running as-is is that it may cause people to over-optimize towards making the gun just inaccurate enough not to trigger a bot's flattener. That kind of optimization will only really help against that very specific testbed bot, which leads to a misleading situation that I consider a very, very bad thing. I really think the optimal solution would be, for every bot with a flattener, to include separate battles for 1) natural behavior, 2) flattener always off, and 3) flattener always on. Having all three numbers allows one to consider their whole behavior, while making the resulting scores more clearly indicate what's happening. Of course, this does increase how much time tests would take, but I'd consider it worthwhile. --Rednaxela 23:46, 11 July 2009 (UTC)
- While I still think we should just use the natural versions, I do recognize that you're making some important points here, btw. Some thoughts in response:
- If you're optimizing against one bot, you may indeed find your best score (for now) is to hit less and not trigger the flattener. But I don't see how you could do that against the whole test bed at once, so I don't think you could optimize your overall score that way with this challenge. (Even if you could, though, I still see that as the reality of targeting these bots.)
- I quite agree, however, that one could gain valuable information from the flattener-only and non-flattener versions of reference bots. It would be cool to make them available for private testing even if we use the natural versions in the challenge.
- Maybe we could even whip up some common code to implement in the open source bots that would log a few details on flattener use? Like when it was enabled/disabled, hit percentages for each, etc. It wouldn't be hard. Actually that seems like a cool idea for TC reference bots, in general, to implement a common logging API. After the TC runs, you could look at the .log for each bot to glean some info.
- --Voidious 17:11, 12 July 2009 (UTC)
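The common logging idea above could look roughly like the following. This is only a minimal sketch under assumptions: the class name `FlattenerLog` and all of its methods are hypothetical and not part of any existing reference bot or of the Robocode API. A bot would pass in its own round and turn counters (for example the values Robocode reports through `getRoundNum()` and `getTime()`) and write the report to its .log file however it prefers.

```java
import java.util.Locale;

/** Hypothetical shared logger for TC reference bots; not an existing API. */
final class FlattenerLog {
    // Index 0 = flattener off, index 1 = flattener on.
    private final int[] shots = new int[2];
    private final int[] hits = new int[2];
    private final StringBuilder events = new StringBuilder();
    private boolean enabled;

    /** Records a flattener state change together with the round and turn it happened on. */
    void setEnabled(boolean on, int round, long turn) {
        if (on == enabled) {
            return; // no state change, nothing to log
        }
        enabled = on;
        events.append(String.format(Locale.ROOT,
                "round %d turn %d: flattener %s%n", round, turn, on ? "ON" : "OFF"));
    }

    /** Records one resolved enemy bullet under the current flattener state. */
    void recordEnemyShot(boolean hit) {
        int i = enabled ? 1 : 0;
        shots[i]++;
        if (hit) {
            hits[i]++;
        }
    }

    /** Fraction of enemy bullets that hit while the flattener was in the given state. */
    double hitRate(boolean flattenerOn) {
        int i = flattenerOn ? 1 : 0;
        return shots[i] == 0 ? 0.0 : (double) hits[i] / shots[i];
    }

    /** Event history followed by a per-state hit-rate summary. */
    String report() {
        return events + String.format(Locale.ROOT,
                "hit rate off: %.3f (%d shots)%nhit rate on: %.3f (%d shots)%n",
                hitRate(false), shots[0], hitRate(true), shots[1]);
    }
}
```

Keeping the hit counts split by flattener state is the point of the design: it would let a gun author see whether a score jump came from better aim or from the enemy switching modes.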
I believe this is the Anti-Surfer Challenge, not the Anti-Flattener Challenge. If we want to test the accuracy of an anti-surfer gun, we shouldn't leave the flattener on. While I agree that normal advanced surfers will trigger their flatteners at some point, I think it should then be called 'Anti-Adaptive Movement Challenge' rather than 'Anti-Surfer Challenge'. » Nat | Talk » 02:53, 12 July 2009 (UTC)
- Since when is using a flattener not "Wave Surfing"? You're still surfing waves, just using a different formula for deciding what's a dangerous point on the waves: where you've gone before instead of (or in addition to) where you've been hit. --Voidious 03:34, 12 July 2009 (UTC)
- I know, they still surf waves. But it requires a different targeting technique to aim at them. I don't really think that a surfing flattener and a random-based flattener differ much. But the current generation of Anti-Surfer targeting is designed for hit-surfing, not visit-surfing, and most surfers start with hit-surfing, so I use the term Wave Surfing for just hit-surfing. » Nat | Talk » 04:09, 12 July 2009 (UTC)
- The trouble is, if you can't hit a surfer with the flattener enabled, it's not going to help your PL score. I suggest just leaving the bots how they are, otherwise we'll probably never get around to doing the challenge =) --Skilgannon 10:32, 12 July 2009 (UTC)
Is it time to move onto actual voting? Should I do tests with each nominated bot (timing 35 rounds, scores against a few guns, etc.) before actual voting? Do we need to discuss flattener vs no-flattener some more first? (I think having flattener-only/no-flattener versions of reference bots available for private testing could be really useful, if anyone missed that conversation branch above.) I confirmed that PulsarMax supports challenge mode from the .properties file, btw.
So here's an updated list of bots we could vote on, including some new additions from me. If anyone wants to nominate any other bots, feel free to add them below. I'm not sure we're all in agreement about the range of difficulties we want, but I believe that will work itself out with the voting.
DC | VCS | ||||
|
|
- Leaving out Gauss and WaveSerpent because I think they are extremely similar to YersiniaPestis and Hydra, respectively.
- I think Pear is very slow, but if we are short on VCS bots, I know it is open source and a pretty hard-to-hit movement.
- YersiniaPestis is really slow over 500 rounds (nearly an hour on my system), but it slows down as rounds go on, so I'll have to test 35 rounds specifically.
--Voidious 14:37, 13 July 2009 (UTC)
Nice table =) I think we should have Gauss as a second GT surfer. Or are there any other GT surfers in the list? I think we should wait for a day or a couple of days before we start voting. » Nat | Talk » 16:17, 13 July 2009 (UTC)
- I don't think Gauss is Go-To, the Gauss/VersionHistory says True Surfing. SilverSurfer is the only other Go-To bot I know of. I'm OK with adding SS to the nominated bots if you want (though I probably won't vote for him). --Voidious 18:44, 13 July 2009 (UTC)
Wow, there's been lot of activity here. About some things talked here:
- Gauss is very similar to YersiniaPestis; it has some new features but performs worse anyway. I called them both True Surfing because at the end of every tick the movement comes to a decision of either going CW or CCW around the next wave, rather than turning directly to some point.
- About the flattener issue, I think the most realistic results come from the bots being in their normal behavior. On the other hand YersiniaPestis' flattener is not an on/off flattener, it has some weight, forcing it to 0 can simulate a flattener always off, but the normal and always on are not separable.
--zyx 20:27, 13 July 2009 (UTC)
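To make the hit-surfing versus flattener distinction concrete, here is a minimal sketch of a weighted danger function and a True Surfing style CW/CCW choice. It is an illustration only, not taken from YersiniaPestis or any other bot: the class `SurfDanger`, the bin arrays, and the way the weight is applied are all assumptions. It does show why forcing the weight to 0 behaves like a flattener that is always off.

```java
/** Hypothetical bin-based danger estimate; not taken from any bot's source. */
final class SurfDanger {
    /**
     * Danger of one guess-factor bin: the normalized hit history (where we were hit)
     * plus a weighted share of the normalized visit history (where we have been).
     * A flattenerWeight of 0 reduces this to pure hit-surfing.
     */
    static double danger(double[] hitBins, double[] visitBins, int bin, double flattenerWeight) {
        return normalized(hitBins, bin) + flattenerWeight * normalized(visitBins, bin);
    }

    /** Share of the total weight that falls into the given bin (0 if the array is empty of data). */
    private static double normalized(double[] bins, int bin) {
        double total = 0.0;
        for (double b : bins) {
            total += b;
        }
        return total == 0.0 ? 0.0 : bins[bin] / total;
    }

    /**
     * Per-tick decision: +1 for clockwise, -1 for counter-clockwise, picking whichever
     * direction is predicted to reach the less dangerous bin (ties go clockwise).
     */
    static int chooseDirection(double[] hitBins, double[] visitBins,
                               int cwBin, int ccwBin, double flattenerWeight) {
        double cw = danger(hitBins, visitBins, cwBin, flattenerWeight);
        double ccw = danger(hitBins, visitBins, ccwBin, flattenerWeight);
        return cw <= ccw ? 1 : -1;
    }
}
```

With a non-zero weight, a bin that was never hit but is visited often becomes dangerous, which is exactly the behavior change a gun has to cope with once the flattener contributes.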
Oh, I'm confused. The GT Gauss is just in zyx's mind (please reconsider your user page, zyx). I don't think we need SilverSurfer; we need the newer surfers more =) For the flattener issue, I think the current solution is 'leave it alone', correct? » Nat | Talk » 12:26, 14 July 2009 (UTC)
I got a movement-only Phoenix from David - he called it 1.025TC because he couldn't guarantee it was from any particular version, but it's a recent Phoenix. I was next going to run some TC tests against the nominated bots, to see some scores and execution speed before voting, but my CPU time has been occupied with the User:Voidious/Robocode Version Tests instead. --Voidious 15:16, 21 July 2009 (UTC)
I believe I can help you run these tests; I just can't provide any reliable data on execution speed because my computer somehow does not have stable performance. All I need to know is which tests you want and which Robocode version to use. --Navajo 17:40, 21 July 2009 (UTC)
- I hate to make the new guy run tests, I'm sure you have a lot of stuff to explore. =) But I'll post some of the TC-mode bots when I get a chance, so anyone can run battles if they want. I definitely recommend RoboResearch for running batch battles, if you haven't checked it out yet. --Voidious 23:37, 21 July 2009 (UTC)
- I do have a lot of stuff to explore, but right now nothing that uses much CPU. I've finished my Robocode tests for now and will probably not have any new tests to run until next month. So far I've only used RoboLeague for this; all my attempts to run RoboResearch have failed. I don't know how to use SVN, my only computer with Linux is too slow, and I don't know how to run or compile Java on Windows. I only use Java for Robocode; for everything else I use C. --Navajo 00:05, 22 July 2009 (UTC)
- Cool, much appreciated. I posted a .zip with .class files of RoboResearch + my Melee support at Talk:RoboResearch#Melee, here: void_roboresearch_melee_01.zip. That would save you the SVN and the compiling if you want to try it. --Voidious 14:40, 22 July 2009 (UTC)
- This link seems to be broken, but I've downloaded the zip file you made available in the Getting started instructions section of that page, and for the first time I've gotten RoboResearch running. Thank you, this was very helpful; that feature of running 2 threads at the same time is great. Now I only need the TC-mode bots to help you run the tests. By the way, will it be 35 rounds? --Navajo 21:40, 22 July 2009 (UTC)
- Yep, 35 rounds. 30-50 battles gives a decently accurate score for a given bot. Strange, that link works for me... dijitari.com resolves to 66.96.145.105 for me when I ping it, maybe it's a DNS issue? --Voidious 04:35, 23 July 2009 (UTC)
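For judging how many seasons are enough, one rough approach is to look at the standard error of the mean score, which shrinks with the square root of the number of seasons. The following is a minimal sketch under that assumption; `SeasonStats` is a hypothetical helper, not part of RoboResearch, and it simply treats each season's score against one reference bot as an independent sample.

```java
import java.util.Arrays;

/** Hypothetical helper: summarizes per-season scores against one reference bot. */
final class SeasonStats {
    /** Arithmetic mean of the season scores. */
    static double mean(double[] scores) {
        return Arrays.stream(scores).average().orElse(Double.NaN);
    }

    /** Sample standard deviation (n - 1 denominator); needs at least two seasons. */
    static double sampleStdDev(double[] scores) {
        double m = mean(scores);
        double sumSq = 0.0;
        for (double s : scores) {
            sumSq += (s - m) * (s - m);
        }
        return Math.sqrt(sumSq / (scores.length - 1));
    }

    /** Standard error of the mean: the spread to expect in the averaged score. */
    static double standardError(double[] scores) {
        return sampleStdDev(scores) / Math.sqrt(scores.length);
    }
}
```

Under this model, going from 15 to 50 seasons would cut the standard error to roughly 55% of its previous value (the square root of 15/50), which is in line with preferring 30-50 battles per bot.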
Ok, I put together a few of the challenge versions of the nominated bots. Sorry it's not all of them, but it's a start.
- Nominated reference bots:
- Sample guns:
I know 3 of those guns are mine. :-P But DrussGT is the top DC gun; Dookious is close to the top VCS gun (SaphireEdge isn't available for download, right?); Diamond is a good DC gun and easy for me to whip up; and Komarious I already had, and it's a good general purpose but bad anti-surfer gun. I think DrussGT and Komarious are probably the most important to test with among those four. Any other thoughts on pre-voting tests you guys want to see?
--Voidious 04:35, 23 July 2009 (UTC)
- Actually, since I don't have any Robocode Alpha tests lined up, I'll run some battles with DrussGT gun vs the above TC candidates overnight... --Voidious 05:19, 23 July 2009 (UTC)
Challenger | Sha | Chk | Phx | GT | CC | PMx | Dia | Dki | YP | Seasons |
DrussGT 1.3.8 | 74.92 | 66.78 | 71.68 | 71.91 | 70.34 | 78.88 | 65.14 | 58.35 | 12.59 | 50.0 |
I must've screwed up the YersiniaPestis TC version, I'll check it out... (Might be time to organize this page with sections.) --Voidious 14:31, 23 July 2009 (UTC)
- YersiniaPestis has a config file in its data directory; if the test property is set to mc it will be movement only, and if set to tc it is gun only with firepower 3. --zyx 14:55, 23 July 2009 (UTC)
- Yeah, I set it and watched it working in "MC" mode. I guess Robocode didn't package the data dir? Shadow and PulsarMax I did by hand with zip/unzip... doh. I'll fix tomorrow if someone else doesn't, will be afk this evening. --Voidious 15:02, 23 July 2009 (UTC)
- Is DrussGT not in the TC2K7? That's a nice score vs Shadow... --Darkcanuck 15:37, 23 July 2009 (UTC)
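As an illustration of the kind of config switch zyx describes, here is a minimal sketch of mapping a `test` property to a challenge mode. Only the property name `test` and the values `mc` and `tc` come from the description above; the class `ChallengeMode`, the assumption that the file uses the standard Java `key=value` properties format, and the fallback to normal behavior are all hypothetical and do not reflect YersiniaPestis' actual code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Properties;

/** Hypothetical challenge-mode switch driven by a "test" property. */
final class ChallengeMode {
    enum Mode { NORMAL, MOVEMENT_ONLY, GUN_ONLY }

    /** Maps the "test" property to a mode: mc = movement only, tc = gun only. */
    static Mode fromProperties(Properties props) {
        String value = props.getProperty("test", "").trim().toLowerCase(Locale.ROOT);
        switch (value) {
            case "mc":
                return Mode.MOVEMENT_ONLY;
            case "tc":
                return Mode.GUN_ONLY; // per the description, fires with power 3
            default:
                return Mode.NORMAL;   // assumption: anything else means normal play
        }
    }

    /** Parses config text in Java properties format (an assumed format). */
    static Mode parse(String configText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(configText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fromProperties(props);
    }
}
```

A switch like this only works if the packaged bot actually ships its data directory, which is the failure mode suspected above for the first YersiniaPestis TC build.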
Challenger | SHA | CHK | DIA | PHX | GT | CC | PMX | DKI | Seasons |
Komarious 1.842 | 66.81 | 58.03 | 55.97 | 59.47 | 51.99 | 56.31 | 73.78 | 53.87 | 50 |
RoboResearch didn't even run YersiniaPestis. About the times: every bot took between 1:00 and 1:30 minutes to run each battle, except for Dookious and Phoenix. Dookious took between 3:00 and 3:30 minutes and Phoenix between 5:00 and 6:00 minutes. I'm going to run battles with Diamond now. By the way, does RoboResearch save those results anywhere? If I have to stop running a test, can I start it again later from where I stopped? --Navajo 20:12, 23 July 2009 (UTC)
- Thanks much for running those battles! The results are stored in a SQL database, so yes, you can stop and start all you like. If you ever want to clear the database, use database_gui.sh (or in Windows, rename it .bat and replace : with ; within it, I think) and do "delete from robo_research.battles;" and "delete from robo_research.bots;" while the database is running (database_server.sh, which you may know already if you tried 2 threads). --Voidious 20:26, 23 July 2009 (UTC)
- (edit conflict) I uploaded a YersiniaPestis movement-only version and put the link above in Voidious' list. Try that one and let me know if you still have problems; I know that some versions of Robocode are picky when the properties file has a different robot version than the one in the file name. --zyx 20:28, 23 July 2009 (UTC)
Voidious: To reply to your note about SaphireEdge: no, it's not currently uploaded anywhere, but I could package it up and upload it if you're interested :) Also, while RougeDC was originally a proposed challenge bot, I'd be against it being a challenge bot due to really not being proud of its movement. Or then again... maybe its hilariously flawed design would give samples of surfer weaknesses for anti-surfer guns to take advantage of? Haha, I don't know. --Rednaxela 23:36, 23 July 2009 (UTC)
- Rednaxela, did you modify an old version of this page, or did you remove the posts and change the link intentionally? --zyx 23:43, 23 July 2009 (UTC)
- Stupid accident. Was looking at an old diff and hit edit. --Rednaxela 23:45, 23 July 2009 (UTC)
Challenger | SHA | CHK | DIA | YP | PHX | GT | CC | PMX | DKI | Seasons |
Dookious 1.572cMC | 71.31 | 71.91 | 64.64 | 78.01 | 68.66 | 68.12 | 75.21 | 83.57 | 52.53 | 50.0 |
Time | 1:38 | 2:27 | 2:24 | 3:27 | 6:03 | 2:18 | 2:02 | 2:20 | 4:00 |
YersiniaPestis ran perfectly. Those times are from the battles of the last 2 seasons. --Navajo 12:51, 24 July 2009 (UTC)
Phoenix is slower than Dookious. =( Why don't we use Firebird instead? (just joking =)) Shadow's the fastest??? I can't believe it! » Nat | Talk » 13:33, 24 July 2009 (UTC)
Doesn't surprise me at all that Shadow is the fastest. Though I never timed it, Shadow had always just seemed very fast in my experience. --Rednaxela 13:20, 24 July 2009 (UTC)
Thanks again for the test results, Navajo! I'll try to put together TC versions of the other nominated bots today and run some more of the missing battles. Any other guns or tests you guys want to see?
@Rednaxela: Sure, I'd love to see its scores! But if you don't want SaphireEdge in the wild yet, I think Dooki's gun is an OK substitute for "super strong VCS gun". =) I figured having the very best guns plus a couple weaker guns (like Komarious) would let us see which bots are hardest to hit, and also which bots require some anti-surfer elements to hit better. For example, even if the top guns hit GT and CC well, Komarious' low scores indicate you need some anti-surfer element to hit them. But for PulsarMax, even Komarious hits him pretty well, so maybe he's not a good one.
--Voidious 15:52, 24 July 2009 (UTC)
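The comparison described above can be sketched as a per-bot gap between a top gun and a general-purpose gun. This is just one rough way to read the tables, not an agreed metric: the class `GunGap` is hypothetical, and the scores are copied from the DrussGT 1.3.8 and Komarious 1.842 rows posted earlier (YersiniaPestis is left out because its DrussGT result came from a broken TC package).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical comparison of two guns' TC scores against the same reference bots. */
final class GunGap {
    // Column abbreviations as used in the result tables above.
    static final String[] BOTS = {"Sha", "Chk", "Phx", "GT", "CC", "PMx", "Dia", "Dki"};
    // DrussGT 1.3.8 gun scores.
    static final double[] DRUSSGT = {74.92, 66.78, 71.68, 71.91, 70.34, 78.88, 65.14, 58.35};
    // Komarious 1.842 gun scores, reordered to match BOTS.
    static final double[] KOMARIOUS = {66.81, 58.03, 59.47, 51.99, 56.31, 73.78, 55.97, 53.87};

    /** Strong-gun score minus general-purpose-gun score for each bot, in table order. */
    static Map<String, Double> gaps(String[] bots, double[] strongGun, double[] generalGun) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (int i = 0; i < bots.length; i++) {
            out.put(bots[i], strongGun[i] - generalGun[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        // A large gap suggests the bot rewards anti-surfer elements; a small gap
        // suggests even a general-purpose gun already hits it about as well.
        gaps(BOTS, DRUSSGT, KOMARIOUS).forEach(
                (bot, gap) -> System.out.printf("%-4s %6.2f%n", bot, gap));
    }
}
```

On these numbers the gap is largest for GT and CC and smallest for PMx and Dki, which matches the reading that PulsarMax may not be a discriminating reference bot. The small Dki gap comes with low scores from both guns, so it would need a separate look.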