Difference between revisions of "Talk:Shadow"

Latest revision as of 06:40, 10 September 2009

ABC, do you mind if I change the information on the bot's page to a more up-to-date version? Such as 'Wave Surfing' for one-on-one movement. » Nat | Talk » 08:19, 25 July 2009 (UTC)

Be my guest, this text hasn't been updated since forever. --ABC 10:06, 25 July 2009 (UTC)

So can I update it or not? » Nat | Talk » 11:06, 25 July 2009 (UTC)

Yes, you can. --ABC 11:34, 25 July 2009 (UTC)

Thanks, it looks great! It would take me forever to learn all this mediawiki stuff. And the "A long time ago..." release date is really funny, and true. :) --ABC 13:08, 25 July 2009 (UTC)

No problem =) I don't think you can remember your release date, and I'm not sure if you still have abc.Shadow_1.0.jar on your hard drive =) » Nat | Talk » 14:00, 25 July 2009 (UTC)

v3.84

I just downloaded your new version, and the movement seems quite different from 3.83. I'm curious: did you add the bullet detector you said you were thinking about? --Positive 23:26, 24 July 2009 (UTC)

Yes I did. It took me a while to tweak it to perform as well as the "old" movement in my tests, but it was great fun nevertheless. :) --ABC 23:37, 24 July 2009 (UTC)

Congrats man, you really make it look easy to take the melee throne back, almost a full 1% of APS above 2nd place, with much more sporadic updates compared to Diamond's high activity. --zyx 19:39, 3 September 2009 (UTC)

He never lost the melee throne, really... Shadow 3.83 is still the strongest melee bot, and this latest jump involved reverting to that movement. [1] Not that I disagree. Positive's quick ascension of the melee rankings has also made it look way easier than I've found it to be. --Voidious 19:45, 3 September 2009 (UTC)
What Voidious said. It's not been easy improving what I had in 3.83, quite the contrary, everything I try performs worse. 3.84d is practically a copy of 3.83... --ABC 21:50, 3 September 2009 (UTC)
All that is true, but I guess that both Positive and Voidious use Shadow 3.83 as part of their testing, so it wouldn't be crazy to think that both of them perform better, against 3.83 and in general, than the versions that 3.83 actually fought in the rumble before the 3.84 series. So it is actually hard to compare, even with the common pairings APS (where 3.83 is winning). I think that kind of comparison is quite hard in Melee, since small changes in the rumble can affect many scores. I couldn't say for sure whether the current version is better or worse than 3.83, and 3.83 would probably be the king (not counting 3.84X), but I still think ABC makes it look easier than it actually is. --zyx 22:01, 3 September 2009 (UTC)
Very important point. The fact that melee rankings are highly sensitive to the crowd around when the bot was released, really really bugs me. Now... this issue would be very much improved if ranking was done with a 'Condorcet' system... :) --Rednaxela 22:32, 3 September 2009 (UTC)
I have several thoughts on this point. First and foremost, there is obviously more randomness and volatility in the melee rankings than in the 1v1 rankings, and if we could come up with a fair way to reduce that with a better system, I'm all for it. I'm not convinced a Condorcet system is what we want, since we have traditionally defined success in terms of score %, not just winning and losing, but that's a discussion in and of itself. =)
On the other hand... Given how much randomness there clearly is in the battle selection, I've found the MeleeRumble to be remarkably consistent and stable. I've had a few oddball versions here and there, but I've had that in 1v1, too. I hope that you'll find the same to be true with your new melee bot. Also, I really feel that the very essence of Melee is putting your bot in a really diverse set of circumstances and seeing how well he performs. You need to crush 9 sample bots as efficiently as possible, go 1v1 against Shadow, and everything in between. This is necessarily dependent on the crowd, and while the resulting randomness may make for a slightly more volatile ranking, it is also a huge component of Melee's charm. I really don't mean this to sound rude, but my feeling is: if you don't want to be affected by the other bots, that's what 1v1 is for.
--Voidious 14:32, 4 September 2009 (UTC)
Firstly Voidious, a Condorcet system doesn't require it to just be about 'winning'/'losing'. Consider this: Originally, ELO as used in Chess was just defined in terms of winning/losing but was easily adapted to floating point scores. In exactly the same way, it's actually very trivial to plug floating point scores into a Condorcet system and get floating point results out. For that reason, I disagree with that argument of Condorcet being unsuitable, but I do want to locally test this with real rumble data (that's part of why I want pairing queries) before I really heavily put my support behind it.
I'd agree that melee is remarkably stable/consistent when you release bots close in time to each other, so it is not a problem for judging frequent incremental releases of a bot like Diamond. However, my concern lies with when the field has changed very significantly between releases. Being affected by other bots is just fine, but having a bot get an elevated (or deflated) score compared to what it would get if it were re-released exactly the same has no advantage I can see.
--Rednaxela 15:19, 4 September 2009 (UTC)
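As a rough illustration of the point above about plugging floating-point scores into a Condorcet system, here is a minimal sketch using the Schulze method, which is one member of the Condorcet family. It is not necessarily the variant Rednaxela has in mind. Using the average pairwise score % directly as the strength of each pairwise link is an assumption made for this sketch, and the bot names and numbers are hypothetical.

```python
def schulze_ranking(names, d):
    """Rank competitors with the Schulze method.

    d[i][j] is how strongly i did against j. Here that is assumed to be
    i's average score % in the pairing with j (a float), not a vote count.
    """
    n = len(names)
    # Direct links: only the side that won the pairing gets a non-zero strength.
    p = [[d[i][j] if i != j and d[i][j] > d[j][i] else 0.0 for j in range(n)]
         for i in range(n)]
    # Strongest-path computation (a Floyd-Warshall variant): a path is only
    # as strong as its weakest link, and we keep the strongest path found.
    for k in range(n):
        for i in range(n):
            if i == k:
                continue
            for j in range(n):
                if j == i or j == k:
                    continue
                p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))
    # i is ranked above j when its strongest path to j beats the reverse path.
    wins = [sum(1 for j in range(n) if j != i and p[i][j] > p[j][i])
            for i in range(n)]
    order = sorted(range(n), key=lambda i: -wins[i])
    return [names[i] for i in order]


# Hypothetical pairwise score percentages forming a cycle:
# BotA beats BotB 60/40, BotB beats BotC 55/45, BotC beats BotA 52/48.
names = ["BotA", "BotB", "BotC"]
scores = [
    [0.0, 60.0, 48.0],
    [40.0, 0.0, 55.0],
    [52.0, 45.0, 0.0],
]
print(schulze_ranking(names, scores))  # ['BotA', 'BotB', 'BotC']
```

The cycle is resolved by discounting its weakest link (BotC's narrow 52/48 result against BotA), which is the kind of behaviour that distinguishes a Condorcet-style ranking from a plain average of scores.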
I haven't forgotten your pairing query request -- so far I've only done the easy ones. But you could also do it the long way: pull the participants list and then query each participant individually. --Darkcanuck 15:32, 4 September 2009 (UTC)
Ah, cool. From my limited understanding of the Condorcet stuff, it didn't occur to me it could be adapted to use score percentages, but I'm starting to trust you as the resident brainiac around here. =)
So one of the main problems, as I see it, is (e.g.) when I post a new version of Diamond, Shadow and everyone else will also get lots of battles, but all of their battles will include Diamond. So however specialized Diamond is against one of them will affect all their other scores. I'm not sure how any scoring magic could change this..? What obviously could help is to make sure everyone gets random battles that don't include the new bot, which we can achieve by running our clients when no new bots are waiting for battles.
I'm not really convinced that one bot's specialization can have a measurable influence on other bots' relative scores in this way, but it obviously has some impact. And in the Diamond/Shadow example, I'm definitely not convinced that having another strong bot in a battle will reduce your overall score. In the MiniBot/MicroBot examples, their scores tend to go down when they get to all-mini/all-micro battles (i.e., when the field gets weaker). The other strong bot will keep the weaker bots from getting points, potentially giving you a larger score-% against all 8 other bots. (This is an interesting topic, but we are way off-topic now, so maybe we should cut this discussion onto another page...)
--Voidious 18:12, 4 September 2009 (UTC)
Hehe, "le Marquis de Condorcet", the things you learn in the Wikipedia. :) Maybe I'll try a renamed 3.83, so we can be sure about the volatility, or lack thereof, of the current method. --ABC
For absolute APS score you are right, the crowd at the time of release has an influence. For ranking you are wrong: an ancient bot (when active) and a fresh one both reside at the rank where they belong. The APS score does not 'drift'; it is a natural process, because the crowd becomes better and better due to new opponents (unless someone tries to 'beat' Moron) --GrubbmGait 07:36, 4 September 2009 (UTC)
Yes, but what we (or at least I) wonder is: is Shadow 3.83 worse than Shadow 3.84 or not, and in case it is worse, is it worse enough to be under Diamond 1.31? Being completely honest, I don't think Diamond 1.31 is better than Shadow 3.83, but because of that volatility it wouldn't be completely inexplicable if it were, even if 3.83 has a better APS in common pairings. --zyx 08:19, 4 September 2009 (UTC)
It's a valid concern. The other bots clearly have an effect, evidenced by what happens to MiniBots and MicroBots. (mini.Griezel dropped quite a bit in all-mini battles, and BlitzBat has dropped in all-mini/all-micro battles each time, I think.) But given that Shadow 3.83 is not that old and only a few bots have changed, I don't think there's quite that much volatility here... --Voidious 14:32, 4 September 2009 (UTC)
Only a few bots have changed in numbers, but it only takes a couple of big strong ones to make a difference. Given the rumble masses, the chance of any particular battle of Shadow's also having Diamond is 3.26%, same for Portia. Therefore the chance of having one or both of Diamond and Portia in any of Shadow's battles is 6.42%. Suddenly having new strong contenders in 6.42% of battles could really have a measurable effect, at the very least enough to account for the difference between Shadow 3.83 and 3.84d. --Rednaxela 15:19, 4 September 2009 (UTC)
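The arithmetic behind figures like these can be sketched as follows. This is a minimal sketch, assuming that each melee battle has 10 bots and that a bot's 9 opponents are drawn uniformly at random from the rest of the field. Both the selection model and the field size of 277 are assumptions; 277 is simply the value that reproduces the quoted 3.26%, and the real rumble client may pick battles differently.

```python
from math import comb


def p_opponent_present(field_size: int, battle_size: int = 10) -> float:
    """Chance that one specific other bot shares a battle with a given bot."""
    return (battle_size - 1) / (field_size - 1)


def p_at_least_one(field_size: int, k: int, battle_size: int = 10) -> float:
    """Chance that at least one of k specific bots shares a battle with a given bot."""
    others = field_size - 1   # everyone except the given bot
    slots = battle_size - 1   # opponent slots in the given bot's battle
    # Complement: all opponent slots are filled from the bots that are
    # NOT among the k bots of interest.
    return 1 - comb(others - k, slots) / comb(others, slots)


FIELD_SIZE = 277  # assumption, chosen to match the quoted 3.26%

single = p_opponent_present(FIELD_SIZE)
either = p_at_least_one(FIELD_SIZE, 2)
print(f"one specific bot present: {single:.2%}")
print(f"one or both of two bots:  {either:.2%}")
```

Under these assumptions the combined chance comes out within a few hundredths of a percentage point of the quoted figure. It is slightly less than double the single-bot chance, because battles containing both bots would otherwise be counted twice.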
I don't think you even need to look that far to explain the difference between 3.83 and 3.84d. A 0.12% difference is surely within the margin of error even for two bots released into the same exact field... --Voidious 18:30, 4 September 2009 (UTC)
Not only is it inside the margin of error, it is not the exact same code. I'm convinced that an exact copy of 3.83 would still score much closer to its previous score. --ABC 19:10, 4 September 2009 (UTC)

I've noticed that when Shadow fights 1 vs 1, it makes an effort to narrow its angle (it goes closer to the wall a bot is running along). Just wanted to say it's a great move :) It completely exploits simple orbit schemes like in DemonicRage. :) --Jlm0924