ScalarR/Some Ideas
This article is a stub. You can help RoboWiki by expanding it.
I’m not going into full detail for now, but rather sharing some crucial ideas that drive ScalarR and have proven to work well. Rather than encouraging people to follow the ScalarR way, I believe that sharing the initial motivations could inspire even more innovation and push the bar for top bots even further.
Targeting
Anti-General/Random
For targeting against general / random movement, the basic assumption is that the opponent has a finite, fixed set of movements. Once the specific movement is identified, the data points are i.i.d., so offline learning, online learning, or a mix of both works pretty well. The only trick is to keep the targeting space (e.g. guess factors and their corresponding features) consistent with the physical space (Robocode physics). More specifically, maintain an accurate 1-to-1 correspondence between the two spaces.
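As a rough illustration of what a 1-to-1 correspondence can mean, here is a minimal sketch (not ScalarR's code) of a guess factor mapping that round-trips exactly between a firing angle and a guess factor. It assumes the classic, non-precise maximum escape angle; the class and method names are hypothetical. The only Robocode facts used are the bullet speed rule (20 - 3 * power) and the maximum robot speed (8).

```java
final class GuessFactorMapping {
    // Robocode rules: bullet speed is 20 - 3 * power, max robot speed is 8.
    static double bulletSpeed(double power) {
        return 20.0 - 3.0 * power;
    }

    // Classic (non-precise) maximum escape angle in radians. Assumption:
    // a precise MEA would replace this, but must be used in BOTH directions.
    static double maxEscapeAngle(double power) {
        return Math.asin(8.0 / bulletSpeed(power));
    }

    // Wrap an angle into [-PI, PI).
    static double normalRelative(double angle) {
        angle %= 2.0 * Math.PI;
        if (angle >= Math.PI) angle -= 2.0 * Math.PI;
        if (angle < -Math.PI) angle += 2.0 * Math.PI;
        return angle;
    }

    // Physical space -> targeting space.
    // direction is +1 or -1: the target's lateral direction at fire time.
    static double toGuessFactor(double bearingAtFire, double hitBearing,
                                int direction, double power) {
        double offset = normalRelative(hitBearing - bearingAtFire);
        return direction * offset / maxEscapeAngle(power);
    }

    // Targeting space -> physical space, the exact inverse of toGuessFactor.
    static double toFiringAngle(double bearingAtFire, double guessFactor,
                                int direction, double power) {
        return bearingAtFire + direction * guessFactor * maxEscapeAngle(power);
    }

    public static void main(String[] args) {
        double gf = toGuessFactor(1.0, 1.2, -1, 2.0);
        System.out.println("guess factor: " + gf);
        System.out.println("angle back:   " + toFiringAngle(1.0, gf, -1, 2.0));
    }
}
```

The point of the sketch is only that whatever convention is used to record a guess factor (escape angle, direction, bullet power) has to be the same one used when turning a guess factor back into a firing angle; otherwise the learned data and the bullets disagree.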
Anti-Surfing
The problem with traditional targeting against surfers is that all observations are biased, since the data points aren't i.i.d. at all. Using hit/collide waves? Biased. Using tick waves? Biased. Using historical waves only? Biased. Using recent waves only? Biased. That said, being biased isn't a bad thing: a lot of guns work by adding bias manually, and it works. ScalarR isn't doing anything truly special here right now.
Anti-Flattening
Anti-Flattening is a different story from Anti-Surfing. To hit surfers, you need recency. To hit flatteners, you need something that exploits a weakness (e.g. repeated patterns); otherwise random targeting works sufficiently well. Since flattening is generally used together with traditional surfing, anti-surfing techniques generally get mixed in. ScalarR isn't doing anything truly special here right now.
Surfing
Surfing is another story, a completely different one. If I were asked to give Wave Surfing another name, I would simply choose Minimum Risk Movement, redefined. Go back to the simple story where an agent (the robot) faces an environment (mainly walls and other bots): which action (mainly movement) do you choose to maximize an objective, e.g. survival? This is not a simple question, but many non-simple questions have simple solutions. Why not simply simulate what happens next, and let the simulation result decide? Many techniques follow from this, e.g. Precise Intersection and Bullet Shadow/Correct; each of them brings the simulation closer to what is truly happening by making it more precise and more exact.
Basic Movement
Most robots call setAhead & setTurn and leave the rest of the movement to Robocode. ScalarR controls velocity & turn rate directly instead, and encapsulates the details in movement drivers (e.g. a goto driver). This lets ScalarR use faster methods to derive velocity & turn rate, making movement prediction easier, faster and more precise. Wall smoothing is also encapsulated in the drivers, so that I can control every detail near walls and avoid occasionally losing score there.
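To make the idea of working directly with velocity & turn rate concrete, here is a minimal sketch of a one-tick predictor. It is not ScalarR's driver: the class and method names are hypothetical, and the acceleration rule (including the handling of a reversal that crosses zero speed) and the update order (heading first, then velocity, then position) follow my understanding of the Robocode engine, so treat them as assumptions to check against the engine source. Wall collisions and wall smoothing are left out.

```java
final class TickPredictor {
    // Robocode rule constants.
    static final double MAX_VELOCITY = 8.0;
    static final double ACCELERATION = 1.0;
    static final double DECELERATION = 2.0;

    double x, y;
    double heading;  // radians, 0 = north, increasing clockwise
    double velocity; // signed, negative means driving backwards

    // Max body turn rate per tick: (10 - 0.75 * |velocity|) degrees.
    static double maxTurnRate(double velocity) {
        return Math.toRadians(10.0 - 0.75 * Math.abs(velocity));
    }

    // Speed change available in one tick when braking through zero:
    // part of the tick decelerates, the remainder accelerates (assumption).
    static double maxDecel(double speed) {
        double decelTime = speed / DECELERATION;
        double accelTime = 1.0 - decelTime;
        return Math.min(1.0, decelTime) * DECELERATION
             + Math.max(0.0, accelTime) * ACCELERATION;
    }

    // Velocity after one tick when the driver asks for `desired`.
    static double nextVelocity(double velocity, double desired) {
        if (desired < 0) {
            return -nextVelocity(-velocity, -desired); // mirror the backward case
        }
        double goal = Math.min(desired, MAX_VELOCITY);
        if (velocity >= 0) {
            return Math.max(velocity - DECELERATION,
                            Math.min(goal, velocity + ACCELERATION));
        }
        return Math.min(goal, velocity + maxDecel(-velocity));
    }

    // Advance the state by one tick for a commanded turn and velocity.
    void tick(double turn, double desiredVelocity) {
        double limit = maxTurnRate(velocity); // limited by the pre-update velocity
        heading += Math.max(-limit, Math.min(limit, turn));
        velocity = nextVelocity(velocity, desiredVelocity);
        x += Math.sin(heading) * velocity;
        y += Math.cos(heading) * velocity;
    }

    public static void main(String[] args) {
        TickPredictor p = new TickPredictor();
        for (int i = 0; i < 8; i++) {
            p.tick(0.0, MAX_VELOCITY);
        }
        System.out.println("y=" + p.y + " velocity=" + p.velocity);
    }
}
```

A driver built this way can use the same function both to issue the command for the current tick and to predict future ticks, which is one way to keep prediction and actual movement consistent.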
Melee Surfing
I can't see any real difference between melee and 1v1 once wave surfing is redefined as some form of minimum risk. So I share most of the code between melee and 1v1, with only more paths to evaluate in melee. Things like second-wave surfing can be reconsidered as a fast approximation of searching all possible branches, under the assumption of moving in a straight line between waves. In practice you generally don't have enough ticks for more complex movement, so this approximation is just fine. And to handle edge cases, I don't really distinguish first and second waves, but simply sum the risks along the path, weighted by time to arrive.
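The "sum the risks along the path, weighted by time to arrive" idea can be sketched as follows. This is only an illustration under stated assumptions: the names are hypothetical, each path is reduced to a list of (risk, ticks until arrival) pairs, and the 1 / ticks weighting is my own placeholder, since the text does not say which weighting ScalarR actually uses.

```java
final class PathRiskSketch {
    // Total risk of one candidate path. risks[i] is the estimated danger of
    // the i-th threat met along the path (a wave, a nearby bot, ...), and
    // ticksToArrive[i] is how many ticks from now it is encountered.
    // Assumption: nearer threats count more, via a simple 1 / ticks weight.
    static double pathRisk(double[] risks, int[] ticksToArrive) {
        double total = 0.0;
        for (int i = 0; i < risks.length; i++) {
            total += risks[i] / Math.max(1, ticksToArrive[i]);
        }
        return total;
    }

    // Index of the candidate path with the lowest total risk.
    static int bestPath(double[][] risks, int[][] ticksToArrive) {
        int best = 0;
        double bestRisk = Double.POSITIVE_INFINITY;
        for (int p = 0; p < risks.length; p++) {
            double r = pathRisk(risks[p], ticksToArrive[p]);
            if (r < bestRisk) {
                bestRisk = r;
                best = p;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] risks = { {0.8, 0.1}, {0.2, 0.9} };
        int[][] ticks = { {10, 25}, {10, 25} };
        System.out.println("best path index: " + bestPath(risks, ticks));
    }
}
```

With this shape there is no special case for "first wave" versus "second wave": every threat on the path contributes, and 1v1 and melee differ only in which candidate paths are generated and how each risk is estimated.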
1v1 Surfing
Once you have the melee stuff working, adding 1v1 surfing is merely a matter of changing the paths generated. After that, danger estimation diverges, and a lot of details are retuned. But they aren't truly different, right?
Danger Estimation
Instead of inventing and experimenting with novel approaches as in targeting, the approach that worked best is to test against real opponents, say the entire rumble, since danger estimation is essentially fitting the targeting of the existing robots in the rumble. Inventing something fancy doesn't help much, but watching a lot of battles and reading a lot of targeting code does. Once most problem bots are solved, you have a fairly good result, and you're done.
Flattening
Flattening can be considered a special form of danger estimation, for when the opponent is either firing with only recent waves or trying to spot repeated patterns. Firing with only recent waves already doesn't work well against surfers with complex danger estimation, and when the opponent is trying to spot weaknesses, it's merely a race over who can spot weaknesses better. At some point, every targeting method performs either the same as random or worse than random, and then you're done.
Energy Management
Energy Management is not simply Power Selection. In energy management, you are solving the same problem as in surfing: what is the best action (now bullet power) that maximizes the objective, e.g. survival? This sometimes involves precise simulation, but simulations are much harder here given the random nature of the process. Most observations are biased, and so are most simulations. A very simple strategy, possibly a single line, can give you astonishing results, but coming up with it may require days of watching battles and even more days of failed attempts. Past tuning of energy management also becomes outdated soon after changing surfing & targeting, so this is generally the last thing I do. Keeping it as simple as possible until the very end saves a lot of time.