Talk:WaveSim
General
Still ironing out some issues here and there, but damn this is cool. =) Time to run normal battles, 20 seasons x 48 bots: 4.75 hours (on 2 threads). Time to run the same gun against the raw data: ~10 minutes. :-D Plus you don't have to hope randomness averages out over the battles - it's the same data every time. --Voidious 23:37, 13 March 2010 (UTC)
Neat stuff here! Actually, back when working on RougeDC, I once had something akin to this set up for quick testing, but I never really used it extensively or made it robust. I wonder if I should set up a robust framework for this for my future targeting experiments. --Rednaxela 23:52, 13 March 2010 (UTC)
I actually wondered if you ever had. =) It's a funny combination of "wow this is so cool!" and "you know this is sooo nothing special." Back when I had access to MATLAB at school, I did play with a wave data set with some SVMs, but other than that I haven't explored testing my classification algorithms outside of Robocode. But I still have the desire to try a lot of clustering experiments, so taking a few days to set this up was well worth it! --Voidious 23:59, 13 March 2010 (UTC)
This has got me thinking. Since the earliest days of Ugluk, the design of the guns and movement has been 'pluggable', which is handy because I'd often throw a large set of both against opponents and simply stop using the ones that were least effective. Anyway, digressing too much... What I have not yet done is make the tank completely independent of Robocode, such that with the right input you could run a simulation outside of the client. I can see the benefit of doing this with a recorded set of tank positions, directions, and speeds. Even putting aside the nagging problem of adaptive movements, you can quickly tell if your gun has gone horribly wrong. And of course when testing against non-adaptive movements, you can refine your punishment to squeeze the best point ratios out of your battles, which is what the scoring in the rumble is all about. Defeating good / adaptive bots is secondary. --Martin 21:11, 15 March 2010 (UTC)
QT Clustering sounds interesting. Reminds me of my density suggestion, except without the normal distribution. I wonder if there is a way to dynamically determine the best threshold as well. I would guess 'the point where the density of points per unit distance drops below the density of points nearer the center', but that is a bit abstract and useless for building clusters, since with real data not all points, or for that matter clusters, fit that kind of definition. --Chase 10:33, 20 March 2010 (UTC)
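(For reference, a minimal sketch of the QT-style idea under discussion, using a radius-around-seed simplification over 1-D values such as guess factors; the fixed threshold below is exactly the parameter Chase wonders how to pick dynamically. The names and the 1-D distance are illustrative, not anyone's actual implementation.)

    import java.util.ArrayList;
    import java.util.List;

    public class QTSketch {
        // Repeatedly grow a candidate cluster around each remaining point,
        // keep the largest candidate, remove its members, and start over.
        public static List<List<Double>> cluster(List<Double> points, double threshold) {
            List<List<Double>> clusters = new ArrayList<List<Double>>();
            List<Double> remaining = new ArrayList<Double>(points);
            while (!remaining.isEmpty()) {
                List<Double> best = new ArrayList<Double>();
                for (double seed : remaining) {
                    List<Double> candidate = new ArrayList<Double>();
                    for (double p : remaining) {
                        if (Math.abs(p - seed) <= threshold) { // 1-D "diameter" check
                            candidate.add(p);
                        }
                    }
                    if (candidate.size() > best.size()) {
                        best = candidate;
                    }
                }
                clusters.add(best);
                remaining.removeAll(best); // note: collapses duplicate values; fine for a sketch
            }
            return clusters;
        }
    }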
My recent gun tests have been against a field of 47 mid-range bots I had in an old RoboResearch test bed. Last night before bed, I took 5 minutes and used BedMaker to create a test bed of 250 bots that Diamond scores between 70% and 90% against, then started collecting 6 seasons of gun data (1500 battles) against them with TripHammer RES. I felt so cool! =) --Voidious 16:15, 25 January 2011 (UTC)
Interesting idea here. Might convince me to dust off my bots and give things another go. --Miked0801 13:58, 16 April 2011 (UTC)
- Cool - I'd of course welcome any feedback you have on usability. Btw, it would require some updates to support PM "correctly". Right now, data is only fed to classifiers when a wave is collected. For PM you'd probably just want each TickRecord fed to the classifier as soon as it happens, and the wave collection could be ignored altogether. And you'd want absolute heading instead of relative heading, but that could be deduced from relative heading + orbit direction + sign(velocity). Those would all be pretty simple changes, so maybe I'll hack that together shortly for 2.0.1. --Voidious 00:34, 17 April 2011 (UTC)
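(One plausible reading of that deduction, as a hedged sketch: if the stored relative heading was folded by orbit direction and velocity sign against the absolute bearing from the firing bot, it can be unfolded the same way. The field names and the folding convention here are assumptions, not WaveSim's actual format.)

    import robocode.util.Utils;

    public class HeadingSketch {
        // absBearing: absolute bearing from firer to enemy, in radians (assumption)
        // orbitDirection: +1 or -1
        // relativeHeading: heading folded relative to absBearing (assumed convention)
        static double recoverAbsHeading(double absBearing, int orbitDirection,
                                        double velocity, double relativeHeading) {
            double sign = Math.signum(velocity);
            if (sign == 0) {
                sign = 1; // stationary: pick a convention
            }
            return Utils.normalAbsoluteAngle(
                    absBearing + orbitDirection * sign * relativeHeading);
        }
    }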
- I've added the TickClassifier and also the raw heading to the file format. Probably won't rush it out yet so I can gather a new batch of sample data and write an example PM using TickClassifier, but if you want the dev version (you'd need to collect some data yourself), let me know and I'll gladly post it before the official release. --Voidious 03:32, 17 April 2011 (UTC)
Compression?
Hmm, since you end up with huge CSV files with this, why not do some compression with GZIPOutputStream? Not only would it save disk space, but I have a feeling it could make WaveSim run faster due to reduced reading from disk. --Rednaxela 06:18, 26 January 2011 (UTC)
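(A minimal sketch of the suggested setup, with a placeholder file name and record format:)

    import java.io.BufferedWriter;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.io.PrintWriter;
    import java.util.zip.GZIPOutputStream;

    public class GzipCsvWriteSketch {
        public static void main(String[] args) throws IOException {
            // Same CSV writing as before, with a gzip layer in the middle.
            PrintWriter out = new PrintWriter(new BufferedWriter(new OutputStreamWriter(
                    new GZIPOutputStream(new FileOutputStream("waves.csv.gz")))));
            out.println("0.85,-0.31,7.9"); // one illustrative wave record
            out.close(); // also finishes the gzip stream
        }
    }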
Good call, trying it now with some existing data. First run was 889 uncompressed, 904 gzip - hmm. Still worth keeping for the size, but no speed increase. Just last night I'd started to think about how unoptimized this code might be. I added regular MEA and corresponding regular GFs last night, then wrote a method to massage my existing data files into the new format. It was taking forever! Turns out a zillion string appends in a tight loop are bad, and StringBuilder is awesome.
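(The difference, for reference: each String concatenation in a loop copies the whole accumulated buffer, so n appends cost O(n^2), while StringBuilder appends are amortized O(1). A minimal sketch with illustrative helpers:)

    import java.util.List;

    public class AppendSketch {
        // O(n^2): every += copies everything accumulated so far.
        static String slowJoin(List<String> records) {
            String s = "";
            for (String record : records) {
                s += record + "\n";
            }
            return s;
        }

        // Amortized O(n): appends into one growing buffer.
        static String fastJoin(List<String> records) {
            StringBuilder sb = new StringBuilder();
            for (String record : records) {
                sb.append(record).append('\n');
            }
            return sb.toString();
        }
    }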
Now I'm wondering if there are some magical incantations for reading these files better. I was using BufferedReader(FileReader()); now it's BufferedReader(InputStreamReader(GZIPInputStream(FileInputStream()))). I tried (when still uncompressed) FileInputStream and reading in the whole file at once, then parsing it, but that was slower.
--Voidious 13:35, 26 January 2011 (UTC)
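(A sketch of that corrected chain - note the class is GZIPInputStream, not "GZIPInputStreamReader" - plus one incantation that sometimes helps: buffering the raw stream underneath the gzip layer too, since GZIPInputStream tends to issue small reads. The file name is a placeholder.)

    import java.io.BufferedInputStream;
    import java.io.BufferedReader;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.util.zip.GZIPInputStream;

    public class GzipCsvReadSketch {
        public static void main(String[] args) throws IOException {
            BufferedReader in = new BufferedReader(new InputStreamReader(
                    new GZIPInputStream(new BufferedInputStream(
                            new FileInputStream("waves.csv.gz")))));
            String line;
            while ((line = in.readLine()) != null) {
                // parse one wave record per line
            }
            in.close();
        }
    }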
Oh, another issue might be appending to a compressed file. I append at the end of each round because I hit some issues storing up this huge file to write in one burst at the end of the battle. Maybe I'll try all this on a newer Robocode and see if I can figure that issue out too. --Voidious 13:41, 26 January 2011 (UTC)
I'd expect reading it all at once to be slower. After all, when you only read it part at a time, it gives the hard disk a chance to do readahead while you're processing the data. I don't think the InputStreamReader+FileInputStream makes any difference vs FileReader.
Regarding appending, I believe you can append gzip streams actually. The command line gzip tools seem happy with this anyway. When I do "echo foo | gzip >> blah.gz" and "echo foo | gzip >> blah.gz", then try to read it with "cat blah.gz | gzip -d", it outputs both just fine. I haven't tested, but I presume GZIPInputStream can handle appended streams too. Even if that appending doesn't work, why not just keep the file open across rounds, as a static variable? --Rednaxela 13:59, 26 January 2011 (UTC)
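(A sketch of the keep-it-open option, with illustrative names: statics survive between rounds, so the stream is opened once and only flushed at round end. Opening with append=true across battles would add a new gzip member each time, which relies on the reader coping with concatenated streams as described above. A real bot inside Robocode's sandbox would also need RobocodeFileOutputStream rather than a plain FileOutputStream.)

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.io.PrintWriter;
    import java.util.zip.GZIPOutputStream;

    public class WaveLogSketch {
        private static PrintWriter out; // static, so it survives across rounds

        static PrintWriter get() throws IOException {
            if (out == null) {
                out = new PrintWriter(new OutputStreamWriter(new GZIPOutputStream(
                        new FileOutputStream("waves.csv.gz", true)))); // true = append
            }
            return out;
        }

        static void roundEnd() {
            if (out != null) {
                out.flush(); // flushes the writer; the gzip layer still buffers until close()
            }
        }
    }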
Cool, this seems to be working, just leaving the output stream open now. I was hitting a "bad file descriptor" after a few minimized rounds, but switching back to Apple JVM (from SoyLatte) fixed it. Some of my data files collected with RoboResearch are hitting an "unexpected EOF" when I process them, but haven't reproduced it with data collected through normal Robocode, so I'm not sure what's going on. It seems to be able to read the data up to that point though. (Using Apple JVM and Robocode 1.7.2.2 for both.) Gonna try to get all of this posted today - with compressed data, normal MEA / wall distance, and random movement in the collector bot, all of which are done but just need testing and polishing. --Voidious 15:38, 26 January 2011 (UTC)
Figured it out - my little "go back to Apple JVM" stuff in my RoboResearch shell script wasn't really working, since RoboResearch doesn't use the Robocode control API; it actually launches Java itself. And now I remember why I hate the Apple JVM for RoboResearch - it launches a new "Java app" thing in the dock every battle and steals the focus from your current app. Time to figure out how to tweak that... I hope... Good news is I think this is almost good to go. =) --Voidious 16:49, 26 January 2011 (UTC)
- Figured it out, if anyone cares: add the "-Dapple.awt.UIElement=true" argument in BattleRunner.java. --Voidious 18:11, 26 January 2011 (UTC)
Play-It-Forward
Hmm... I think I might just want to use this framework to test some new targeting ideas... Though I will need to update it to support PIF also. Perhaps the best way to do it would be a single row at the top of the CSV file, listing self/enemy positions/velocities. When I modify it to support this, want me to send the code? :) --Rednaxela 04:48, 27 January 2011 (UTC)
Sure! I might even try it out myself if you do. =) --Voidious 13:24, 27 January 2011 (UTC)
It strikes me that _classifier.feed(wave); and _classifier.classify(wave); currently aren't set up in an accurate way. It looks to me like the classifier is being fed *every* wave before the one it's classifying, but I'd say this is notably inaccurate, because not all of those waves have finished yet. This could affect the results noticeably, I think. It'll take a little more refactoring than I expected to correct for factors like this. --Rednaxela 14:21, 27 January 2011 (UTC)
Hmm? It's not like that. Each wave also includes the ID of the last wave collected before it was aimed. So it only feeds the classifier the waves it had seen before trying to classify that wave. WaveReader stores two positions - the firing wave position and the reading/feeding position. --Voidious 14:28, 27 January 2011 (UTC)
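(A sketch of the ordering described above, with illustrative types rather than WaveSim's actual API: the reader tracks the firing-wave position and the feeding position separately, so the classifier only ever sees waves that had been collected before the firing wave was aimed.)

    import java.util.List;

    public class ReplaySketch {
        static class Wave {
            int id;
            int lastCollectedWaveId; // last wave fully collected when this one was aimed
            double actualGuessFactor;
        }

        interface Classifier {
            void feed(Wave w);
            double classify(Wave w);
        }

        static void run(List<Wave> waves, Classifier classifier) {
            int feedPos = 0; // the reading/feeding position
            for (Wave firing : waves) { // the firing-wave position
                while (feedPos < waves.size()
                        && waves.get(feedPos).id <= firing.lastCollectedWaveId) {
                    classifier.feed(waves.get(feedPos++)); // only already-collected waves
                }
                double predicted = classifier.classify(firing);
                // compare predicted vs firing.actualGuessFactor to score the gun
            }
        }
    }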
Oh, well... that's what I get for skimming over the code in the early morning I suppose. Let's see how this goes... --Rednaxela 00:25, 28 January 2011 (UTC)