Difference between revisions of "Talk:WaveSim"


Revision as of 17:38, 26 January 2011

Still ironing out some issues here and there, but damn this is cool. =) Time to run normal battles, 20 seasons x 48 bots: 4.75 hours (on 2 threads). Time to run the same gun against the raw data: ~10 minutes. :-D Plus you don't have to hope randomness averages out over the battles - it's the same data every time. --Voidious 23:37, 13 March 2010 (UTC)

Neat stuff here! Actually, back when working on RougeDC, I once had something akin to this set up for quick testing, but I never really used it extensively or made it robust. I wonder if I should set up a robust framework for this for my future targeting experiments. --Rednaxela 23:52, 13 March 2010 (UTC)

I actually wondered if you ever had. =) It's a funny combination of "wow this is so cool!" and "you know this is sooo nothing special." Back when I had access to MATLAB at school, I did play with a wave data set with some SVMs, but other than that I haven't explored testing my classification algorithms outside of Robocode. But I still have the desire to try a lot of clustering experiments, so taking a few days to set this up was well worth it! --Voidious 23:59, 13 March 2010 (UTC)

This has got me thinking. Since the earliest days of Ugluk, the design of the guns and movement have been 'pluggable'. Which is handy because I'd often throw a large set of both against opponents and simply stop using the ones that were least effective. Anyway.. digressing too much.. what I have not yet done is to make the tank completely independent of Robocode, such that with the right input you could run a simulation outside of the client. I can see the benefit of doing this with a recorded set of tank positions, directions, and speeds. Even putting aside the nagging problem of adaptive movements, you can quickly tell if your gun has gone horribly wrong. And of course when testing against non-adaptive movements, you can refine your punishment to squeeze the best point ratios out of your battles, which is what the scoring in the rumble is all about. Defeating good / adaptive bots is secondary. --Martin 21:11, 15 March 2010 (UTC)

QT Clustering sounds interesting. Reminds me of my density suggestion, except without the normal distribution. I wonder if there is a way to dynamically determine the best threshold as well. I would guess 'the point where the density of point to distance becomes less than those nearer to the center' but that is a bit abstract and is useless for building clusters, since not all points or for that matter clusters (with real data) fit this kind of definition. --Chase 10:33, 20 March 2010 (UTC)

My recent gun tests have been against a field of 47 mid-range bots I had in an old RoboResearch test bed. Last night before bed, I took 5 minutes and used BedMaker to create a test bed of 250 bots that Diamond scores between 70% and 90% against, then started collecting 6 seasons of gun data (1500 battles) against them with TripHammer RES. I felt so cool! =) --Voidious 16:15, 25 January 2011 (UTC)

Compression?

Hmm, since you end up with huge CSV files with this, why not do some compression with GZIPOutputStream? Not only would it save disk space, but I have a feeling it could make WaveSim run faster due to reduced reading from disk. --Rednaxela 06:18, 26 January 2011 (UTC)
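For reference, a minimal sketch of what writing the CSV through GZIPOutputStream could look like. The class name `GzipCsvWriterSketch`, the method `writeRows`, and the row format are made up for illustration; this is not WaveSim's actual collector code.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of WaveSim.
class GzipCsvWriterSketch {
    /** Writes each row as one line of a gzip-compressed text file. */
    static void writeRows(Path file, List<String> rows) throws IOException {
        try (BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
                new GZIPOutputStream(Files.newOutputStream(file)),
                StandardCharsets.UTF_8))) {
            for (String row : rows) {
                out.write(row);
                out.write('\n');
            }
        } // closing the writer also finishes the gzip stream and writes its trailer
    }
}
```

The BufferedWriter on top batches the many small row writes before they reach the compressor.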

Good call, trying it now with some existing data. First run was 889 uncompressed, 904 gzip - hmm. Still worth keeping for the size, but no speed increase. Started to think just last night how unoptimized this code might be. I added regular MEA and corresponding regular GFs last night and then wrote a method to massage my existing data files into the new format. It was taking forever! Turns out a zillion string appends in a tight loop is bad, and StringBuilder is awesome.
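To illustrate the string append point, here is a small sketch comparing the two styles. The names `RowBuilderSketch`, `joinWithConcat`, and `joinWithBuilder` are invented for this example and the actual conversion method is not shown in this thread. Both methods produce the same row; the first copies the whole accumulated string on every `+=`, which is what gets slow in a tight loop.

```java
import java.util.List;

// Hypothetical example, not the actual data conversion code.
class RowBuilderSketch {
    /** Repeated String concatenation: each += allocates and copies a new String. */
    static String joinWithConcat(List<String> fields) {
        String row = "";
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                row += ",";
            }
            row += fields.get(i);
        }
        return row;
    }

    /** StringBuilder: appends into one growable buffer, copies once at the end. */
    static String joinWithBuilder(List<String> fields) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                row.append(',');
            }
            row.append(fields.get(i));
        }
        return row.toString();
    }
}
```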

Now I'm wondering if there are some magical incantations for reading these files better. I was using BufferedReader(FileReader()), now it's BufferedReader(InputStreamReader(GZIPInputStream(FileInputStream))). I tried (when still uncompressed) FileInputStream and reading in the whole file at once, then parsing it, but that was slower.

--Voidious 13:35, 26 January 2011 (UTC)
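Spelled out, the reader chain described above might look like the sketch below. `GzipCsvReaderSketch`, `countFields`, and the 64 KB gzip buffer size are assumptions for illustration, not tuned or measured values. It streams the file line by line rather than loading it whole.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;

// Hypothetical reader, not WaveSim's actual loading code.
class GzipCsvReaderSketch {
    /** Streams a gzip-compressed CSV file and counts its comma-separated fields. */
    static long countFields(Path file) throws IOException {
        long fields = 0;
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                // 64 KB inflater input buffer: an arbitrary choice for this sketch
                new GZIPInputStream(new FileInputStream(file.toFile()), 64 * 1024),
                StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                fields += line.split(",").length;
            }
        }
        return fields;
    }
}
```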

Oh, another issue might be appending to a compressed file. I append at the end of each round because I hit some issues storing up this huge file to write in one burst at the end of the battle. Maybe I'll try all this on a newer Robocode and see if I can figure that issue out too. --Voidious 13:41, 26 January 2011 (UTC)

I'd expect reading it all at once to be slower. After all, when you only read part of it at a time, the hard disk gets a chance to do readahead while you're processing the data. I don't think InputStreamReader+FileInputStream makes any difference vs FileReader.

Regarding appending, I believe you can append gzip streams actually. The command line gzip tools seem happy with this anyway. When I run "echo foo | gzip >> blah.gz" twice, then try to read it with "cat blah.gz | gzip -d", it outputs both just fine. I haven't tested, but I presume GZIPInputStream can handle appended streams too. Even if that appending doesn't work, why not just keep the file open across rounds, as a static variable? --Rednaxela 13:59, 26 January 2011 (UTC)
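A Java version of that command line experiment could be sketched as follows. It compresses the same text twice, concatenates the two gzip members, and reads the result back through a single GZIPInputStream. The class and method names are invented for this example. To my understanding current JDKs read through concatenated members like this, but the JVMs discussed in this thread were not checked, so treat that as an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical experiment, mirroring "echo foo | gzip >> blah.gz" run twice.
class GzipAppendSketch {
    /** Compresses text into one complete gzip member (header, data, trailer). */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    /** Joins two byte arrays, like appending to a file with >>. */
    static byte[] concat(byte[] a, byte[] b) {
        byte[] joined = new byte[a.length + b.length];
        System.arraycopy(a, 0, joined, 0, a.length);
        System.arraycopy(b, 0, joined, a.length, b.length);
        return joined;
    }

    /** Decompresses everything one GZIPInputStream will yield from the data. */
    static String gunzipAll(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Calling `gunzipAll(concat(gzip("foo\n"), gzip("foo\n")))` is the equivalent of the "cat blah.gz | gzip -d" step.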

Cool, this seems to be working, just leaving the output stream open now. I was hitting a "bad file descriptor" after a few minimized rounds, but switching back to Apple JVM (from SoyLatte) fixed it. Some of my data files collected with RoboResearch are hitting an "unexpected EOF" when I process them, but I haven't reproduced it with data collected through normal Robocode, so I'm not sure what's going on. It seems to be able to read the data up to that point though. (Using Apple JVM and Robocode 1.7.2.2 for both.) Gonna try to get all of this posted today - with compressed data, normal MEA / wall distance, and random movement in the collector bot, all of which are done but just need testing and polishing. --Voidious 15:38, 26 January 2011 (UTC)
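The "leave the stream open in a static variable" approach could be sketched roughly like this. `RoundLogSketch`, `appendRound`, and `closeAtBattleEnd` are hypothetical names, and plain `Files.newOutputStream` stands in for whatever file stream a real bot has to use inside Robocode; none of this is the collector bot's actual code.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; a real bot would call these from its round and battle end hooks.
class RoundLogSketch {
    // Static, so the same open stream is reused by every round in the battle.
    private static BufferedWriter out;

    /** Opens the gzip stream on first use, then keeps appending rows to it. */
    static synchronized void appendRound(Path file, List<String> rows) throws IOException {
        if (out == null) {
            out = new BufferedWriter(new OutputStreamWriter(
                    new GZIPOutputStream(Files.newOutputStream(file)),
                    StandardCharsets.UTF_8));
        }
        for (String row : rows) {
            out.write(row);
            out.write('\n');
        }
    }

    /** Closes the stream once, which is when the gzip trailer gets written. */
    static synchronized void closeAtBattleEnd() throws IOException {
        if (out != null) {
            out.close();
            out = null;
        }
    }
}
```

One thing this design depends on is the final close: a gzip stream that is never closed has no trailer, and a reader can then fail at the end of the file even though the earlier data decodes. Whether that has anything to do with the RoboResearch files is not something this sketch establishes.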