how to build a good test bed?
Fragment of a discussion from Talk:ScalarBot/Version History
Well, this combination sounds great, and it is closer to how I tune weights by hand than to a traditional GA. It should also work much better than hand tuning, since it runs far more battles with a far larger population.
It's also much faster (and has less deviation) with recorded battles. The only problem is overfitting to the recorded battles, but many tune–rerecord iterations should take care of that.
Still, I'm wondering: will it forget the earlier tune–rerecord iterations as it overfits to the newer ones? Since it sounds more like metric learning, it wouldn't surprise me if this case behaves differently. Did you try rerunning the old battles after tuning on newer ones to check?
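
To make the check I'm asking about concrete, here is a rough sketch of what I have in mind. Everything in it is hypothetical: `evaluate`, `forgetting_report`, `forgotten`, and the shape of a recorded battle set are names I made up for illustration, not anything from ScalarBot or its tuner. The idea is just to score each iteration's tuned weights on every battle set recorded up to that point, then see how far the latest weights have dropped on the older sets.

```python
from typing import Callable, List, Sequence, Tuple

Weights = Sequence[float]
BattleSet = List[dict]  # hypothetical: battles recorded in one rerecord iteration


def forgetting_report(
    evaluate: Callable[[Weights, BattleSet], float],
    weights_per_iteration: List[Weights],
    recorded_sets: List[BattleSet],
) -> List[List[float]]:
    """scores[i][j] is the score of the weights tuned at iteration i
    on the battles recorded at iteration j, for every j <= i."""
    scores = []
    for i, weights in enumerate(weights_per_iteration):
        scores.append([evaluate(weights, recorded_sets[j]) for j in range(i + 1)])
    return scores


def forgotten(scores: List[List[float]], tolerance: float = 0.0) -> List[Tuple[int, float]]:
    """List (j, drop) for each older set j where the latest weights score
    more than `tolerance` below the weights that were tuned on set j."""
    latest = scores[-1]
    drops = []
    for j in range(len(latest) - 1):
        drop = scores[j][j] - latest[j]
        if drop > tolerance:
            drops.append((j, drop))
    return drops
```

If `forgotten` keeps coming back empty (or only with small drops) as iterations accumulate, the tuner is not forgetting the older recordings; large drops on old sets would suggest it is chasing the newest battles.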