Varying NUMBATTLES of RoboRumbleAtHome?
Recently, I noticed that more than half of the battles are dropped because the queue is full. However, this doesn't happen if I wait a few minutes. It seems that all the rumble clients upload battles periodically, and the uploads are pretty concentrated: e.g. all four of my clients upload ~200 battles within ~3 minutes, which fills the queue immediately. And if I take a look at literumble/statistics, I can see that there are 5 to 7 clients uploading within 2 minutes.
It generally takes a client about 15 minutes to finish 50 battles. If we instead set NUMBATTLES to a different prime number on each client, the upload times would drift apart and spread out more evenly, reducing the high concurrency that causes a lot of dropped battles.
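To make the prime idea concrete, here is a rough sketch of how the upload times line up. It assumes every battle takes a fixed 18 seconds (15 minutes / 50 battles), that all clients start at the same moment, and that a client uploads its whole batch the instant it finishes. The function names and the prime values 43, 47, 53 and 59 are made up for illustration; real clients are far less regular than this.

```python
from collections import Counter

SECONDS_PER_BATTLE = 18  # assumption: 15 min / 50 battles, from the estimate above


def upload_times(num_battles, horizon_s):
    """Times (in seconds) at which a client with this NUMBATTLES uploads a batch."""
    period = num_battles * SECONDS_PER_BATTLE
    return list(range(period, horizon_s + 1, period))


def peak_uploads_per_window(batch_sizes, horizon_s=6 * 3600, window_s=120):
    """Largest number of battles arriving inside any single fixed time window."""
    per_window = Counter()
    for n in batch_sizes:
        for t in upload_times(n, horizon_s):
            per_window[t // window_s] += n
    return max(per_window.values())


# Four clients with the same NUMBATTLES vs. four clients with distinct primes.
same = peak_uploads_per_window([50, 50, 50, 50])
primes = peak_uploads_per_window([43, 47, 53, 59])
print("peak battles per 2-minute window:", same, "vs", primes)
```

With these assumed numbers the identical-batch case peaks at 200 battles in one window, which matches the ~200 I see from my four clients. The prime case peaks lower, although two or three clients still collide now and then, so this spreads the load rather than removing the bursts.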
Reducing NUMBATTLES would probably help here too. It would also reduce the delay which is the main cause of duplicated pairings for new bots being entered. Maybe a NUMBATTLES of 20 in the main rumble would be good enough to solve the client component of this.
However, I think one of the main causes of the full queue is the batch processing for Vote/NPP/KNNPBI, since the queue needs to be paused while this is running. Because it is paused, the projected processing time goes very high and it stops accepting new uploads. I have an idea on how to tune this, which should help a bit.
However, even a NUMBATTLES of 3 can't prevent most of the battles from being dropped ;/
Seems that with 8 clients running the rumble at the same time, no attempt will help without stopping some clients.
Worth mentioning that I notice dropped battles when there are 6 clients too, although not frequently. It seems that with 2 more clients, the effectiveness drops considerably?
Btw, one thing that's really interesting is that the duplicates of multiple versions can last for hours. It seems that some clients are not checking the participants list for hours.
Got it. Maybe after the queue is paused for batch tasks and then resumed, it stays nearly full because there are still many pairings being uploaded. Like a DoS, this decreases the ability to handle high concurrency: although the average number of pairings uploaded per minute is not very high, they come in during a short period of time and get dropped.
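A toy simulation of that effect, as a sketch only: the queue model below is a simplification I made up (one step per minute, everything arriving in a minute is admitted or dropped before that minute's processing), not how LiteRumble actually works. The numbers 60 per minute and 100 slots are arbitrary.

```python
def simulate(arrivals_per_min, service_per_min, max_size, backlog=0):
    """Return how many uploads are dropped for a minute-by-minute arrival list."""
    queued = backlog
    dropped = 0
    for arriving in arrivals_per_min:
        # Accept only what still fits in the bounded queue; the rest is dropped.
        accepted = min(arriving, max_size - queued)
        dropped += arriving - accepted
        queued += accepted
        # The server then works through up to service_per_min entries.
        queued = max(0, queued - service_per_min)
    return dropped


smooth = [40] * 10                          # 400 uploads spread evenly
bursty = [200, 0, 0, 0, 0, 200, 0, 0, 0, 0]  # the same 400 uploads in two bursts

print(simulate(smooth, 60, 100))              # 0
print(simulate(bursty, 60, 100))              # 200
print(simulate(smooth, 60, 100, backlog=90))  # 30
print(simulate(bursty, 60, 100, backlog=90))  # 290
```

Both arrival patterns average 40 uploads per minute against a capacity of 60, yet only the bursty one loses battles, and it loses even more when the queue starts out nearly full, as it would right after a pause.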
Then I think we should increase the queue size a little after a batch task, and then decrease it back to the normal size slowly, to make sure new uploads won't wait forever after a flood of uploads.
Or, we can handle uploads during the pause separately: don't let them take up space in the normal queue, but store them in a separate queue instead (and cap it at normal uploads per minute * pause time).
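A minimal sketch of the separate-queue idea. The class `PausableQueue` and all of its method names are hypothetical, not anything from the LiteRumble code, and draining the normal queue before the pause backlog is just one possible ordering.

```python
from collections import deque


class PausableQueue:
    """Hypothetical upload queue with a separate, capped backlog for pauses."""

    def __init__(self, max_size, uploads_per_min):
        self.max_size = max_size
        self.uploads_per_min = uploads_per_min
        self.main = deque()      # normal queue
        self.overflow = deque()  # uploads received while paused
        self.overflow_cap = 0
        self.paused = False

    def pause(self, expected_pause_min):
        """Pause for a batch task; cap = normal uploads per minute * pause time."""
        self.paused = True
        self.overflow_cap = int(self.uploads_per_min * expected_pause_min)

    def resume(self):
        self.paused = False

    def offer(self, upload):
        """Return True if the upload is accepted, False if it is dropped."""
        if self.paused:
            target, limit = self.overflow, self.overflow_cap
        else:
            target, limit = self.main, self.max_size
        if len(target) >= limit:
            return False
        target.append(upload)
        return True

    def take(self):
        """Next upload to process: normal queue first, then the pause backlog."""
        if self.paused:
            return None
        if self.main:
            return self.main.popleft()
        if self.overflow:
            return self.overflow.popleft()
        return None
```

Because the backlog never counts against `max_size`, the normal queue has the same room right after a pause as it had before it. The trade-off is that backlog entries are only processed when the normal queue is empty, so they could wait a long time under steady load.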
I was running 8 clients, which was probably causing it. Melee clients in particular cause a huge number of uploads for the amount of processing time required by the client.
I'll save my clients for when there are fewer others running =)
I've been experiencing constant "queue full" messages for the past 2 hours in MeleeRumble, with 3 melee clients + 3 rumble clients. Should this really be happening this often?
I noticed that every time the queue is paused for batch tasks, the flood of "queue full" messages doesn't stop until I pause the clients for a few minutes.
That may be because when the queue size is near the maximum, the capacity for handling high concurrency decreases dramatically, although the average processing power doesn't decrease at all.
Using a separate queue while it is paused may help, imo.
I can't think of anything that would cause that from the server side.
I think it is more a question of load. When the queue is full the clients stop running priority battles (since they don't get sent any) and as a result they run random battles, which on average are smaller, older and faster to run. This causes an increased load from the clients, since they are not only sending more battles to the server, but they are also random battles, so it is less likely for bots to be in cache. In addition it is slower to generate priority battles on random uploads. Thus we have the double effect of random battles causing the clients to speed up at the same time as the server slowing down.
Maybe it would help to generate double or triple the necessary number of priority battles while handling queued entries, and then pass those back to the clients even if the uploads are dropped. I know it is a bandaid, but the rumble only has a limited processing speed in the current design.
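Roughly what I have in mind, as a sketch. `PriorityPairingPool`, its methods and the `generate_pairing` callback are all made-up names standing in for whatever the server really does, and the sketch ignores the fact that banked pairings can go stale once the rankings move on.

```python
from collections import deque


class PriorityPairingPool:
    """Hypothetical bank of spare priority pairings for dropped uploads."""

    def __init__(self, oversample=3):
        self.oversample = oversample  # generate double or triple what is needed
        self.spare = deque()

    def on_processed(self, generate_pairing, needed=1):
        """While handling a queued upload: return `needed` pairings, bank the rest."""
        pairings = [generate_pairing() for _ in range(needed * self.oversample)]
        self.spare.extend(pairings[needed:])
        return pairings[:needed]

    def on_dropped(self, wanted=1):
        """When an upload is dropped: still hand back banked pairings, if any."""
        out = []
        while self.spare and len(out) < wanted:
            out.append(self.spare.popleft())
        return out
```

The point is only that a client whose upload was dropped still gets something useful to run instead of falling back to random battles. A real version would need some way to expire old pairings.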
That sounds right. I've noticed that some pairings take a significant amount of time to process, e.g. 200 ms to 500 ms. That delay may reduce the processing capacity considerably.
And the BATTLESPERBOT setting may be another cause of the large number of random battles. I've been setting it to ~5000, but I'm not sure whether all of my clients are updated.
Btw, it seems that RoboRumbleAtHome is very lazy about downloading new bots. I often have to restart the clients myself, or they will continue to run pointless random battles (which slows the server down significantly) or keep battling old versions for hours.