ThreadDeath problem
← Thread:Talk:RoboRumble/ThreadDeath problem/reply (14)
For your questions:
- I/O robots (bots that read/write from the data directory) get 240 skipped turns. Apparently a robot can get marked as an I/O robot just by calling
getDataDirectory()
. - This is where things stop making sense. The stacktrace shows
RobotPeer.checkSkippedTurn()
callingBasicRobotProxy.forceStopThread()
, but the code shows that it can't happen without going throughprintln("SYSTEM: ...")
, there literally is no other code path. Areprintln()
calls getting silently dropped? - The Robocode main thread can't "skip turns" because it controls the turn counter.
Relevant code:
public void checkSkippedTurn() {
// Store last and current execution time for detecting skipped turns
lastExecutionTime = currentExecutionTime;
currentExecutionTime = battle.getTime();
int numSkippedTurns = (currentExecutionTime - lastExecutionTime) - 1;
if (numSkippedTurns >= 1) {
events.get().clear(false);
if (isAlive()) {
for (int skippedTurn = lastExecutionTime + 1; skippedTurn < currentExecutionTime; skippedTurn++) {
addEvent(new SkippedTurnEvent(skippedTurn));
println("SYSTEM: " + getShortName() + " skipped turn " + skippedTurn);
}
}
if ((!isIORobot && (numSkippedTurns > MAX_SKIPPED_TURNS))
|| (isIORobot && (numSkippedTurns > MAX_SKIPPED_TURNS_WITH_IO))) {
println("SYSTEM: " + getShortName() + " has not performed any actions in a reasonable amount of time.");
println("SYSTEM: No score will be generated.");
setHalt(true);
waitWakeupNoWait();
punishBadBehavior(BadBehavior.SKIPPED_TOO_MANY_TURNS);
robotProxy.forceStopThread();
}
}
}
Here is my current best guess.
Note that Beaming's stacktrace includes:
... at net.sf.robocode.host.proxies.BasicRobotProxy.forceStopThread(BasicRobotProxy.java:44) at net.sf.robocode.battle.peer.RobotPeer.checkSkippedTurn(RobotPeer.java:649) ...
Therefore, RobotPeer.checkSkippedTurn()
called BasicRobotProxy.forceStopThread()
The source code of RobotPeer.checkSkippedTurn()
shows one and only one possible call to BasicRobotProxy.forceStopThread()
:
public void checkSkippedTurn() {
// ...
if (/* snip */) {
println("SYSTEM: " + getShortName() + " has not performed any actions in a reasonable amount of time.");
println("SYSTEM: No score will be generated.");
/* snip */
robotProxy.forceStopThread();
}
}
Therefore, control flow must have passed through this path.
Note that this code sequence includes a println("SYSTEM: <Bot> has not performed any actions...")
call, which must have been executed, yet no such SYSTEM message is present in Beaming's log.
The source code of the println()
method:
public void println(String s) {
synchronized (proxyText) {
battleText.append(s);
battleText.append("\n");
}
}
Note the following:
- Robocode is multithreaded. Heavy multithreading causes race conditions.
- The variable
battleText
is aStringBuilder
, which is not thread-safe (unlike aStringBuffer
), which is. - The code modifies
battleText
, but only synchronizes againstproxyText
.
Therefore, EvBotNG is skipping turns, but no log messages (from either Robocode or the bot) are printed, because they are silently dropped due to race conditions in the println()
method. The only thing you ultimately see is the ThreadDeath exception caused by Thread.stop()
forcibly terminating the thread.
Presumably, this occurs because of unlucky interaction between RobotPeer.println()
and RobotPeer.readOutText()
.
readOutText() | println() ============================================= | =============================== final String robotText = | battleText.toString() + proxyText.toString(); | | battleText.append(s); | battleText.append("\n"); battleText.setLength(0); | proxyText.setLength(0); | return robotText; |
Therefore, the solution is to add synchronized(battleText) {
guards around readOutText()
and println()
.
Or maybe I'm way off track.
I think you're on the right track, it definitely feels like race conditions to me. However in this case you're wrong - both code blocks (println and readOutText) are synchronized with the same object, so they can't run at the same time. It isn't important which one they are synchronized with, it could even be a different object. See the sharedLock example here.
I think a good place to start looking is in the 'fixes' to the skipped turn behavior, which were documented here, as well as a very busy discussion on the wiki.
Well, println()
is definitely broken somehow. There is no way that checkSkippedTurn()
could have called forceStopThread()
(as is shown in the stacktrace), without going through println()
first, yet the print message never made its way to the console (as is shown in Beaming's log).
battleText
is also cleared in initializeRound()
and cleanup()
, both of which have no synchronization guards at all.
I think skipped turns are working as they should, they're just not being reported.
Good catch. I've added synchronization guards on the initializeRound() and cleanup() and now I get the SYSTEM: Neuromancer 5.4 has not performed any actions in a reasonable amount of time.
printed inside the correct round. The ThreadDeath is still printed in the round after, but I can live with that I guess.
Now for the next question, why is this even happening?
You do not have permission to edit this page, for the following reasons:
You can view and copy the source of this page.
Return to Thread:Talk:RoboRumble/ThreadDeath problem/reply (24).
It seems that Skilgannon has filed a bug report on this issue, including the necessary patch.
To add to mysteries. Why I do not see dramatic drop in scores if I run Voidious' RoboRunner? I see missing SYSTEM messages in logs but it does not affect scores the same drastic way as with the literumble client. I also see no ThreadDeath messages. I am guessing RoboRunner catches those somehow and process the score differently.