Stuck in Loop Errors

A number of teams have experienced an error in which their robot stops (sometimes in the middle of a match) and displays “OpMode stuck in loop(); Restarting robot controller app.” The short answer: upgrade to version 2.62 of the FTC App SDK (here’s how). To understand what’s going on, we need to dive into the SDK a bit.

If you have version 2.62 or greater and you still see these errors, take a good look at your code. Is there a loop that takes too long to respond? Are you wait()-ing on something? These things could cause the watchdog timer to end the program.
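One way to avoid a loop that takes too long is to never pause inside it. The sketch below is plain Java, not FTC SDK code. The class NonBlockingWait and its methods are made up for illustration, and the two-second delay is an arbitrary assumption. It shows a loop-style method that checks elapsed time on each call and returns right away, instead of sleeping or waiting.

```java
// Hypothetical sketch in plain Java; none of these names come from the FTC SDK.
class NonBlockingWait {
    private static final long DELAY_NANOS = 2_000_000_000L; // assumed 2 second delay

    private boolean started = false;
    private boolean done = false;
    private long startNanos;

    /** Called over and over, like an iterative OpMode's loop(); returns quickly every time. */
    void loop(long nowNanos) {
        if (done) {
            return;
        }
        if (!started) {
            started = true;
            startNanos = nowNanos;      // first call: remember when the timed step began
        } else if (nowNanos - startNanos >= DELAY_NANOS) {
            done = true;                // enough time has passed: move on to the next step
        }
    }

    boolean isDone() {
        return done;
    }
}
```

In real use the caller would pass in System.nanoTime() on each call. The point is only that every call finishes quickly, so nothing in the method itself keeps the program from responding.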

Feed the Watchdog

The robot controller application implements something called a watchdog timer. It’s a safety feature: if a running program goes too long without responding, the watchdog will automatically end the program and display the “stuck in loop” error you might have seen. That is good when code really is stuck, and bad when the program was only waiting on something slow.

We can see how it works in Android Studio by going to the following (when we select “Android” in the Project sidebar): TeamCode/jniLibs/RobotCore-release-sources.jar/com.qualcomm.robotcore/eventloop/opmode/OpModeManagerImpl

Here’s the method that actually runs our init, loop, runOpMode, etc. methods:

The SDK calls detectStuck() and passes it a timeout (usually 5 seconds; 1 second for the stop() method), the name of the method running (for use when displaying an error), and an anonymous function that runs the method. An example call is below:

Obviously, when it works correctly, this is great for stopping infinite loops and other issues that might cause a robot to run out of control. When it works incorrectly, robots seem to stop for no reason.
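To make the idea concrete, here is a minimal sketch of a watchdog in plain Java. It is not the SDK's code. The class name WatchdogSketch is made up, and it uses a Future with a timeout, which may well differ from how the SDK does its monitoring. Only the method name detectStuck() and its three arguments (a timeout, a method name, and something that runs the method) come from the description above.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative sketch only: this is NOT the SDK's detectStuck() implementation.
class WatchdogSketch {
    /**
     * Runs the action on a worker thread and waits up to timeoutMs for it to finish.
     * Returns true if the action finished in time, false if the watchdog "bit".
     */
    static boolean detectStuck(long timeoutMs, String method, Runnable action) {
        // Daemon thread, so a truly stuck action cannot keep the JVM alive.
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "opmode " + method);
            t.setDaemon(true);
            return t;
        });
        Future<?> result = worker.submit(action);
        try {
            result.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // The action did not respond in time.
            System.out.println("OpMode stuck in " + method + "; ending program");
            return false;
        } catch (ExecutionException e) {
            // The action threw an exception, but it did finish in time.
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            worker.shutdownNow(); // interrupts the action if it is still running
        }
    }
}
```

A call might look like WatchdogSketch.detectStuck(5000, "loop()", () -> opMode.loop()), where opMode is a hypothetical object standing in for the running OpMode.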

USB Communication Timeouts

Diagnosing why OpModes are timing out “for no reason” is tricky. One way to go about it is looking at the robot controller’s logs. Here’s an excerpt (with some intermediate lines removed) from one team that experienced the error during several matches:

There’s a lot to unpack here. When the watchdog timer decides to end the program, it logs an error and the current state of each thread.

Thread, the short version: the robot controller needs to do a lot (get information from the driver station, talk with hardware, etc.) and sometimes one task will take a while, so it splits these tasks into separate mini-programs that can sit in the background when they aren’t doing anything.

In this excerpt I’ve only included information about the thread titled “opmode loop()” which is where the team’s code is running. What you see is a stack trace.

Stack trace, the short version: imagine MyOpMode is running its loop() method and it calls servo.setPosition(0.5). Somewhere in the phone’s memory there is a block that says “here’s information related to the current running MyOpMode.loop()”. Right above it, there’s another that says “here’s information about the servo.setPosition() it called”. As different methods call one another, this “stack” of memory builds. A stack trace displays the list of items in the stack.
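If you want to see a stack build up for yourself, the small plain-Java sketch below mirrors the example above. MyOpMode and Servo here are stand-in classes written for this illustration, not the FTC SDK's classes. The servo's setPosition() takes a snapshot of the current thread's stack, and main() prints it.

```java
// Stand-in classes named after the example in the text; not FTC SDK code.
class StackTraceSketch {
    static StackTraceElement[] captured;

    static class Servo {
        void setPosition(double position) {
            // Snapshot the call stack of the current thread at this point.
            captured = Thread.currentThread().getStackTrace();
        }
    }

    static class MyOpMode {
        final Servo servo = new Servo();

        void loop() {
            servo.setPosition(0.5);
        }
    }

    public static void main(String[] args) {
        new MyOpMode().loop();
        // Most recent call first, just like the stack traces in the logs.
        for (StackTraceElement frame : captured) {
            System.out.println("  at " + frame.getClassName() + "." + frame.getMethodName());
        }
    }
}
```

In the printed list, the setPosition() frame appears above the loop() frame that called it, which in turn appears above main().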

At the top of the stack, we see java.lang.Object.wait(). This is suspicious, isn’t it? We know the watchdog timer ended the program because it was taking too long to respond, and right there at the top of the stack trace is a wait() call. But why was it waiting? If we look down a little further, we see a call to com...ModernRoboticsUsbController.waitForCallback(). At the time the watchdog timer killed the program, our OpMode was waiting for one of the USB devices to do something. Looking down just a little further, we can see what: com...ServoImpl.setPosition(). The team called setPosition() on one of their servos, and (it appears) the USB servo controller took so long to respond that the watchdog bit.
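The same pattern is easy to reproduce in plain Java. In the sketch below, the method names setPosition() and waitForCallback() are borrowed from the trace for readability, but the class BlockedThreadSketch and everything inside it are made up for illustration and have nothing to do with the SDK's real implementation. One thread blocks in wait() for a callback that never comes, and a second thread grabs the first thread's stack trace, much as the watchdog logs the state of each thread.

```java
// Hypothetical sketch: these methods only imitate the shape of the trace above.
class BlockedThreadSketch {
    private static final Object lock = new Object();

    // Stands in for a controller waiting on a response that never arrives.
    static void waitForCallback() throws InterruptedException {
        synchronized (lock) {
            lock.wait();
        }
    }

    static void setPosition(double position) throws InterruptedException {
        waitForCallback();
    }

    /** Starts a thread that blocks in wait(), then returns that thread's stack trace. */
    static StackTraceElement[] dumpBlockedThread() throws InterruptedException {
        Thread opMode = new Thread(() -> {
            try {
                setPosition(0.5);
            } catch (InterruptedException e) {
                // Ended from outside, as the watchdog would do.
            }
        }, "opmode loop()");
        opMode.setDaemon(true);
        opMode.start();

        // Give the thread time to reach wait() before looking at its stack.
        while (opMode.isAlive() && opMode.getState() != Thread.State.WAITING) {
            Thread.sleep(10);
        }
        StackTraceElement[] frames = opMode.getStackTrace();
        opMode.interrupt(); // let the blocked thread finish
        return frames;
    }
}
```

Printing the returned frames shows a java.lang.Object frame at the top, with waitForCallback() beneath it and setPosition() beneath that, which is the same order we read in the log excerpt.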

An astute reader may have jumped to a similar conclusion just by reading the first line of the excerpt. The warning (could not read Modern Robotics USB Core Device Interface Module [serial number]: comm timeout) suggests that such a delay in communication did happen. (It makes sense that the team might use the PWM ports on their Core Device Interface Module to drive the servo instead of a servo controller.)

Fixing the Issue

The release notes for version 2.62 of the FTC App SDK include the following:

Changes to enhance Modern Robotics USB protocol robustness.

Indeed, the development team worked with Modern Robotics to resolve issues with communication between the robot controller and various USB devices. The result is fewer comm timeouts, and therefore fewer reasons for the watchdog timer to act.

AJ Foster

AJ is a Field Technical Advisor in Orlando, FL. He enjoys teaching concepts related to the FIRST Tech Challenge, helping teams at competitions, and making the things he learns accessible to everyone.