TensorFlow for CENTERSTAGE presented by RTX

What is TensorFlow?

FIRST Tech Challenge teams can use TensorFlow Lite, a lightweight version of Google's TensorFlow machine learning technology that is designed to run on mobile devices such as an Android smartphone or the REV Control Hub. A trained TensorFlow model was developed to recognize the white Pixel game piece used in the 2023-2024 CENTERSTAGE presented by RTX challenge.

TensorFlow Object Detection (TFOD) has been integrated into the control system software to identify a white Pixel during a match. The SDK (version 9.0) contains TFOD Sample OpModes and Detection Models that can recognize the white Pixel at various poses (but not all).

Also, FIRST Tech Challenge teams can use the Machine Learning Toolchain to train their own TFOD models. This allows teams to recognize custom objects they place on Spike Marks in place of white Pixels prior to the start of the match (also known as Team Game Elements). This training should take into account conditions such as distance from camera to target, angle, lighting, and especially backgrounds. Teams can receive technical support for the Machine Learning Toolchain through the Machine Learning Forum.

How Might a Team Use TensorFlow this season?

For this season's challenge the field is randomized during the Pre-Match stage. This randomization causes the white Pixel to be placed on either the Left, Center, or Right Spike Mark. During Autonomous, robots must independently determine which of the three Spike Marks (Left, Center, Right) the white Pixel was placed on. To do this, robots using a webcam or a camera on a Robot Controller smartphone can inspect the Spike Mark locations to determine whether a white Pixel is present. Once the robot has correctly identified which Spike Mark the white Pixel is on, it can perform additional actions based on that position that will yield additional points.

Teams also have the opportunity to replace the white Pixel with an object of their own creation, within a few guidelines specified in the Game Manual. This object, or Team Game Element, can be optimized to help the team identify it more easily, and custom TensorFlow inference models can be created to facilitate recognition. When the field is randomized, the team's Team Game Element is placed on the Spike Marks just as the white Pixel would have been, and the team must identify and use the Team Game Element the same as if it were a white Pixel on a Spike Mark.

Sample OpModes

Teams have the option of using a custom inference model with the FIRST Tech Challenge software or using the game-specific default model provided. As noted above, the FIRST Machine Learning Toolchain is a streamlined tool for training your own TFOD models.

The FIRST Tech Challenge software (Robot Controller App and Android Studio Project) includes sample OpModes (Blocks and Java versions) that demonstrate how to use the default inference model. These tutorials show how to use the sample OpModes, drawing examples from previous FIRST Tech Challenge seasons, but the process they demonstrate applies to any season.

  • Blocks Sample OpMode for TensorFlow Object Detection

  • Java Sample OpMode for TFOD

Using the sample OpModes, teams can practice identifying white Pixels placed on Spike Marks. The sample OpMode ConceptTensorFlowObjectDetectionEasy is a very basic OpMode, simplified so that beginner teams can perform basic Pixel detection.
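For reference, here is a minimal Java sketch in the spirit of the Easy sample, using the TfodProcessor and VisionPortal classes from SDK 9.0. The webcam configuration name "Webcam 1" and the OpMode name are assumptions; adjust them to match your robot configuration.

    import com.qualcomm.robotcore.eventloop.opmode.LinearOpMode;
    import com.qualcomm.robotcore.eventloop.opmode.TeleOp;
    import org.firstinspires.ftc.robotcore.external.hardware.camera.WebcamName;
    import org.firstinspires.ftc.robotcore.external.tfod.Recognition;
    import org.firstinspires.ftc.vision.VisionPortal;
    import org.firstinspires.ftc.vision.tfod.TfodProcessor;

    @TeleOp(name = "PixelDetectionSketch")
    public class PixelDetectionSketch extends LinearOpMode {
        @Override
        public void runOpMode() {
            // Build a TFOD processor using the default (CENTERSTAGE) model.
            TfodProcessor tfod = TfodProcessor.easyCreateWithDefaults();

            // Attach the processor to a webcam; "Webcam 1" must match the
            // name in your robot configuration (an assumption here).
            VisionPortal portal = VisionPortal.easyCreateWithDefaults(
                    hardwareMap.get(WebcamName.class, "Webcam 1"), tfod);

            waitForStart();
            while (opModeIsActive()) {
                // Report each recognition's label, confidence, and center point.
                for (Recognition r : tfod.getRecognitions()) {
                    double x = (r.getLeft() + r.getRight()) / 2;
                    double y = (r.getTop() + r.getBottom()) / 2;
                    telemetry.addData(r.getLabel(), "%.0f%% conf at (%.0f, %.0f)",
                            r.getConfidence() * 100, x, y);
                }
                telemetry.update();
            }
            portal.close();
        }
    }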

Note that if a detection falls below the minimum confidence threshold, it will not be shown, so it is important to set the minimum confidence threshold appropriately.

Note

The default minimum confidence threshold in the Sample OpMode (75%) is only provided as an example; depending on local conditions (lighting, image wear, etc.) it may be necessary to lower the minimum confidence to increase TensorFlow's likelihood of reporting all possible detections. However, due to its simplified nature, the Easy OpMode does not allow changing the minimum confidence. Instead, you will have to use the normal OpMode.
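In the normal (non-Easy) Java OpMode, the threshold can be adjusted on the processor itself. A one-line sketch, assuming a TfodProcessor named tfod has already been created as in the earlier sketch; the 0.60 value is an illustrative assumption, not a recommendation:

    // Detections scoring below this threshold are discarded and never reported.
    // 0.60f is an example value; tune it for your lighting and camera setup.
    tfod.setMinResultConfidence(0.60f);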

Notes on Training the CENTERSTAGE Model

The Pixel game piece posed an interesting challenge for TensorFlow Object Detection (TFOD). As the Machine Learning Toolkit documentation warns, TFOD is not very good at recognizing and differentiating simple geometric shapes, nor at distinguishing between specific colors; instead, TFOD is good at detecting patterns. TFOD needs to be able to recognize a unique pattern, and while there is a small amount of patterning in the ribbing of the Pixel, in various lighting conditions it is dubious how much of the ribbing will be visible. Even in the image at the top of this document, the ribbing can only be seen because of the specific shadows cast on the game piece. Even in optimal testing environments, it was difficult to capture video that highlighted the ribbing well enough for TensorFlow to use it for pattern recognition. This underscored the inability to guarantee optimal Pixel characteristics for TFOD in unknown lighting environments.

Another challenge with training the model had to do with how the Pixel looks at different pose angles. When the camera is merely a scant few inches from the floor, the Pixel can almost look like a solid object; at times there may be sufficient shadows to see that there is a hole in the center of the object, but not always. However, if the camera was several inches off the floor, the Pixel looked different, as the mat or colored tape could be seen through the hole in the middle of the object. This confused the neural network and made it extremely difficult to train, and the resulting models eventually recognized any "sufficiently light colored blob" as a Pixel. This was not exactly ideal.

Even with the best of images, the machine learning algorithms had a difficult time determining what was a Pixel and what wasn't. What ended up working was providing NOT ONLY images of the Pixel in different poses, but also several white objects that WERE NOT a Pixel. This was fundamental to helping TensorFlow train itself to understand that "All Pixels are White Objects, but not all White Objects are Pixels."

To provide some additional context, here are a few examples of labeled frames that illustrate the challenges and techniques in dealing with the Pixel game piece.

  • (Rejected) Training Frame 2: Camera Too Low (White Blob)

  • Training Frame 3: Actual Good Image with Ribbing (Rare)

  • Training Frame 4: Pixel with non-Pixel Objects

Using the Default CENTERSTAGE Model

The previous section describes how the height of the camera above the floor has a huge effect on how the Pixel is seen; too low and the object can look like a single "blob" of color, too high and the object will look similar to a white donut. When training the model, it was decided that the donut approach was best: train the model to recognize the Pixel from above, providing a clear and consistent view of the Pixel. Toss in some angled shots as well, along with some extra objects just to give TensorFlow some perspective, and a model is born. But wait, how does that affect detection of the Pixel from the robot's starting configuration?

In CENTERSTAGE, using the default CENTERSTAGE model, it is unlikely that a robot will be able to get a consistent detection of a white Pixel from the starting location. To get a good detection, the robot's camera needs to be placed fairly high up and angled down so it can see the gray tile, blue tape, or red tape peeking out of the center of the Pixel. Thanks to the center structure on the field this season, it's doubtful that a team will want an exceptionally tall robot; likely no more than 14 inches tall, and most will want to be under 12 inches to be safe (depending on your strategy; please don't let this article define your game strategy!). At those heights, the angle between the robot's camera and a Pixel in the starting configuration makes a reliable detection unlikely.

Here are several images of detected and non-detected Pixels. Notice that the camera must be able to see through the center of the object to what's underneath the Pixel in order for the object to be detected as a Pixel.

  • Non-Detected Pixel #1: Pixel Not Detected, Angle Too Low

  • Non-Detected Pixel #2: Pixel Not Detected, Angle Too Low

  • Detected Pixel #1: Pixel Detected, Min Angle

  • Detected Pixel #2: Pixel Detected, Better Angle

  • Detected Pixel #3: Pixel Detected, Min Angle on Tape

  • Detected Pixel #4: Pixel Detected, Top-Down View

Therefore, there are two options for detecting the Pixel:

  1. The camera can be on a retractable/moving system, so that it is elevated to a desirable height at the start of Autonomous and then retracts before the robot moves around.

  2. The robot will have to drive closer to the Spike Marks in order to be able to properly detect the Pixels.

For the second option (driving closer), the camera's field of view might pose a challenge if it's desirable to keep all three Spike Marks in view at all times. If using a Logitech C270, switching to a Logitech C920 with its wider field of view might help to some degree. This completely depends on the height of the camera and how far the robot must drive in order to properly recognize a Pixel. Teams can also simply choose to point their webcam at the CENTER and LEFT Spike Marks, for example, and drive closer to those targets; if a Pixel is not detected on either, then by process of elimination it must be on the RIGHT Spike Mark.
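To illustrate that elimination strategy, here is a hedged Java sketch. The SpikeMark enum, the helper name, and the 320-pixel split point (half of an assumed 640-pixel-wide image) are all illustrative assumptions, not SDK values; Recognition is the same class used in the earlier sketch.

    enum SpikeMark { LEFT, CENTER, RIGHT }

    // Camera aimed at the LEFT and CENTER Spike Marks only. If a Pixel is
    // seen, use its horizontal position to pick LEFT vs CENTER; if nothing
    // is seen, it must be on the RIGHT Spike Mark by elimination.
    SpikeMark findSpikeMark(java.util.List<Recognition> recognitions) {
        for (Recognition r : recognitions) {
            double centerX = (r.getLeft() + r.getRight()) / 2;
            // 320 assumes a 640-pixel-wide image; tune this on your robot.
            return (centerX < 320) ? SpikeMark.LEFT : SpikeMark.CENTER;
        }
        return SpikeMark.RIGHT;
    }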

Selecting objects for the Team Prop

Selecting objects to use for your custom Team Prop can seem daunting. Questions swirl, like "What shapes are going to be recognized best?", "If I cannot have multiple colors, how do I make patterns?", and "How do I make this easier on myself?". Hopefully this section will help you understand a little more about TensorFlow and how to get the most out of it.

First, it’s important to note that TensorFlow has the following quirks/behaviors:

  • In order to run TensorFlow on mobile phones, FIRST Tech Challenge uses a very small core model resolution. The image is downscaled from the high-definition webcam image to one that is only 300x300 pixels, so medium and small objects within the webcam image may be reduced to very small, indistinguishable clusters of pixels in the target image (a worked example follows this list). Keep the objects in the view of the camera large, and train for a wide range of image sizes.

  • TensorFlow is not really good at differentiating simple geometric shapes. TensorFlow Object Detection is an object classifier, and similar geometric shapes will classify similarly. At present, humans are much better at differentiating geometric shapes than neural-net algorithms like TensorFlow.

  • TensorFlow is great at pattern detection, but that means that within the footprint of the object you need one or more repeating or unique patterns. The larger the pattern, the easier it will be for TensorFlow to detect the pattern at a distance.
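To make the resolution point concrete: if a 640x480 webcam frame is downscaled to 300x300, an object spanning 100 pixels across the original frame shrinks to roughly 100 × (300 / 640) ≈ 47 pixels, and any pattern detail within it shrinks proportionally, so a feature only a dozen pixels wide in the original frame may disappear entirely.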

So what kinds of patterns are good for TensorFlow? Let’s explore a few examples:

  1. Consider the shape of a chess board Rook. The Rook itself is mostly uniform; no matter how you rotate the object, it looks more or less the same. Not much patterning there. However, the top of the Rook is very unique and patterned. Exaggerating the "battlements", the square-shaped parts of the top of the Rook, can provide unique patterning that TensorFlow can distinguish.

  2. Consider the outline of a chess Knight, as the "head" of the Knight faces to the right or to the left. That profile is very distinguishable as the head of a horse. That specific animal is one that model zoos have been optimized for, so it's definitely a shape that TensorFlow can be trained to recognize.

  3. Consider the patterning in a fancy wrought-iron fence. If made thick enough, those repeating patterns can be recognized by a TensorFlow model. Like the chess board Rook, it might be wise to make the object round so that the pattern is similar and repeats no matter how the object is rotated. If allowed, having multiple shades of color can also help make a more unique patterning on the object (e.g. multiple shades of red; teams should likely consult the Game Manual or official Q&A on what is allowed).

  4. TensorFlow can be used to detect plants even when all of the plants are a single color. Similar techniques can be reverse-engineered (make objects with different "patterns", similar to plants) to create an object that can be detected and differentiated from other objects on the game field.

Hopefully this gives you quite a few ideas for how to approach this challenge!

Using Custom TensorFlow models in Blocks and Java

Instructions on using custom TensorFlow models in Blocks, OnBot Java, and Android Studio can be found in the FTC-ML documentation, in the Implementing in Robot Code section.
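For orientation, here is a minimal Java sketch of loading a custom model through the TfodProcessor builder. The file path and the "TeamProp" label are placeholders for whatever your own Machine Learning Toolchain training run produced; consult the FTC-ML documentation for the authoritative steps.

    import org.firstinspires.ftc.vision.tfod.TfodProcessor;

    // "model.tflite" and "TeamProp" are placeholders; substitute the file
    // and labels exported from your own ftc-ml training run.
    TfodProcessor tfod = new TfodProcessor.Builder()
            .setModelFileName("/sdcard/FIRST/tflitemodels/model.tflite")
            .setModelLabels(new String[] {"TeamProp"})
            .build();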
