Detections on the PYNQ-Z2
As mentioned earlier, to evaluate the YOLO model we will need a set of detections and a set of ground truths. In practice, this means one or more files that contain information about the objects in each image.
Before anything else, we will need a large set of images so that the evaluation covers more scenarios and is statistically more meaningful. As we did for the quantization, we will resort to the COCO dataset for the validation images and annotations. The COCO validation set contains about 5 thousand images, and the annotations come as ".json" files in the same archive as the training annotations. These annotations represent the ground truth.
You can download the COCO validation dataset on the Downloads page of the official website. Look for the "2017 val images" link and be ready to have 1GB of free space for the download. You can also use these links to download the files directly:
COCO validation images: http://images.cocodataset.org/zips/val2017.zip
COCO annotations: http://images.cocodataset.org/annotations/annotations_trainval2017.zip
So, in this phase, the objective is to obtain detections from the YOLO model on the PYNQ-Z2 for the 5k images of the COCO validation dataset and pair them with the ground-truth annotations for the same dataset. In the end we will feed the detection annotations and the ground-truth annotations to a program that computes the metric results for the YOLO model. The following scheme describes this development flow:
To obtain the detections for the YOLOv3 model on the PYNQ-Z2 we only need to modify the C++ program so that, instead of drawing the bounding boxes, class and probability, it writes them to a file associated with each image. In other words, we need to modify the program so that it walks through the 5 thousand images one by one and creates, for each, a text file with annotations for the bounding box coordinates, the class and the confidence score. The text file should have the same name as the image so that it is easier for the evaluation program to match them. The next scheme describes the process:
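The program itself is written in C++, but the per-image logic is simple enough to sketch in Python. The snippet below only illustrates the idea (the function and folder names are hypothetical, and the exact field order is the one defined further down):

```python
import os

def write_detections(image_path, detections, labels_dir="labels"):
    """Write one text file per image, with one line per detected object.

    `detections` is assumed to be a list of tuples:
    (class_name, confidence, left, top, right, bottom), with the box in pixels.
    """
    # e.g. "images/000000000139.jpg" -> "labels/000000000139.txt"
    name = os.path.splitext(os.path.basename(image_path))[0]
    out_path = os.path.join(labels_dir, name + ".txt")
    with open(out_path, "w") as f:
        for class_name, confidence, left, top, right, bottom in detections:
            f.write(f"{class_name} {confidence:.6f} {left} {top} {right} {bottom}\n")
```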
Translating this process into code can be a little tricky. To start, there are some challenges regarding opening the images, keeping their order and creating the text files. We also need to take into account the possibility of a failure halfway through the process and the format of the annotations.
Before anything else, we want to settle the format of the text files. We need to define this beforehand so that we can build the program around that choice. We could use the ".json" format as in the COCO dataset, but it turned out to be more complicated to work with than I anticipated. There is also the YOLO format, which fits a text file, but the metrics program we will be using later doesn't work with that format either. To comply with the specifications of the metrics program, we will be using the following annotation format (per object):
Note that these fields only accept absolute values (whole pixel numbers, no decimals). Only the probability can be expressed as a relative value between 0 and 1.
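As an example, a hypothetical 000000000139.txt describing two detected objects could look like the lines below. The class, confidence, left, top, right, bottom ordering shown here is the one used by common detection-metrics tools; confirm the exact order against the metrics program's documentation, and note that the values are made up:

```
person 0.87 12 45 210 380
dog 0.54 150 200 400 330
```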
I developed the following flowchart to represent the algorithm behind the program responsible for generating the detection files for the validation images on the PYNQ-Z2:
You can see that there are two loops: one to process batches, in other words sets of images, and another to process each image within a batch. When all the batches are processed the program stops. The number of images per batch is hard-coded but should be defined by the user, and it depends on how the DPU handles the process. If a batch is too big, the probability of the DPU failing is greater and a failure becomes more problematic.
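On the board this flow is implemented in the C++ program, but a short Python-style sketch of the two loops may make it clearer. It reuses the hypothetical write_detections helper from above, run_dpu_inference stands in for whatever call actually drives the DPU, and BATCH_SIZE and START_BATCH represent the hard-coded values mentioned in the text:

```python
import os

BATCH_SIZE = 50       # images per batch (hard-coded in the real program)
START_BATCH = 0       # edit this value to resume after an interruption

image_names = sorted(os.listdir("images"))
num_batches = (len(image_names) + BATCH_SIZE - 1) // BATCH_SIZE

for batch in range(START_BATCH, num_batches):              # outer loop: batches
    start = batch * BATCH_SIZE
    for name in image_names[start:start + BATCH_SIZE]:     # inner loop: images
        image_path = os.path.join("images", name)
        detections = run_dpu_inference(image_path)          # hypothetical DPU call
        write_detections(image_path, detections)
    print(f"finished batch {batch}")
```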
If the process stops halfway, the user has to change the starting batch to the batch number at which the interruption happened. Unfortunately, I found my program crashing far too many times for reasons I couldn't understand, and manually changing the batch number can be time consuming, considering the number of images. For that reason, I created a Python script that monitors the terminal output. When the script detects a DPU timeout, it extracts the batch at which the program stopped, updates the program to start from that batch, recompiles it and starts the process again. The following diagram explains it better:
The next image shows more practically what will happen on the terminal when a DPU timeout occurs. The Python script is called monitor.py and is located in the yolo_pynqz2_data folder.
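The real monitor.py ships in the yolo_pynqz2_data folder; the sketch below only illustrates the idea behind it. The timeout string, the source file name, the progress message and the build command are all assumptions here, not the actual values:

```python
import re
import subprocess

SOURCE = "main.cc"       # assumed name of the C++ source holding the batch constant
EXECUTABLE = "./yolo"    # assumed name of the compiled detection program

def set_start_batch(batch):
    """Rewrite the hard-coded starting batch inside the C++ source."""
    with open(SOURCE) as f:
        code = f.read()
    code = re.sub(r"START_BATCH\s*=\s*\d+", f"START_BATCH = {batch}", code)
    with open(SOURCE, "w") as f:
        f.write(code)

while True:
    proc = subprocess.Popen([EXECUTABLE], stdout=subprocess.PIPE, text=True)
    current_batch = 0
    timed_out = False
    for line in proc.stdout:
        print(line, end="")
        m = re.search(r"batch (\d+)", line)     # assumed progress message
        if m:
            current_batch = int(m.group(1))
        if "DPU timeout" in line:               # assumed error string
            timed_out = True
            proc.kill()
            break
    if not timed_out:
        break                                   # clean exit: every batch was processed
    set_start_batch(current_batch)              # resume from the batch that failed
    subprocess.run(["make"], check=True)        # recompile with the new starting batch
```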
With all that said, you should add a folder called "images" with the 5000 COCO validation images and create an empty folder named "labels" or "labels_yolo" inside the yolo_pynqz2_data folder. It should look like this in the end:
Then, you will need to copy this folder to the PYNQ-Z2 environment; simply drag it into the left panel of MobaXterm. I recommend you use the monitor.py script rather than running the yolo executable directly for YOLOv3, as it is prone to DPU timeouts.
For both YOLOv3 and Tiny YOLO the program is the same, but with the second the probability of failure is much smaller. The monitoring script is dedicated to the YOLO program.
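Assuming those folder names, the layout would be roughly as follows (the remaining contents of yolo_pynqz2_data depend on your setup):

```
yolo_pynqz2_data/
├── images/        # the 5000 COCO val2017 images
├── labels/        # empty for now; one .txt per image will be written here
├── monitor.py
└── ...            # the yolo executable, sources, etc.
```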
At the end of the process you will obtain a folder with 5 thousand text files, each with the same name as the image it refers to. Inside each file there will be annotations for every object, in the format we discussed earlier.
I recommend you rename the labels folder to labels_yolo and copy it to a USB flash drive. We will need those annotations on the Ubuntu OS, but you can do the next part on Windows if you have free space.
These annotations will be essential to feed the program that will process them and compare them with the ground-truth annotations for the same set of images. I should mention that we could run inference on a smaller set of validation images (1000, for example), but the results would be statistically less representative. The monitoring script ensures that the process restarts automatically in case of failure, so it is just a question of time before we have all the detections.