Execute YOLOv3
It's time to put everything together and execute the YOLOv3 neural network to perform inference on a desired image.
Previously, we created a package containing the DPU elf file and the other necessary files. We will now copy that package to the board's operating system using the Xterm interface. The easiest way to do it is to simply drag the "yolo_pynqz2" folder to the working area on the left side of the application.
The folder organization is shown in the next image. There are several folders separating the different types of files. For the deployment of YOLOv3 on the PYNQ-Z2 we will focus on the programs folder and on the Makefile.
The info folder contains informational files that won't be used for anything. The model folder stores the binary file with the YOLO instructions - the .elf file. The C++ programs that interact with the DPU are the main focus of this chapter, as they contain every instruction needed to run inference on the DPU; those files are stored in the programs folder. Then there is a test image called dog.jpg and a Makefile. The Makefile is responsible for compiling the code, including the libraries, and checking for errors. Finally, the objects folder stores the object files resulting from the compilation process. The executable file is generated in the base location.
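Based on the description above, the layout looks roughly like this (a sketch reconstructed from the text; exact names may differ slightly on your setup):

```
yolo_pynqz2/
├── Makefile        # compiles the code and links the libraries
├── dog.jpg         # test image
├── info/           # informational files, not used by the build
├── model/          # the compiled DPU model (.elf file)
├── programs/       # C++ sources (yolo_image.cpp, yolo_video.cpp)
└── objects/        # object files produced by the compilation
```

After compiling, the executable appears in the base location, next to the Makefile.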
There are two C++ programs in the programs folder. One performs detection on a single given image, and the other is a real-time YOLO object detector using a USB camera. We won't focus on the video detection in this project, but I will show you how to enable it soon.
Once again, the scripts are largely based on Wu-Tianze's work, with only slight modifications on my part. I won't go into much detail here because I think it is better if you experiment with the script yourself and modify certain parameters or play with the visualization. The only thing to note is that you don't need to focus much on the first functions, which all handle pre-processing. The most important functions are main, detect, and deal.
Analyzing the main function in yolo_image.cpp, you can see that after invoking and executing a DPU task for this specific neural network, some image-processing functions are called. Essentially, the set_input_image function is responsible for converting and pre-processing the image so that it meets the model's requirements. The deal function draws bounding boxes around the objects detected in the image and returns the edited image in the same variable. It only knows where to draw the bounding boxes because it calls the detect function to obtain the coordinates and the object class. After the processing phase, the image is shown in a window using the OpenCV library. The development flow can be described by the following diagram:
From the original program, the modifications consisted of changing the bounding-box thickness, adding a vector containing all 80 possible classes so that the class number can be mapped to a string name and the object class displayed on the image, and, lastly, adding some information to the Terminal output, such as the execution time of the different stages and the classes identified in the image.
If we ignore the outrageous number of timers along the way, the order of events should be like this:
There is a Makefile included in the folder. This file is responsible for compiling the code and linking the necessary libraries, such as OpenCV and the DNNDK API, and it will report any errors in the code.
Let's take a brief look at the Makefile content:
Honestly, I don't understand most of it myself. Go ask ChatGPT if you need details!
The only important things to know: on the first line you can define the name of the executable generated by the compilation process, and in the list of source files you have to indicate which source file to compile. So if you want to compile yolo_video, you have to write it on that line.
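A minimal sketch of what such a Makefile might look like; the variable names, flags, and library names here are illustrative assumptions (in particular, `-ln2cube` is the DNNDK runtime library as I understand it), not the exact content of the provided Makefile:

```makefile
# Name of the executable produced by the build
TARGET   = yolo_image

# Source file(s) to compile -- swap in programs/yolo_video.cpp for video mode
SRCS     = programs/yolo_image.cpp

OBJS     = $(patsubst programs/%.cpp,objects/%.o,$(SRCS))
CXX      = g++
CXXFLAGS = -O2 -Wall $(shell pkg-config --cflags opencv)
# Link OpenCV and the DNNDK runtime (libn2cube provides the dpu* API)
LDLIBS   = $(shell pkg-config --libs opencv) -ln2cube -lpthread

$(TARGET): $(OBJS)
	$(CXX) -o $@ $^ $(LDLIBS)

objects/%.o: programs/%.cpp
	$(CXX) $(CXXFLAGS) -c $< -o $@

clean:
	rm -f $(OBJS) $(TARGET)
```

Running `make` in the base folder then produces the `yolo_image` executable next to the Makefile.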
This is it. The moment you've been waiting for! You want to see the YOLO algorithm detecting objects in an image. In the folder there is a test image called "dog.jpg", which can be the first test subject; later you can add any image you want and perform detection on it.
First we have to compile the yolo_image program. The Makefile is already set up to do that, so you just have to execute make. The object files will be placed in the objects directory and an executable named "yolo_image" will be generated. Then you just run the executable, indicating which image you want to perform detection on.
This should result in the following window:
There will be an error about an accessibility bus address, which you can ignore. As far as I know, it does not affect the performance of the model in any way.
The inference time will be about 0.45 seconds, which is really fast. The whole process will take approximately 0.8 seconds, as the C++ functions consume some precious time. Still, the result arrives in less than a second, and that is a positive result. In the future, those functions executed in software might also be accelerated by the FPGA!
The next image displays the timing results for the YOLO and the Tiny YOLO models (which you can also try). Tiny YOLO is more than 10 times faster at inference, but its accuracy is lower.
As you can see, the model is fairly accurate. But how accurate, you might ask? Well, I measured a mAP of 0.4036. This value is not bad for a compressed network, and in the next chapter you will see how to obtain performance metrics like this one for your own model.