Final results

Performance


As we saw earlier when deploying the YOLOv3 model on the PYNQ-Z2, inference is quite a fast process. It takes less than a second to detect the objects and even draw the bounding boxes on the image. This all looks very promising, but how does it compare to the performance of a normal computer?

First, we should use the same image in all the tests. The ideal choice is the famous image with the dog, the bicycle and the pick-up truck. Using a single image makes sure there are no disparities when it comes to processing: it wouldn't be fair if the post-processing had more pixels to handle in one case than in the other.

To make the comparison, I ran the YOLOv3 model on my computer using Darknet. You can find the details on the YOLO website but, in short, you just have to download Darknet and the pre-trained YOLOv3 weights. Then you run the detection command on the image; the result is the image with the detections drawn on it, plus the time it took to process the image.

# Download and compile Darknet
git clone https://github.com/pjreddie/darknet
cd darknet
make

Then we download the weights file for YOLOv3 and run the inference:

wget https://pjreddie.com/media/files/yolov3.weights
./darknet detect cfg/yolov3.cfg yolov3.weights data/dog.jpg

This was my first result:

My computer doesn't have a graphics card, so all the processing is a job for the AMD Ryzen 5 3500U CPU. With a GPU the results would be very different, but the comparison would be unfair.

As you can see, the computer took on average 19 seconds to predict the objects in the test image! That is 23 times slower than the PYNQ-Z2, which makes the board's result a very good one.
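To make the "on average" part reproducible, the detection command can be run several times and the reported times averaged. The following is only a minimal sketch: it assumes Darknet was built in a `darknet` directory as shown above, and that it prints a line of the form `data/dog.jpg: Predicted in <t> seconds.`. The helper names (`parse_predicted_seconds`, `time_darknet`, `slowdown`) are made up for this illustration and are not part of Darknet.

```python
import re
import subprocess
from statistics import mean

# Assumed format of Darknet's timing line: "...: Predicted in <t> seconds."
PREDICTED_RE = re.compile(r"Predicted in ([0-9.]+) seconds")


def parse_predicted_seconds(output: str) -> float:
    """Extract the inference time (in seconds) from Darknet's console output."""
    match = PREDICTED_RE.search(output)
    if match is None:
        raise ValueError("no 'Predicted in ... seconds' line found")
    return float(match.group(1))


def time_darknet(runs: int = 5, darknet_dir: str = "darknet") -> float:
    """Run the detection command `runs` times and return the mean time.

    `darknet_dir` is an assumption: the folder created by `git clone`,
    containing the compiled binary and the downloaded weights.
    """
    cmd = ["./darknet", "detect", "cfg/yolov3.cfg",
           "yolov3.weights", "data/dog.jpg"]
    times = []
    for _ in range(runs):
        result = subprocess.run(cmd, cwd=darknet_dir, capture_output=True,
                                text=True, check=True)
        # Search both streams so it does not matter where the line is printed.
        times.append(parse_predicted_seconds(result.stdout + result.stderr))
    return mean(times)


def slowdown(cpu_seconds: float, board_seconds: float) -> float:
    """How many times slower the CPU is compared to the board."""
    return cpu_seconds / board_seconds
```

Calling `time_darknet()` returns the mean CPU time, which can then be passed to `slowdown()` together with the time measured on the board. Going the other way with the figures quoted above, a 19-second average and a factor of 23 imply roughly 19 / 23 ≈ 0.83 s per image on the PYNQ-Z2, which agrees with the "less than a second" observed earlier.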

I should note that the YOLOv3 network used on the CPU is more complex, since it wasn't compressed. Maybe with the quantized network the CPU results would be a lot better.

Accuracy


As we saw above, the YOLO model is a lot faster on the PYNQ-Z2 than on the CPU. But what about the accuracy? Is it a lot lower than that of the original model?

A compressed network generally loses some accuracy because its parameters are simplified. However, the price paid for inference speed should not be too high, because a model with low accuracy isn't very useful.

We saw in the previous chapter how to obtain some metrics for the YOLO model running on the PYNQ-Z2. We will now compare those results with the official YOLOv3 metrics, which you can consult in this amazing paper written by Joseph Redmon and Ali Farhadi. Take a look at this table:

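| Platform | AP | AP50 | AP75 | APS | APM | APL | mAP |
| --- | --- | --- | --- | --- | --- | --- | --- |
| YOLOv3 | 33.00 | 57.90 | 34.40 | 18.30 | 35.40 | 41.90 | 55.30 |
| PYNQ-Z2 YOLOv3 | 21.41 | 40.10 | 20.53 | 6.07 | 20.46 | 32.19 | 40.36 |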

As you can see, the accuracy of the model dropped on the PYNQ-Z2. Still, the drop wasn't very big, and the results keep the model accurate both in theory and in practice. Also, the loss in accuracy is not proportional to the gain in inference speed. In other words, the YOLOv3 model on the PYNQ-Z2 is 23 times faster and about 27% less accurate (in relative mAP) than the original YOLO model.
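The 27% figure is a relative drop in mAP, and it can be checked directly from the two rows of the table. The short sketch below just redoes that arithmetic for every column; the values are copied from the table, and the function name `relative_drop` is made up for this illustration.

```python
# Values copied from the table above, in percent.
METRICS = ["AP", "AP50", "AP75", "APS", "APM", "APL", "mAP"]
ORIGINAL = [33.00, 57.90, 34.40, 18.30, 35.40, 41.90, 55.30]
PYNQ_Z2 = [21.41, 40.10, 20.53, 6.07, 20.46, 32.19, 40.36]


def relative_drop(reference: float, measured: float) -> float:
    """Relative loss, in percent, of `measured` with respect to `reference`."""
    return 100.0 * (reference - measured) / reference


# Relative drop per metric: original YOLOv3 versus the PYNQ-Z2 version.
drops = {
    name: relative_drop(ref, got)
    for name, ref, got in zip(METRICS, ORIGINAL, PYNQ_Z2)
}

for name, drop in drops.items():
    print(f"{name:>5}: {drop:5.1f}% lower than the original")
```

For the mAP column this gives (55.30 - 40.36) / 55.30 ≈ 0.27, that is, the 27% quoted in the text. The other columns can be checked the same way; they do not all drop by the same amount.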

This goes to show that FPGAs are indeed a really good way to embed a neural network. The hardware made the execution of the neural network really fast while keeping the accuracy very acceptable. Also, the PLD has really low power consumption: about 4.166 W, which is less than some LED light bulbs!
