Introduction
Last updated
Hello everyone! Here I am going to introduce you to the development flow of the project. I will start by introducing several important concepts and then tie them together so you can understand how all of this is going to work. Don't worry if you don't understand everything now, because I will go over these topics again in each individual chapter. Remember, this is a step-by-step tutorial and you don't really need any specific knowledge beforehand.
Deep Learning is a sub-area of AI (Artificial Intelligence) that is loosely inspired by the way the human brain works, and it is used to create algorithms that are able to learn and execute complex tasks. You heard it right, we will be working with AI!
The biggest difference between Deep Learning and other Machine Learning models is the architecture of the Neural Networks used. These networks use multiple layers of artificial neurons, which are processing units that mimic biological neurons. As the data flows across the layers, the system learns hierarchical representations of the data, capturing more and more abstract features.
So basically it's magic... The computer simulates a bunch of neurons, which are nothing more than a lot of simple operations repeated over and over again. Each pixel of the input image flows through a lot of those neurons, and during training the connections between them are adjusted a little each time, so in the end you have a Neural Network specially tuned to do something like object detection. The following diagram describes more or less that.
The dots (balls) represent neurons (mathematical operations), the squares represent pixels of an image, and the connections are called weights. Their values change as the Network is trained.
The picture is an oversimplified diagram of a Deep Learning Neural Network. Just note that a real network has a lot more hidden layers, made of many neurons with different types of operations.
Also note that the ridiculously small input image might represent a picture of a person in a grass landscape with blue sky and a sun. Yes, the two grey dots on the output image represent the person and the red ones represent the bounding box... Don't judge me, it took 20 minutes to make that diagram and if the image had better resolution it would take a lot more time!
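To make the "simple operations over and over again" idea a bit more concrete, here is a minimal sketch in plain Python. It is not the network we will use in this project: the 4-pixel input, the weight values, the sigmoid activation and the learning rate are all made up for illustration. It only shows pixels flowing through a hidden layer and how one set of weights gets adjusted during training.

```python
import math

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of the inputs followed by a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def forward(pixels, hidden_layer, output_neuron):
    """Pass the pixels through one hidden layer and one output neuron."""
    hidden = [neuron(pixels, w, b) for w, b in hidden_layer]
    w_out, b_out = output_neuron
    return neuron(hidden, w_out, b_out), hidden

def train_step(pixels, target, hidden_layer, output_neuron, lr=0.5):
    """One gradient-descent update of the output neuron's weights only."""
    prediction, hidden = forward(pixels, hidden_layer, output_neuron)
    w_out, b_out = output_neuron
    # Gradient of the squared error with respect to the neuron's weighted sum
    delta = (prediction - target) * prediction * (1.0 - prediction)
    new_w = [w - lr * delta * h for w, h in zip(w_out, hidden)]
    new_b = b_out - lr * delta
    return new_w, new_b

# Hypothetical 4-pixel "image" and hand-picked starting weights
pixels = [0.0, 0.5, 1.0, 0.5]
hidden_layer = [([0.2, -0.1, 0.4, 0.3], 0.0), ([-0.3, 0.1, 0.2, -0.2], 0.1)]
output_neuron = ([0.5, -0.5], 0.0)
target = 1.0  # we want the network to answer "yes" for this image

before, _ = forward(pixels, hidden_layer, output_neuron)
for _ in range(100):
    output_neuron = train_step(pixels, target, hidden_layer, output_neuron)
after, _ = forward(pixels, hidden_layer, output_neuron)
print(f"before training: {before:.3f}, after training: {after:.3f}")
```

After the training loop the output moves closer to the target, which is all "learning" means here. A real network like YOLO does the same kind of thing with millions of weights, and updates every layer instead of only the last one.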
A Programmable Logic Device (PLD) is a class of semiconductor devices that revolutionized digital electronics and the engineering of embedded systems. The ones we care about here are essentially integrated circuits that combine multiple Hardware components, including CPUs, memory, communication interfaces and digital logic, in only one chip: a SoC (System on Chip). This combination of functionalities allows the development of highly customized complex systems, opening the gates for a broad variety of applications in electronics. Nowadays, a SoC-type PLD includes an ARM CPU, memory and an FPGA.
The PYNQ-Z2 is a development board that is part of the Xilinx PYNQ family (Python Productivity for Zynq). This board is the one recommended by Xilinx to get started with FPGA development, because it makes it easy to interact with the FPGA inside the Zynq-7020 chip using the Python programming language. The chip also contains a dual-core ARM Cortex-A9 processor that can communicate with the FPGA and execute functions programmed in software. The board also has 512 MB of general purpose DDR3 RAM and 16 MB of Quad-SPI flash memory. Let's make an important distinction here:
PL - Programmable Logic - Represents the part of the chip that contains the FPGA (Field Programmable Gate Array). On the PYNQ-Z2 that is the FPGA fabric of the Zynq-7020.
PS - Processing System - Refers to the part of the SoC that contains the processor and the general processing resources. The PS is where the Operating System and the high level code are executed.
You can see in the image that the connection between the PL and the PS is made through AXI (Advanced eXtensible Interface), an interface protocol from ARM. Also, there is a MicroBlaze processor, which is nothing more than some logic units from the PL configured as a small processor that helps the main ARM processor by executing simpler tasks.
The DPU is the Deep Learning Processor Unit. It is a special piece of Hardware that can be implemented on the FPGA (PL side) and is able to process various types of Deep Neural Networks.
The DPU feels like cheating. You should not squeeze two words into the same letter of an abbreviation!
This special Hardware comes in different architectures to target different Xilinx boards. Boards with a Zynq-7000 series chip have more limitations than, for example, boards with a Zynq UltraScale+.
Neural Networks like YOLOv3 can be implemented on the DPU, but a special configuration file is needed to communicate with it. This file tells the DPU which layers of the Neural Network it has to execute.
To develop applications on the PYNQ-Z2 or access specific functionalities, you need to use various Xilinx tools. Let's introduce some of them.
Vivado is a suite containing various tools that allow you to develop Hardware to be used on the PL of Xilinx FPGAs. You can essentially create hardware in 3 ways:
High-Level Synthesis (HLS) - Vivado HLS is a tool that allows you to create hardware using high level programming languages like C, C++ or OpenCL. This lets the developer express the desired functionality at a more abstract level, making the hardware development process easier and more accessible.
Hardware Description Languages (HDL) - The traditional approach to hardware development consists of writing code in Hardware Description Languages like VHDL or Verilog. Vivado offers an Integrated Development Environment (IDE) to write, test, verify and debug these projects.
Block Design - For complex projects, Vivado offers a graphical system design tool that allows you to connect pre-designed functional blocks and customize the behavior of the system. These blocks are called Intellectual Property (IP) blocks and were designed by Xilinx and other creators.
The following figure gives an overview of the tools in this context:
Vivado also allows the user to simulate and validate the Hardware before its physical implementation. After the desired functions have been created and developed in Hardware, Vivado is able to generate a Bitstream, which is a binary file that has the information to configure the FPGA with the implemented logic.
PetaLinux is an open-source development platform from Xilinx used to create, customize and deploy a Linux OS (Operating System) on Xilinx devices. With it, you can configure libraries and packages on the board's OS, define the Kernel, I/Os, drivers, etc. In addition to Software, this is the platform that links the Hardware created on Vivado to the board. After all configurations are done and the binary file for the desired Hardware is defined, we are able to copy specific files to a micro SD card and later boot the board with it.
It's also important to mention that the PetaLinux tools run on Linux and are normally installed on a Virtual Machine, so it's possible to make tests and configurations without worrying about damaging the native Operating System. I will guide you through that process in the specific chapter.
Lastly, we have DNNDK, which stands for Deep Neural Network Development Kit. This is a set of tools developed by Xilinx to accelerate the development and implementation of Deep Neural Networks (DNNs) on Xilinx FPGA devices. This development kit was specially designed to take advantage of the parallel processing power and reconfigurable flexibility of FPGAs to accelerate machine learning tasks and DNN inference.
This tool will be essential in the development of an object detector because it will be responsible for compressing the YOLO model to fit the PYNQ-Z2 requirements. Also, the resulting file will communicate directly with the DPU to accelerate the Neural Network and make a very quick inference.
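A big part of that compression is quantization: storing each weight as an 8-bit integer instead of a 32-bit float. The sketch below is only meant to give a feeling for the idea. It is not how DNNDK does it internally (the real tool also calibrates on sample images, among other things), and the function names and weight values are made up for this example.

```python
def quantize(weights, bits=8):
    """Map floats to signed integers using one shared scale factor."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits
    largest = max(abs(w) for w in weights)
    scale = largest / qmax if largest > 0 else 1.0  # float value of one integer step
    # Round to the nearest integer step and clamp to the representable range
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the integers."""
    return [v * scale for v in q]

# Hypothetical weights of one layer
weights = [0.82, -0.41, 0.05, -1.27, 0.33]

q, scale = quantize(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("integers:", q)
print("restored:", [round(r, 4) for r in restored])
print(f"largest rounding error: {max_error:.6f}")
```

Each weight now needs 1 byte instead of 4, and the price is a small rounding error that is at most half of one integer step. That trade-off is what lets a large model fit on a small board.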
With everything we learned so far, we can now link all these tools and concepts to make this project possible. Here is how we are going to do this:
So, as described in the diagram, we will first develop the DPU on Vivado according to the architecture that suits the limitations of the board. After the development, we will have a collection of files related to the project, including the binary file describing the DPU. Then, on PetaLinux, we will integrate the DPU Hardware into an SD card image together with the Operating System specs.
Lastly, we will quantize, optimize and compile the YOLO model using DNNDK. We will go through these terms and technicalities in the respective chapter, but for now all we need to know is that there will be a special file that can communicate with the DPU and describe the characteristics of the YOLO Neural Network. With all that, we will have to do a little bit of programming to establish a connection between the DPU and the rest of the functions executed on the PS.
To communicate with the DPU from the PS, we will use the DNNDK API. It allows us to call some DPU functions so we can order it to execute the inference (go through the whole Neural Network).
The following diagram gives another idea of the process.