RKNN Inference Test

1. Introduction to RKNPU

An NPU (Neural Processing Unit) is a specialized processor designed to accelerate neural network computations. To meet the demands of artificial intelligence, Rockchip has gradually integrated NPUs into its processors; the NPU integrated into Rockchip processors is called RKNPU. The LuckFox Pico series development boards are equipped with the Rockchip RV1103/RV1106, which features Rockchip's 4th-generation self-developed NPU. This NPU offers high computational precision and supports mixed quantization of int4, int8, and int16. The 4th-generation RKNPU falls under RKNPU2, so the RKNPU2 SDK and toolkits must be used.

2. Introduction to RKNN-Toolkit2

RKNN-Toolkit2 provides C or Python interfaces on the PC platform to simplify the deployment and execution of models. Users can easily accomplish the following tasks with this tool: model conversion, quantization, inference, performance and memory evaluation, quantization accuracy analysis, and model encryption. The RKNN software stack assists users in deploying AI models to Rockchip chips quickly. The overall framework is as follows:

To use RKNPU, users need to first run the RKNN-Toolkit2 tool on their computer to convert the trained model into the RKNN format model, and then deploy it on the development board using the RKNN C API or Python API. This section introduces how users can quickly use RKNPU on the Luckfox Pico series boards.

3. Installation of RKNN-Toolkit2 (PC, Ubuntu 22.04)

3.1 Local Installation

  1. Check the system and Python version requirements:

     Operating System Version      Python Version
     Ubuntu 18.04 (x64)            3.6 / 3.7
     Ubuntu 20.04 (x64)            3.8 / 3.9
     Ubuntu 22.04 (x64)            3.10 / 3.11
  2. Download RKNN-Toolkit2

    git clone https://github.com/rockchip-linux/rknn-toolkit2
  3. Install Python Environment

    sudo apt-get update
    sudo apt-get install python3 python3-dev python3-pip
    sudo apt-get install libxslt1-dev zlib1g zlib1g-dev libglib2.0-0 libsm6 libgl1-mesa-glx libprotobuf-dev gcc
  4. Install RKNN-Toolkit2 Dependencies

    pip3 install -r rknn-toolkit2/packages/requirements_cpxx-1.6.0.txt

    # such as:
    pip3 install -r rknn-toolkit2/packages/requirements_cp310-1.6.0.txt

    Choose the corresponding dependency package according to different Python versions:

    Python Version    RKNN-Toolkit2 Dependencies
    3.6               requirements_cp36-1.6.0.txt
    3.7               requirements_cp37-1.6.0.txt
    3.8               requirements_cp38-1.6.0.txt
    3.9               requirements_cp39-1.6.0.txt
    3.10              requirements_cp310-1.6.0.txt
    3.11              requirements_cp311-1.6.0.txt
  5. Install RKNN-Toolkit2

    pip3 install rknn-toolkit2/packages/rknn_toolkit2-x.x.x+xxxxxxxx-cpxx-cpxx-linux_x86_64.whl

    # such as:
    pip3 install rknn-toolkit2/packages/rknn_toolkit2-1.6.0+81f21f4d-cp310-cp310-linux_x86_64.whl

    The package name format is: rknn_toolkit2-{version number}+{commit number}-cp{Python version}-cp{Python version}-linux_x86_64.whl. Choose the corresponding installation package according to different Python versions:

    Python Version    RKNN-Toolkit2 Installation Package
    3.6               rknn_toolkit2-{version number}+{commit number}-cp36-cp36m-linux_x86_64.whl
    3.7               rknn_toolkit2-{version number}+{commit number}-cp37-cp37m-linux_x86_64.whl
    3.8               rknn_toolkit2-{version number}+{commit number}-cp38-cp38-linux_x86_64.whl
    3.9               rknn_toolkit2-{version number}+{commit number}-cp39-cp39-linux_x86_64.whl
    3.10              rknn_toolkit2-{version number}+{commit number}-cp310-cp310-linux_x86_64.whl
    3.11              rknn_toolkit2-{version number}+{commit number}-cp311-cp311-linux_x86_64.whl

    If no error is reported after executing the following commands, the installation was successful:

    python3
    >>> from rknn.api import RKNN

3.2 Conda Installation

It is recommended to use Conda to create a Python virtual environment, which allows flexible switching between multiple scenarios and avoids issues caused by version mismatch. Different Python virtual environments are needed for tasks such as training AI models and converting models.

3.2.1 Installing Miniconda Tool

  1. Check whether Miniconda or another Conda tool is already installed. If a version number is printed, Conda is installed.

    conda --version
  2. Download the installation package.

    wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/Miniconda3-4.6.14-Linux-x86_64.sh
  3. Install Miniconda.

    chmod 777 Miniconda3-4.6.14-Linux-x86_64.sh
    bash Miniconda3-4.6.14-Linux-x86_64.sh
    • Note: The Miniconda installation package must be given execute permission (chmod 777 here) before running.

    • After the installation, press Enter to read the license agreement, type yes to accept the license and continue the installation, then press Enter again to create a miniconda folder in the home directory, where subsequently created virtual environments will be placed. Finally, type yes again to initialize Conda.

  4. In the terminal window of the computer, enter the Conda base environment.

    source ~/miniconda3/bin/activate # Miniconda3 installation directory (modify according to actual installation)
    # After successful activation, the command prompt will change to the following format:
    # (base) xxx@xxx:~$
  5. If you want to automatically activate the Miniconda environment every time you open the terminal, you can add the activation command to your shell configuration file:

    nano ~/.bashrc   # or use vim, etc.

    # Add the following line to the end of the file:
    source ~/miniconda3/bin/activate

    # Exit the Conda environment
    conda deactivate

3.2.2 Creating RKNN-Toolkit2 Conda Environment

  1. Create the RKNN-Toolkit2 development Conda environment. The -n parameter specifies the environment name, and Python version is set to 3.8 (recommended version).

    conda create -n RKNN-Toolkit2 python=3.8
    • Enter y to confirm the installation of default packages.
  2. Enter the RKNN-Toolkit2 Conda environment.

    conda activate RKNN-Toolkit2
  3. Verify that the Python version is correct.

    python --version
    • Note: In some development environments, the Python version may not switch correctly after creating the Conda environment. Restarting the terminal can resolve this issue.
  4. Get the RKNN-Toolkit2 installation package.

    git clone https://github.com/rockchip-linux/rknn-toolkit2.git
  5. Enter the folder.

    cd rknn-toolkit2
  6. Install the dependencies of RKNN-Toolkit2. Choose the requirements file whose cpxx suffix matches the Python version of the Conda environment; this example uses Python 3.8, so the cp38 file is used.

    pip install tf-estimator-nightly==2.8.0.dev2021122109 
    pip install -r rknn-toolkit2/packages/requirements_cp38-1.6.0.txt -i https://pypi.mirrors.ustc.edu.cn/simple/
    • Downloading without a mirror source can be very slow and may cause the installation to fail. A chosen mirror may also be temporarily unavailable or rate-limited, which can make the download fail; in that case, switch to another available mirror.

      #Common pip mirror sources:
      Aliyun: http://mirrors.aliyun.com/pypi/simple/
      University of Science and Technology of China: https://pypi.mirrors.ustc.edu.cn/simple/
      Douban: http://pypi.douban.com/simple/
      Tsinghua University: https://pypi.tuna.tsinghua.edu.cn/simple/
  7. Install RKNN-Toolkit2.

    pip install rknn-toolkit2/packages/rknn_toolkit2-1.6.0+81f21f4d-cp38-cp38-linux_x86_64.whl
    • Choose the installation package under the packages folder according to the Python version. 81f21f4d is the commit number; select the file that matches the actual package. For Python 3.8, use the installation package with the cp38 suffix.
  8. Verify if the installation is successful. If there are no errors reported, the installation is successful.

    python
    >>> from rknn.api import RKNN

4. Model Deployment

4.1 Introduction to ONNX

ONNX (Open Neural Network Exchange) is an intermediate representation format used to exchange models between deep learning training and inference frameworks. It supports many frameworks, including TensorFlow, PyTorch, and Caffe. Converting a model to the ONNX format makes it easier to deploy and run it across different frameworks without retraining the model.

In practical applications, models can be trained using PyTorch or TensorFlow, exported to ONNX format, and then converted to the model format supported by the target device, such as TensorRT Engine, NCNN, MNN, RKNN, etc. ONNX defines a set of platform-independent standard formats and environments to enhance the interoperability of various AI models, making it relatively open.

4.2 Obtaining ONNX Models

The ONNX file stores not only the weights of the neural network but also the model's structural information and the input/output of every layer, and it can be inspected with Netron, which helps with later model adjustments. The training and inference source code for the models used in this example can be obtained from the following GitHub repositories:

Model         Address
Retinaface    https://github.com/bubbliiiing/retinaface-pytorch
Facenet       https://github.com/bubbliiiing/facenet-pytorch
Yolov5        https://github.com/airockchip/rknn_model_zoo.git
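
In addition to Netron, the onnx Python package can be used to print a model's input and output tensors; a minimal sketch (the file name is only an example):

    import onnx

    model = onnx.load('model_data/retinaface.onnx')
    onnx.checker.check_model(model)   # basic structural validation
    for t in model.graph.input:
        print('input :', t.name, [d.dim_value for d in t.type.tensor_type.shape.dim])
    for t in model.graph.output:
        print('output:', t.name, [d.dim_value for d in t.type.tensor_type.shape.dim])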

4.3 Face Detection

The model used for face detection is Retinaface, which outputs the face bounding-box coordinates, a confidence score, and the coordinates of five facial landmarks. Cropping the image with the face bounding box yields the input image for facial feature extraction, which improves the reliability of the extracted facial feature values.
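
For example, cropping a detected face and resizing it for the feature-extraction model could look like the following sketch (box is assumed to come from the Retinaface post-processing; 160 x 160 is the Facenet input size used later in this example):

    import cv2

    x1, y1, x2, y2 = [int(v) for v in box]   # face bounding box from Retinaface
    face = image[y1:y2, x1:x2]               # crop the face region from the frame
    face = cv2.resize(face, (160, 160))      # resize to the Facenet input size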

  1. Get the Retinaface source code.

    git clone https://github.com/bubbliiiing/retinaface-pytorch.git
  2. Enter the source code directory.

    cd retinaface-pytorch
  3. Set up the model training environment.

    conda create -n retinaface python=3.6
    • Enter y to agree to install basic python tools.
  4. Enter the Conda virtual environment and install the required libraries.

    conda activate retinaface
    pip install -r requirements.txt
    • The trained .pth weight files are stored in the model_data folder; the weights exported to .onnx here use mobilenet as the backbone network.
  5. Create a Python script export_onnx.py in the project folder to export the ONNX file.

    from nets.retinaface import RetinaFace
    from utils.config import cfg_mnet
    import torch

    model_path='model_data/Retinaface_mobilenet0.25.pth' # Model path
    model=RetinaFace(cfg=cfg_mnet,pretrained = False) # Model initialization
    device = torch.device('cpu')
    model.load_state_dict(torch.load(model_path,map_location=device),strict=False) # Model loading
    net=model.eval()
    example=torch.rand(1,3,640,640) # Given input
    torch.onnx.export(model,(example),'model_data/retinaface.onnx',verbose=True,opset_version=9) # Export
  6. Execute the script to obtain the ONNX file (in the retinaface Conda environment).

    python export_onnx.py
    • The converted ONNX file can be obtained in <project folder>/model_data.

4.4 Face Feature Extraction

Face feature extraction can extract 128-dimensional features based on the input face image, which can be used to calculate the Euclidean distance with the features of other faces to measure the degree of matching.
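
For reference, the matching metric is a plain Euclidean distance between two 128-dimensional feature vectors; a minimal numpy sketch:

    import numpy as np

    def euclidean_distance(feat_a, feat_b):
        # A smaller distance means the two faces match more closely
        return float(np.linalg.norm(np.asarray(feat_a, np.float32) - np.asarray(feat_b, np.float32)))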

  1. Get the Facenet source code.

    git clone https://github.com/bubbliiiing/facenet-pytorch.git
  2. Enter the source code directory.

    cd facenet-pytorch
  3. Set up the model training environment.

    conda create -n facenet python=3.6
    • Enter y to agree to install basic python tools.
  4. Enter the Conda virtual environment and install the required libraries.

    conda activate facenet
    pip install -r requirements.txt
    • The .pth weight files trained are stored in the model_data folder.
  5. Create a Python script export_onnx.py in the project folder to export the ONNX file.

    from nets.facenet import Facenet
    from torch import onnx
    import torch

    model_path='model_data/facenet_mobilenet.pth' # Model path
    model = Facenet(backbone="mobilenet",mode="predict",pretrained=True) # Model initialization
    device = torch.device('cpu')
    model.load_state_dict(torch.load(model_path, map_location=device), strict=False)
    example=torch.rand(1,3,160,160) # Given input
    torch.onnx.export(model,example,'model_data/facenet.onnx',verbose=True,opset_version=9) # Export
  6. Execute the script to obtain the ONNX file (in the facenet Conda environment).

    python export_onnx.py 
    • The converted ONNX file can be obtained in <project folder>/model_data.

4.5 Object Detection

YOLOv5 is an open-source version of the YOLO algorithm developed by Ultralytics. It is a single-stage object detection algorithm implemented entirely in PyTorch and written in Python.

YOLOv5 has five models, listed as follows:

  1. YOLOv5n: the smallest and fastest model, also known as the Nano model; its small size makes it suitable for mobile devices.
  2. YOLOv5s: also a relatively small model, suitable for running inference on a CPU.
  3. YOLOv5m: as the "m" indicates, a medium-sized model.
  4. YOLOv5l: a large model.
  5. YOLOv5x: the largest of the variants; its size comes at the cost of speed.

The basic principle of YOLOv5 is to extract image features through convolutional neural networks, perform object detection prediction on each grid cell based on grid division, predict bounding box positions and categories, and assign confidence scores. Finally, non-maximum suppression (NMS) is used to filter and merge overlapping bounding boxes to obtain the final object detection results.
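
For illustration, below is a minimal greedy NMS over (x1, y1, x2, y2) boxes in numpy; it is a sketch of the filtering step described above, not the post-processing code used in the demos:

    import numpy as np

    def nms(boxes, scores, iou_thres=0.45):
        """Greedy NMS: keep the highest-scoring box, drop overlapping boxes, repeat."""
        order = scores.argsort()[::-1]
        keep = []
        while order.size > 0:
            i = order[0]
            keep.append(int(i))
            xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
            yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
            xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
            yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
            inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
            area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
            area_rest = (boxes[order[1:], 2] - boxes[order[1:], 0]) * (boxes[order[1:], 3] - boxes[order[1:], 1])
            iou = inter / (area_i + area_rest - inter + 1e-12)
            order = order[1:][iou <= iou_thres]
        return keep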

  1. Get Yolov5 source code.

    git clone https://github.com/airockchip/yolov5.git
  2. Enter Yolov5 source code directory.

    cd yolov5
  3. Set up the model training environment.

    conda create -n yolov5 python=3.9
    • Enter y to agree to install basic python tools.
  4. Enter the Conda virtual environment and install the required libraries.

    conda activate yolov5
    pip install -r requirements.txt
  • Note: If you cannot download all the required software packages, try setting a pip mirror source.
  5. Export the ONNX file from the default weights (in the yolov5 Conda environment).

     python export.py --rknpu --weight yolov5s.pt
    • If the yolov5s.pt weight file is not present in the project directory, it will be downloaded and converted automatically. The project folder will then contain yolov5s.onnx and RK_anchors.txt; the post_process function in the object detection source code uses the parameters from RK_anchors.txt. The generated yolov5s.onnx removes the parts of the standard Yolov5 model that RKNPU does not support; these parts are implemented on the CPU in the application source code. If you use a custom-trained model, be sure to pass the --rknpu parameter to export.py to obtain the data needed by the RKNPU application:

     python export.py --rknpu --weight xxx.pt

4.6 Adjusting the ONNX Model

Use the Netron tool to inspect the structure of the ONNX model and check whether it contains operators that the RV1103/RV1106 does not yet support (such as Layer Normalization); refer to the operator support list (https://github.com/rockchip-linux/rknn-toolkit2/blob/master/doc/05_RKNN_Compiler_Support_Operator_List_v1.6.0.pdf). If the unsupported operators sit in the last few layers of the model, consider implementing them on the CPU. Inspecting the Facenet model exported from the source code with Netron shows a ReduceL2 operator right before the output, which RKNPU cannot parse.

Looking at facenet-pytorch/nets/facenet.py, the forward function shows that in the model's inference (predict) mode a normalization step is applied before the 128-dimensional feature vector is output.

def forward(self, x, mode = "predict"):
    if mode == 'predict':
        x = self.backbone(x)
        x = self.avg(x)
        x = x.view(x.size(0), -1)
        x = self.Dropout(x)
        x = self.Bottleneck(x)
        x = self.last_bn(x)
        x = F.normalize(x, p=2, dim=1)  # normalization before output
        return x
    x = self.backbone(x)
    x = self.avg(x)
    x = x.view(x.size(0), -1)
    x = self.Dropout(x)
    x = self.Bottleneck(x)
    before_normalize = self.last_bn(x)

    x = F.normalize(before_normalize, p=2, dim=1)
    cls = self.classifier(before_normalize)
    return x, cls

This part can be assigned to run on the CPU. After commenting out the corresponding code, re-export the ONNX model.
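
A minimal sketch of the adjusted predict branch used for the re-export (the normalization line is commented out; the 128-dimensional output is normalized on the CPU side of the application instead, see section 5.5):

    if mode == 'predict':
        x = self.backbone(x)
        x = self.avg(x)
        x = x.view(x.size(0), -1)
        x = self.Dropout(x)
        x = self.Bottleneck(x)
        x = self.last_bn(x)
        # x = F.normalize(x, p=2, dim=1)  # removed from the ONNX export, performed on the CPU instead
        return x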

Note: Of the models used in this example, only Facenet required source-code adjustments.

5. Luckfox RKNN Application Example

5.1 Platform Support

Demo                                     System       Camera    Screen
luckfox_pico_retinaface_facenet          Buildroot    sc3336    Pico-1.3-LCD, LF40-480480-ARK
luckfox_pico_retinaface_facenet_spidev   Buildroot    sc3336    Pico-ResTouch-LCD-2.8, Pico-ResTouch-LCD-3.5
luckfox_pico_yolov5                      Buildroot    sc3336    Pico-1.3-LCD, LF40-480480-ARK

Note: Support for screens on the Luckfox Pico varies. You can refer to the Compatibility List. If a compatible screen is not available, you can also view inference results via the terminal.

5.2 Verification

Before testing, Framebuffer support must be enabled. Please refer to the "luckfox-config Configuration" section for instructions.

  1. Screen noise test (fills the screen with random pixels)

    cat /dev/urandom > /dev/fb0
  2. Screen clear test

    cat /dev/zero > /dev/fb0
    • If the screen shows noise and then clears correctly, the Framebuffer driver is working properly.

5.3 Convert to RKNN Model

5.3.1 Model Conversion

  1. Get source code

    git clone https://github.com/LuckfoxTECH/luckfox_pico_rknn_example.git
  2. Navigate to the scripts/luckfox_onnx_to_rknn directory

    cd luckfox_pico_rknn_example/scripts/luckfox_onnx_to_rknn
  3. File Structure

    luckfox_onnx_to_rknn
    ├── convert--------------------------------------Model conversion python script
    ├── dataset--------------------------------------Model conversion reference dataset
    │   └── pic
    │       ├── facenet
    │       │   └── face.jpg
    │       ├── retinaface
    │       │   └── face.jpg
    │       └── yolov5
    │           └── bus.jpg
    └── model----------------------------------------ONNX models and RKNN models
  4. Enter RKNN-Toolkit2 Conda development environment

    conda activate RKNN-Toolkit2
    • Note: Failure to enter the correct Conda virtual environment can result in model conversion failure.
  5. Model conversion

    cd convert
    python convert.py <onnx_model_path> <dataset_path> <export_model_path> <model_type(Retinaface etc.)>
  6. Example

    python convert.py ../model/retinaface.onnx ../dataset/retinaface_dataset.txt ../model/retinaface.rknn Retinaface
    • onnx_model_path: Path to the exported ONNX model, provided in the luckfox_onnx_to_rknn/model directory in the example
    • dataset_path: Provide a small number of images as references for model conversion by specifying their file paths in a .txt file as a parameter to the conversion script.
    • export_model_path: Name and location of the exported RKNN model, ensure it has a .rknn extension
    • model_type: Different RKNN preprocessing settings are provided based on the type of model being converted. In the example, input "Retinaface", "Facenet", or "Yolov5".
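
For reference, the core of such a conversion script typically follows the standard RKNN-Toolkit2 flow sketched below (argument parsing and the per-model settings of the actual convert.py are omitted; the variable names are placeholders):

    from rknn.api import RKNN

    rknn = RKNN()
    rknn.config(mean_values=[[104, 117, 123]], std_values=[[1, 1, 1]],
                target_platform='rv1106')                   # preprocessing depends on the model, see 5.3.2
    rknn.load_onnx(model=onnx_model_path)                   # <onnx_model_path>
    rknn.build(do_quantization=True, dataset=dataset_path)  # <dataset_path>: .txt list of reference images
    rknn.export_rknn(export_model_path)                     # <export_model_path>: output .rknn file
    rknn.release()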

5.3.2 Custom Configuration

When converting models, it is important to adjust preprocessing settings based on the preprocessing code in the model training source code. For example, in Retinaface, the image preprocessing code subtracts the mean values of the three channels. Therefore, when calling the RKNN.Config interface, the mean_values parameter needs to be configured.

  1. Input preprocessing in the Retinaface model source code

    def preprocess_input(image):
        image -= np.array((104, 117, 123), np.float32)
        return image
  2. Model conversion configuration in RKNN-Toolkit2

    rknn.config(mean_values=[[104, 117, 123]], std_values=[[1, 1, 1]],
                target_platform=platform,
                quantized_algorithm="normal", quant_img_RGB2BGR=True)
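    • The configured preprocessing runs on the NPU side and computes (x - mean_values) / std_values per channel, so with std_values of all ones it reproduces preprocess_input() above. A minimal sketch of the equivalent computation:

      import numpy as np

      mean_values = np.array([104, 117, 123], dtype=np.float32)
      std_values = np.array([1, 1, 1], dtype=np.float32)

      def npu_side_preprocess(bgr_pixel):
          # Equivalent to preprocess_input() above because std_values are all ones
          return (bgr_pixel - mean_values) / std_values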

5.4 Model Verification

5.4.1 Preliminary Analysis of the Model

The internal structure of the exported RKNN model cannot be parsed by the Netron tool, but basic information about its inputs and outputs can still be examined for an initial assessment.

  • luckfox-pico only supports int8 type inputs and outputs

    If the converted model has float type input or output tensors, data may not be retrieved correctly and the model structure needs to be adjusted again.
  • luckfox-pico only supports 4-dimensional input and output tensors

5.4.2 Software Simulator Model Verification

Currently, luckfox-pico only supports model inference on the board through the C API. During the model verification phase, however, the RKNN-Toolkit2 Python interface can run the model on a software simulator, which makes it possible to use Python's rich third-party libraries to write test programs. The model verification process with RKNN-Toolkit2 is as follows:

  1. Create RKNN object

    rknn = RKNN()
  2. Preprocessing configuration. Configure according to the image preprocessing code in the model training source code.

    rknn.config(mean_values=[[0, 0, 0]], std_values=[[128, 128, 128]],
                target_platform='rv1103',
                quantized_algorithm="normal")
  3. Load the model

    rknn.load_onnx(model = model_path)
  4. Build the model

    rknn.build(do_quantization=do_quant, dataset=DATASET_PATH)
  5. Initialize runtime environment. Passing the target parameter will use the adb tool to remotely control model inference on the board. By default, without passing parameters, model inference is performed on the software simulator.

    rknn.init_runtime()
  6. Input and output processing. Depending on the third-party library used, there are different ways to handle input and output. Refer to the processing methods in the model training source code.
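
    • For example, a minimal sketch of feeding an image to the simulator and collecting the outputs (the file name and input size here are assumptions for illustration):

      import cv2
      import numpy as np

      img = cv2.imread('test.jpg')                # BGR, HWC
      img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # match the channel order used in training, if required
      img = cv2.resize(img, (640, 640))           # resize to the model input size
      img = np.expand_dims(img, 0)                # NHWC batch of 1

      outputs = rknn.inference(inputs=[img])      # list of numpy arrays, one per model output
      # post-process the outputs following the model training source code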

  7. Release the RKNN object

    rknn.release()
    • Output data from the software simulator inference can be compared with the output data from inference on the board to determine whether the model deployment environment is correct.
  • Note: Output data from the software simulator is not quantized and keeps the original format of the ONNX model's output (usually floating point). However, luckfox-pico's RKNPU only supports int8 type outputs, so for the comparison the board's output data must first be dequantized back to its original pre-quantization format.
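  • A simple way to compare the two sets of outputs is cosine similarity after dequantizing the board-side int8 data; a sketch (out_simulator, out_board_i8, zp, and scale are assumed to be obtained from the simulator run and the board-side dump):

      import numpy as np

      def dequantize(q, zp, scale):
          # Same affine dequantization used on the board: (q - zp) * scale
          return (q.astype(np.float32) - zp) * scale

      def cosine_similarity(a, b):
          a, b = a.ravel(), b.ravel()
          return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

      sim = cosine_similarity(out_simulator, dequantize(out_board_i8, zp, scale))
      # a value close to 1.0 indicates the deployment environment matches the simulator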

5.5 Application Design

The application flow for AI model inference is as follows:

  1. RKNN initialization
  • During RKNN initialization, system memory allocation or custom memory allocation can be chosen based on the model's memory usage. System memory allocation is recommended for general models.

    rknn_init(&ctx, mode_path, 0, 0, NULL);
  • If multiple models need to be run sequentially and memory resources are tight, custom memory allocation can be used to allow intermediate tensor memory reuse between two models. Memory allocation should be released promptly when using this method.

    rknn_context ctx_a, ctx_b;

    rknn_init(&ctx_a, model_path_a, 0, RKNN_FLAG_MEM_ALLOC_OUTSIDE, NULL);
    rknn_query(ctx_a, RKNN_QUERY_MEM_SIZE, &mem_size_a, sizeof(mem_size_a));

    rknn_init(&ctx_b, model_path_b, 0, RKNN_FLAG_MEM_ALLOC_OUTSIDE, NULL);
    rknn_query(ctx_b, RKNN_QUERY_MEM_SIZE, &mem_size_b, sizeof(mem_size_b));

    max_internal_size = MAX(mem_size_a.total_internal_size, mem_size_b.total_internal_size);
    internal_mem_max = rknn_create_mem(ctx_a, max_internal_size);

    internal_mem_a = rknn_create_mem_from_fd(ctx_a, internal_mem_max->fd, internal_mem_max->virt_addr, mem_size_a.total_internal_size, 0);
    rknn_set_internal_mem(ctx_a, internal_mem_a);
    internal_mem_b = rknn_create_mem_from_fd(ctx_b, internal_mem_max->fd, internal_mem_max->virt_addr, mem_size_b.total_internal_size, 0);
    rknn_set_internal_mem(ctx_b, internal_mem_b);
  2. Getting input and output parameters of the RKNN model

    // Get the number of input and output channels
    rknn_query(ctx, RKNN_QUERY_IN_OUT_NUM, &io_num, sizeof(io_num));

    // Get parameters of each input channel
    rknn_query(ctx, RKNN_QUERY_NATIVE_INPUT_ATTR, &(input_attrs[i]), sizeof(rknn_tensor_attr));

    // Get parameters of each output channel
    rknn_query(ctx, RKNN_QUERY_NATIVE_OUTPUT_ATTR, &(output_attrs[i]), sizeof(rknn_tensor_attr));
  3. Memory allocation and setting for input and output of the RKNN model

    // Allocate memory for each input channel
    rknn_tensor_mem* input_mems[i] = rknn_create_mem(ctx, input_attrs[i].size_with_stride);
    // Set memory for each input channel
    rknn_set_io_mem(ctx, app_ctx->input_mems[i], &input_attrs[0]);


    // Allocate memory for each output channel
    rknn_tensor_mem* output_mems[i] = rknn_create_mem(ctx, output_attrs[i].size_with_stride);
    // Set memory for each output channel
    rknn_set_io_mem(ctx, app_ctx->output_mems[i], &output_attrs[i]);
  4. Input image

  • The example uses opencv-mobile to capture frames from the SC3336 camera. By default the color channel order is B-G-R, stored as cv::Mat, which must be converted to the input format required by the model.

    // Capture camera image
    cap >> bgr;
    // Resize to the size required by the model
    cv::resize(bgr, bgr, cv::Size(width, height), 0, 0, cv::INTER_LINEAR);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            cv::Vec3b pixel = bgr.at<cv::Vec3b>(y, x);
            src_image[(y * width + x) * channels + 0] = pixel[2]; // Red
            src_image[(y * width + x) * channels + 1] = pixel[1]; // Green
            src_image[(y * width + x) * channels + 2] = pixel[0]; // Blue
        }
    }
    // Copy data to the virtual address of the input tensor
    memcpy(input_mems[0]->virt_addr, src_image, width * height * channels);
  5. RKNN model inference
  • luckfox-pico uses zero-copy API. After calling the rknn_run interface, output data will be synchronized to the virtual address of the set output memory.

    rknn_run(ctx,nullptr);
  6. Output data analysis and processing
  • luckfox-pico's RKNPU only supports integer type outputs. RKNN models quantize both input and output data, so the output data from luckfox-pico must be dequantized back to its original pre-quantization format before any further processing:

    float deqnt = ((float)qnt - (float)zp) * scale;
  • If the model has been pruned in the training phase, CPU computation needs to use the format-converted output data. For example, in Facenet, the final normalization process was removed from the ONNX model export to make it compatible with RKNPU. Therefore, after format conversion, normalization needs to be added to the output data.

    float sum = 0;
    for (int i = 0; i < 128; i++)
        sum += out_fp32[i] * out_fp32[i];
    float norm = sqrt(sum);
    for (int i = 0; i < 128; i++)
        out_fp32[i] /= norm;
  7. Displaying the image
  • Processed results of the output data can be displayed using OpenCV-Mobile for visual observation. The Pico-LCD-1.3 screen resolution is 240 x 240, with RGB565 format. Therefore, image data processed by OpenCV-Mobile needs to be converted for proper display.

    cv::resize(bgr, bgr, cv::Size(240, 240), 0, 0, cv::INTER_LINEAR);
    for (int i = 0; i < bgr.rows; ++i) {
        for (int j = 0; j < bgr.cols; ++j) {
            uint16_t b = (bgr.at<cv::Vec3b>(i, j)[0] >> 3);
            uint16_t g = (bgr.at<cv::Vec3b>(i, j)[1] >> 2) << 5;
            uint16_t r = (bgr.at<cv::Vec3b>(i, j)[2] >> 3) << 11;

            rgb565Image.at<uint16_t>(i, j) = r | g | b;
            framebuffer[i * FB_HIGHT + j] = rgb565Image.at<uint16_t>(i, j);
        }
    }

5.6 Compilation and Execution

5.6.1 Compilation (Execute on PC)

  1. Repository URL (Refer to section 5.3 for converting to RKNN models)

    https://github.com/LuckfoxTECH/luckfox_pico_rknn_example.git
  2. Set environment variables

    export LUCKFOX_SDK_PATH=<luckfox-pico SDK path>

    Note: Use absolute paths.

  3. Run ./build.sh and choose the example to compile

    1) luckfox_pico_retinaface_facenet
    2) luckfox_pico_retinaface_facenet_spidev
    3) luckfox_pico_yolov5
    Enter your choice [1-3]:
  4. The luckfox_pico_retinaface_facenet_spidev option is specifically adapted for the Pico-ResTouch-LCD. Choose the Luckfox Pico model to determine the pin mapping:

    1) LUCKFOX_PICO_PLUS
    2) LUCKFOX_PICO_PRO_MAX
    Enter your choice [1-2]:

5.6.2 Execution (Execute on Luckfox Pico Board)

  1. After compilation, the corresponding deployment folder will be generated in the luckfox_pico_rknn_example/install directory (referred to as < Demo Dir >)

    luckfox_pico_retinaface_facenet_demo
    luckfox_pico_retinaface_facenet_spidev_pro_max_demo
    luckfox_pico_retinaface_facenet_spidev_plus_demo
    luckfox_pico_yolov5_demo
  2. Upload the complete < Demo Dir > to Luckfox Pico (using adb, ssh, etc.) and execute

    # Run on Luckfox Pico board, <Demo Target> is the executable in the deployment folder
    cd <Demo Dir>
    chmod a+x <Demo Target>
  3. luckfox_pico_retinaface_facenet

    ./luckfox_pico_retinaface_facenet <retinaface model> <facenet model> <reference image> 
    # Example: ./luckfox_pico_retinaface_facenet ./model/RetinaFace.rknn ./model/mobilefacenet.rknn ./model/test.jpg
  4. luckfox_pico_retinaface_facenet_spidev

    ./luckfox_pico_retinaface_facenet_spidev <retinaface model> <facenet model> <reference image>
    # Example: ./luckfox_pico_retinaface_facenet_spidev ./model/RetinaFace.rknn ./model/mobilefacenet.rknn ./model/test.jpg
  5. luckfox_pico_yolov5

    ./luckfox_pico_yolov5 <yolov5 model> 
    # Example: ./luckfox_pico_yolov5 ./model/yolov5.rknn

    Note:

    • Before running the demo, execute RkLunch-stop.sh to stop the default background program rkipc on the Luckfox Pico and release the camera.

    • The RKNN models and related configuration files are placed in < Demo Dir >/model for quick verification.

5.6.3 Example Results

The examples provide source code for object recognition and face recognition, which can be used as a reference for deploying other AI models. Both examples use opencv-mobile to capture images from the sc3336 camera and display the processed results on the screen.

  1. The face recognition example uses the RetinaFace model for face detection and the FaceNet model for face feature extraction, comparing with the reference face to calculate the Euclidean distance (the difference in feature values; a smaller value indicates a higher match).

  2. The object recognition example uses the Yolov5 convolutional neural network model to identify 80 types of objects and display their confidence scores.

6. RKNN_Model_Zoo Application Example

rknn_model_zoo provides deployment examples for various mainstream algorithms supported by RKNPU. Support for the RV1103/RV1106 used on the Luckfox Pico is currently limited; the latest examples cover Mobilenet and YOLO model deployment. This section introduces the use of the rknn_model_zoo examples, taking Yolov5 deployment as an example.

6.1 Export RKNN Model

  1. Download rknn_model_zoo

    git clone https://github.com/airockchip/rknn_model_zoo.git
  2. Obtain Yolov5 ONNX model file

    cd <rknn_model_zoo Path>/rknn_model_zoo/examples/yolov5/model
    chmod a+x download_model.sh
    ./download_model.sh
  3. Execute the model conversion program convert.py in the rknn_model_zoo/examples/yolov5/python directory. Usage:

    conda activate RKNN-Toolkit2
    cd <rknn_model_zoo Path>/rknn_model_zoo/examples/yolov5/python
    python3 convert.py ../model/yolov5s.onnx rv1106
    # output model will be saved as ../model/yolov5.rknn
    # General usage:
    python3 convert.py <onnx_model> <TARGET_PLATFORM> <dtype(optional)> <output_rknn_path(optional)>

    Parameter description:

    • <onnx_model>: Path to the ONNX model.
    • <TARGET_PLATFORM>: NPU platform name, e.g. "rv1106".
    • <dtype>: Optional. Can be i8 or fp; i8 means quantized, fp means not quantized. Default is i8.
    • <output_rknn_path>: Optional. Path where the RKNN model is saved. Defaults to the same directory as the ONNX model, with the name yolov5.rknn.

6.2 Compilation and Building

  1. After successfully converting the ONNX model to an RKNN model, cross-compile the example in the rknn_model_zoo/examples/yolov5 directory. Set the following environment variable before compiling:

    export GCC_COMPILER=<SDK path>/tools/linux/toolchain/arm-rockchip830-linux-uclibcgnueabihf/bin/arm-rockchip830-linux-uclibcgnueabihf
  2. Execute the build-linux.sh script in the rknn_model_zoo directory to compile the example:

    chmod +x ./build-linux.sh
    ./build-linux.sh -t rv1106 -a armv7l -d yolov5
    • Build output:

      (RKNN-Toolkit2) luckfox@luckfox:~/rknn_model_zoo$ ./build-linux.sh -t rv1106 -a armv7l -d yolov5
      ./build-linux.sh -t rv1106 -a armv7l -d yolov5
      /home/luckfox-pico/tools/linux/toolchain/arm-rockchip830-linux-uclibcgnueabihf/bin/arm-rockchip830-linux-uclibcgnueabihf
      ===================================
      BUILD_DEMO_NAME=yolov5
      BUILD_DEMO_PATH=examples/yolov5/cpp
      TARGET_SOC=rv1106
      TARGET_ARCH=armv7l
      BUILD_TYPE=Release
      ENABLE_ASAN=OFF
      INSTALL_DIR=/home/rknn_model_zoo/install/rv1106_linux_armv7l/rknn_yolov5_demo
      BUILD_DIR=/home/rknn_model_zoo/build/build_rknn_yolov5_demo_rv1106_linux_armv7l_Release
      CC=/home/luckfox-pico/tools/linux/toolchain/arm-rockchip830-linux-uclibcgnueabihf/bin/arm-rockchip830-linux-uclibcgnueabihf-gcc
      CXX=/home/luckfox-pico/tools/linux/toolchain/arm-rockchip830-linux-uclibcgnueabihf/bin/arm-rockchip830-linux-uclibcgnueabihf-g++
      ===================================
      -- Configuring done
      -- Generating done
      -- Build files have been written to: /home/rknn_model_zoo/build/build_rknn_yolov5_demo_rv1106_linux_armv7l_Release
      Consolidate compiler generated dependencies of target imagedrawing
      Consolidate compiler generated dependencies of target imageutils
      Consolidate compiler generated dependencies of target fileutils
      [ 40%] Built target fileutils
      [ 40%] Built target imagedrawing
      [ 60%] Built target imageutils
      Consolidate compiler generated dependencies of target rknn_yolov5_demo
      [100%] Built target rknn_yolov5_demo
      [ 20%] Built target imageutils
      [ 40%] Built target fileutils
      [ 60%] Built target imagedrawing
      [100%] Built target rknn_yolov5_demo
      Install the project...
      -- Install configuration: "Release"
      -- Installing: /home/rknn_model_zoo/install/rv1106_linux_armv7l/rknn_yolov5_demo/./rknn_yolov5_demo
      -- Set runtime path of "/home/rknn_model_zoo/install/rv1106_linux_armv7l/rknn_yolov5_demo/./rknn_yolov5_demo" to "$ORIGIN/lib"
      -- Installing: /home/rknn_model_zoo/install/rv1106_linux_armv7l/rknn_yolov5_demo/./model/bus.jpg
      -- Installing: /home/rknn_model_zoo/install/rv1106_linux_armv7l/rknn_yolov5_demo/./model/coco_80_labels_list.txt
      -- Installing: /home/rknn_model_zoo/install/rv1106_linux_armv7l/rknn_yolov5_demo/model/yolov5.rknn
      -- Installing: /home/rknn_model_zoo/install/rv1106_linux_armv7l/rknn_yolov5_demo/lib/librknnmrt.so
      -- Installing: /home/rknn_model_zoo/install/rv1106_linux_armv7l/rknn_yolov5_demo/lib/librga.so
  3. After cross-compilation, an install directory will be generated in the rknn_model_zoo directory, containing the compiled program and library files.

    (RKNN-Toolkit2) luckfox@luckfox:~/rknn_model_zoo/install/rv1106_linux_armv7l/rknn_yolov5_demo$ ls
    lib model rknn_yolov5_demo

6.3 Running the Program

  1. Transfer the entire rknn_yolov5_demo directory to the development board and then run the following commands:

    cd /root/rknn_yolov5_demo/
    ./rknn_yolov5_demo model/yolov5.rknn model/bus.jpg
  2. After inference, an image out.png will be generated

    # ls
    lib model out.png rknn_yolov5_demo

6.4 Experimental Results

  • Fruit inference test

  • Multi-object scene inference test

7. FAQ

  1. Error finding files when running yolov5_demo_test on the development board.

    Answer: Try executing `killall rkipc` or `RkLunch-stop.sh` to close the system's default `rkipc` program to release the camera before running the demo.