Qwen-VL

Download link for the RK inference model files: https://pan.baidu.com/s/1HFSD9-JyLCB9J4ar0EMOwQ (extraction code: root)

1. Build the executables

1. Enter the source directory

cd ~/rknn-llm/examples/Qwen2-VL_Demo/deploy

2. Change the cross-compilation toolchain

vi build-linux.sh

Change the original contents:

set -e
rm -rf build
mkdir build && cd build

GCC_COMPILER=~/opts/gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu
cmake .. -DCMAKE_CXX_COMPILER=${GCC_COMPILER}/bin/aarch64-none-linux-gnu-g++  \
        -DCMAKE_C_COMPILER=${GCC_COMPILER}/bin/aarch64-none-linux-gnu-gcc \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_SYSTEM_NAME=Linux \
        -DCMAKE_SYSTEM_PROCESSOR=aarch64 \

make -j8
make install

to:

set -e
rm -rf build
mkdir build && cd build

GCC_COMPILER=aarch64-linux-gnu
cmake .. -DCMAKE_CXX_COMPILER=${GCC_COMPILER}-g++  \
        -DCMAKE_C_COMPILER=${GCC_COMPILER}-gcc \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_SYSTEM_NAME=Linux \
        -DCMAKE_SYSTEM_PROCESSOR=aarch64 \

make -j8
make install
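The modified script relies on aarch64-linux-gnu-gcc, aarch64-linux-gnu-g++, cmake, and make being on PATH. As a minimal sketch (the helper name missing_tools is hypothetical, not part of the demo), the following function reports which of a list of tools cannot be found, so a missing compiler shows up before the build starts:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of rknn-llm): print the names of the given
# tools that cannot be found on PATH, separated by spaces.
missing_tools() {
    local tool missing=""
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the tool
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="${missing:+$missing }$tool"
        fi
    done
    printf '%s\n' "$missing"
}
```

Calling missing_tools aarch64-linux-gnu-gcc aarch64-linux-gnu-g++ cmake make prints an empty line when everything the script needs is installed.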

3. Install cmake

sudo apt install cmake -y

4. Make the script executable and run the build

chmod +x build-linux.sh
./build-linux.sh

Build output:

 baiwen@dshanpi-a1:~/rknn-llm/examples/Qwen2-VL_Demo/deploy$ ./build-linux.sh
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/aarch64-linux-gnu-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/aarch64-linux-gnu-g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenCV: /home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/3rdparty/opencv-linux-aarch64 (found version "3.4.5")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/build
[ 12%] Building CXX object CMakeFiles/llm.dir/src/llm.cpp.o
[ 25%] Building CXX object CMakeFiles/demo.dir/src/image_enc.cc.o
[ 37%] Building CXX object CMakeFiles/demo.dir/src/main.cpp.o
[ 50%] Building CXX object CMakeFiles/imgenc.dir/src/image_enc.cc.o
[ 62%] Building CXX object CMakeFiles/imgenc.dir/src/img_encoder.cpp.o
cc1plus: warning: /home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/src/image_enc.h: not a directory
cc1plus: warning: /home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/src/image_enc.h: not a directory
cc1plus: warning: /home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/src/image_enc.h: not a directory
cc1plus: warning: /home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/src/image_enc.h: not a directory
cc1plus: warning: /home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/src/image_enc.h: not a directory
[ 75%] Linking CXX executable llm
[ 75%] Built target llm
[ 87%] Linking CXX executable imgenc
[100%] Linking CXX executable demo
[100%] Built target imgenc
[100%] Built target demo
Consolidate compiler generated dependencies of target imgenc
[ 37%] Built target imgenc
Consolidate compiler generated dependencies of target llm
[ 62%] Built target llm
Consolidate compiler generated dependencies of target demo
[100%] Built target demo
Install the project...
-- Install configuration: "Release"
-- Installing: /home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/install/demo_Linux_aarch64/./imgenc
-- Set runtime path of "/home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/install/demo_Linux_aarch64/./imgenc" to ""
-- Installing: /home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/install/demo_Linux_aarch64/./llm
-- Set runtime path of "/home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/install/demo_Linux_aarch64/./llm" to ""
-- Installing: /home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/install/demo_Linux_aarch64/./demo
-- Set runtime path of "/home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/install/demo_Linux_aarch64/./demo" to ""
-- Installing: /home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/install/demo_Linux_aarch64/lib/librknnrt.so
-- Installing: /home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/install/demo_Linux_aarch64/lib/librkllmrt.so
-- Installing: /home/baiwen/rknn-llm/examples/Qwen2-VL_Demo/deploy/install/demo_Linux_aarch64/./demo.jpg
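The install log above lists the files that end up in install/demo_Linux_aarch64. A small sketch of a post-build check follows; the function name missing_artifacts is hypothetical, and the file list is taken from the "Installing" lines of the log, so it may differ for other versions of the demo:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every expected build artifact that is missing
# from the given install directory (one path per line, nothing if complete).
missing_artifacts() {
    local dir="$1" f
    # File list copied from the "-- Installing:" lines of the build log.
    for f in demo imgenc llm demo.jpg lib/librknnrt.so lib/librkllmrt.so; do
        [ -e "$dir/$f" ] || printf '%s\n' "$f"
    done
}
```

For example, missing_artifacts install/demo_Linux_aarch64 prints nothing when the build and install steps completed.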

5. Enter the directory containing the executables

cd install/demo_Linux_aarch64/

6. Transfer the pre-trained, converted model files to this directory on the board

baiwen@dshanpi-a1:~/rknn-llm/examples/Qwen2-VL_Demo/deploy/install/demo_Linux_aarch64$ ls
demo      imgenc  llm                               qwen2.5-vl-3b-w4a16_level1_rk3576.rkllm  Qwen3-1.7B-rk3576-w4a16.rkllm
demo.jpg  lib     qwen2_5_vl_3b_vision_rk3576.rknn  Qwen3-0.6B-rk3576-w4a16.rkllm            Qwen3-4B-rk3576-w4a16.rkllm

2. Model inference

2.1 Qwen2.5-vl-3B

1. Set the library path and environment variables

export LD_LIBRARY_PATH=./lib
export RKLLM_LOG_LEVEL=1

2. Run the program

./llm ./qwen2.5-vl-3b-w4a16_level1_rk3576.rkllm 128 512

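argv[1], model_path: path to the model file converted with RKLLM, for example ./qwen2.5-vl-3b-w4a16_level1_rk3576.rkllm.

argv[2], max_new_tokens: maximum number of tokens (an integer) to output for each generated answer, for example 512. The command above uses 128.

argv[3], max_context_len: maximum context length the model accepts at once (prompt plus text generated so far), for example 2048. The command above uses 512.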
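The three positional arguments can be checked before launching the program. The wrapper below is only a sketch under stated assumptions: the name llm_cmd is hypothetical, and the integer check is an illustration rather than something the demo itself enforces. It validates the arguments and prints the command line instead of running it:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: validate model_path, max_new_tokens and
# max_context_len, then print the ./llm command line (it does not run it).
llm_cmd() {
    local n
    if [ "$#" -ne 3 ]; then
        echo "usage: llm_cmd model_path max_new_tokens max_context_len" >&2
        return 2
    fi
    # Both token limits must be plain non-negative integers.
    for n in "$2" "$3"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "not an integer: $n" >&2
                return 2
                ;;
        esac
    done
    printf './llm %s %s %s\n' "$1" "$2" "$3"
}
```

For instance, llm_cmd ./Qwen3-0.6B-rk3576-w4a16.rkllm 128 512 prints the same command used in section 2.2.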

Example output: [screenshot]

2.2 Qwen3-0.6B

1. Set the library path and environment variables

export LD_LIBRARY_PATH=./lib
export RKLLM_LOG_LEVEL=1

2. Run the program

./llm ./Qwen3-0.6B-rk3576-w4a16.rkllm 128 512

Example output: [screenshot]

2.3 Qwen3-1.7B

1. Set the library path and environment variables

export LD_LIBRARY_PATH=./lib
export RKLLM_LOG_LEVEL=1

2. Run the program

./llm ./Qwen3-1.7B-rk3576-w4a16.rkllm 128 512

Example output: [screenshot]

2.4 Qwen3-4B

1. Set the library path and environment variables

export LD_LIBRARY_PATH=./lib
export RKLLM_LOG_LEVEL=1

2. Run the program

./llm ./Qwen3-4B-rk3576-w4a16.rkllm 128 512

Example output: [screenshot]

3. Multimodal model deployment

Note: this program must run on hardware with more than 6 GB of memory. With less than 6 GB it may fail to run!
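The memory requirement can be checked on the board before starting the demo. The following is a minimal sketch that reads MemTotal from /proc/meminfo; the function names are hypothetical, and the 6 GiB default simply mirrors the requirement stated above. Note that MemTotal is usually somewhat lower than the nominal RAM size, because the kernel reserves part of it.

```shell
#!/usr/bin/env bash
# Hypothetical helpers. Both accept an optional meminfo-style file
# (default: /proc/meminfo), which also makes them easy to test.

# Print total memory in GiB with two decimals.
mem_total_gib() {
    awk '/^MemTotal:/ { printf "%.2f\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Succeed only if total memory is strictly greater than the threshold
# in GiB (second argument, default 6).
has_enough_memory() {
    awk -v min_gib="${2:-6}" '
        /^MemTotal:/ { ok = ($2 / 1024 / 1024 > min_gib) }
        END { exit (ok ? 0 : 1) }
    ' "${1:-/proc/meminfo}"
}
```

A possible use is has_enough_memory || echo "less than 6 GiB of RAM, the demo may fail" before launching ./demo.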

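This demo runs two models together: the qwen2.5-vl-3b-w4a16_level1_rk3576.rkllm text model and the qwen2_5_vl_3b_vision_rk3576.rknn image model. Combined, they recognize the content of an image and reply in text.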

3.1 Modify the source code

Go to the source directory ~/rknn-llm/examples/Qwen2-VL_Demo/deploy/src and edit main.cpp, changing:

#define EMBED_SIZE 1536

to:

#define EMBED_SIZE 2048

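Reason: the Qwen2-VL and Qwen2.5-VL models come in several parameter sizes, such as 2B, 3B, and 7B. The dimension of the image feature vector output by the vision encoder (Vision Encoder + Projector) differs by size: 1536 for the 2B model, 2048 for the 3B model, and 3584 for the 7B model. This guide uses the 3B model, so EMBED_SIZE must be set to 2048.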
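The edit can also be scripted. The sketch below is an illustration only: the function names embed_size_for and set_embed_size are hypothetical, the size table contains just the three values listed above, and the sed pattern assumes the macro sits on a single line of the form "#define EMBED_SIZE <number>".

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a model size to the feature dimension given above.
embed_size_for() {
    case "$1" in
        2B|2b) echo 1536 ;;
        3B|3b) echo 2048 ;;
        7B|7b) echo 3584 ;;
        *) echo "unknown model size: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: rewrite the EMBED_SIZE macro in a source file.
# Assumes a single-line definition: #define EMBED_SIZE <number>
set_embed_size() {
    local file="$1" size="$2" tmp
    tmp=$(mktemp) || return 1
    sed -E "s/^([[:space:]]*#define[[:space:]]+EMBED_SIZE[[:space:]]+)[0-9]+/\1${size}/" \
        "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back through the original file so its permissions are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For the 3B model used here, this would be called as set_embed_size src/main.cpp "$(embed_size_for 3B)" from the deploy directory, followed by a rebuild.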

After the change, the source must be rebuilt. Return to the build directory:

cd ~/rknn-llm/examples/Qwen2-VL_Demo/deploy

Run the build script:

./build-linux.sh

Enter the directory containing the executables:

cd install/demo_Linux_aarch64/

Transfer the pre-trained, converted model files to this directory on the board:

baiwen@dshanpi-a1:~/rknn-llm/examples/Qwen2-VL_Demo/deploy/install/demo_Linux_aarch64$ ls
demo  demo.jpg  imgenc  lib  llm  qwen2_5_vl_3b_vision_rk3576.rknn  qwen2.5-vl-3b-w4a16_level1_rk3576.rkllm

3.2 Run the program

1. Set the library path and environment variables

export LD_LIBRARY_PATH=./lib
export RKLLM_LOG_LEVEL=1

2. Run the program

./demo ./demo.jpg ./qwen2_5_vl_3b_vision_rk3576.rknn ./qwen2.5-vl-3b-w4a16_level1_rk3576.rkllm 128 512 3

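The startup output is shown below. The last argument (3) is the NPU core count, as the log line "the core num is 3" shows. The model reports npu_core_num: 2 for the RK3576, which explains the rknn_set_core_mask error in the log; passing 2 instead should avoid that error.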

baiwen@dshanpi-a1:~/rknn-llm/examples/Qwen2-VL_Demo/deploy/install/demo_Linux_aarch64$ ./demo ./demo.jpg ./qwen2_5_vl_3b_vision_rk3576.rknn ./qwen2.5-vl-3b-w4a16_level1_rk3576.rkllm 128 512 3
I rkllm: rkllm-runtime version: 1.2.1, rknpu driver version: 0.9.8, platform: RK3576
I rkllm: loading rkllm model from ./qwen2.5-vl-3b-w4a16_level1_rk3576.rkllm
I rkllm: rkllm-toolkit version: 1.2.1, max_context_limit: 4096, npu_core_num: 2, target_platform: RK3576, model_dtype: W4A16
I rkllm: Enabled cpus: [4, 5, 6, 7]
I rkllm: Enabled cpus num: 4
I rkllm: Using mrope
rkllm init success
main: LLM Model loaded in  7346.11 ms
===the core num is 3===
E RKNN: [08:04:38.229] rknn_set_core_mask: unavailable core mask found for current platform! max core mask = 3
model input num: 1, output num: 1
input tensors:
  index=0, name=pixel, n_dims=4, dims=[1, 392, 392, 3], n_elems=460992, size=921984, fmt=NHWC, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000
output tensors:
  index=0, name=34548, n_dims=2, dims=[196, 2048, 0, 0], n_elems=401408, size=802816, fmt=UNDEFINED, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000
model input height=392, width=392, channel=3
main: ImgEnc Model loaded in  7752.90 ms
I rkllm: reset chat template:
I rkllm: system_prompt: <|im_start|>system\nYou are a helpful assistant.<|im_end|>\n
I rkllm: prompt_prefix: <|im_start|>user\n
I rkllm: prompt_postfix: <|im_end|>\n<|im_start|>assistant\n
W rkllm: Calling rkllm_set_chat_template will disable the internal automatic chat template parsing, including enable_thinking. Make sure your custom prompt is complete and valid.

**********************可输入以下问题对应序号获取回答/或自定义输入********************

[0] <image>What is in the image?
[1] <image>这张图片中有什么？

*************************************************************************

user:

To have the model analyze the content of the image, enter the following:

<image>What is in the image?

A sample result:

user: <image>What is in the image?
robot: The image appears to be a digitally created or manipulated scene featuring an astronaut on the moon's surface, lying down with their head resting on what seems to be a helmet or spacecraft part. The astronaut is holding a bottle of beer and has another bottle next to them. In the background, there are elements such as Earth visible through a crater in the lunar landscape, and some industrial structures like stairs leading up to an object that resembles a ladder or ramp.

The overall setting suggests a humorous or surreal take on space exploration, blending elements of everyday life with the vastness of outer space.
I rkllm: --------------------------------------------------------------------------------------
I rkllm:  Model init time (ms)  5482.81
I rkllm: --------------------------------------------------------------------------------------
I rkllm:  Stage         Total Time (ms)  Tokens    Time per Token (ms)      Tokens per Second
I rkllm: --------------------------------------------------------------------------------------
I rkllm:  Prefill       4143.53          223       18.58                    53.82
I rkllm:  Generate      20911.60         116       180.27                   5.55
I rkllm: --------------------------------------------------------------------------------------
I rkllm:  Peak Memory Usage (GB)
I rkllm:  4.55
I rkllm: --------------------------------------------------------------------------------------
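The last two columns of the performance table follow directly from the first two: time per token is the total time divided by the token count, and tokens per second is the token count divided by the total time in seconds. A small sketch that reproduces the figures above (the function names are hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical helpers. Arguments: total time in ms, number of tokens.

# Milliseconds spent per token.
ms_per_token() {
    awk -v ms="$1" -v n="$2" 'BEGIN { printf "%.2f\n", ms / n }'
}

# Tokens processed per second.
tokens_per_second() {
    awk -v ms="$1" -v n="$2" 'BEGIN { printf "%.2f\n", n / (ms / 1000) }'
}

ms_per_token 4143.53 223        # Prefill row:  18.58
tokens_per_second 4143.53 223   # Prefill row:  53.82
ms_per_token 20911.60 116       # Generate row: 180.27
tokens_per_second 20911.60 116  # Generate row: 5.55
```

In this run the Generate stage, at about 5.55 tokens per second, accounts for most of the response time.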
