
feat(ml): ML on Rockchip NPUs #15241

Open · wants to merge 69 commits into main
Conversation

yoni13
Contributor

@yoni13 yoni13 commented Jan 11, 2025

Goal: run ML on Rockchip NPUs.
Testing on board: #13243 (reply in thread)

TODO:

  • It works on my Orange Pi 3B (RK3566).
  • Build Docker images.
  • Build models for more SoCs.
  • Allow setting the number of threads per model type (e.g. visual -> 2 threads) via environment variables.
  • NPU core masks for RK3576/RK3588.
  • Decide the model path (immich-app/ViT-B-32__openai/textual/rk3566.rknn).
  • Export script that accepts CLI arguments for the Immich model name and SoC, then exports the model.
  • Maximize NPU usage by using rknnpool.py.
  • Documentation.
  • Write tests.
  • Publish models on Hugging Face.
  • Make Docker images work out of the box (without downloading models manually).
  • Test on PC and other ARM-based boards to ensure it doesn't break anything.

Nice to have:

  • Rebase my commits (sorry for the ugly commit messages).
  • Test whether it works on RK3588 (I don't have one).
  • Support more models.

Issues:

  • Higher RAM usage: for the facial-recognition models we still need to load the ONNX model just to read its input and output metadata.

#13243

@yoni13
Contributor Author

yoni13 commented Jan 11, 2025

Docker launch command:

```
docker run --security-opt systempaths=unconfined --security-opt apparmor=unconfined --device /dev/dri --device /dev/dma_heap --device /dev/rga --device /dev/mpp_service -v /cache:/cache:ro -p 3004:3003 -v /sys/kernel/debug/:/sys/kernel/debug/:ro --name rknnimmich_name -d rknnimmich
```

and it works (provided the model has already been downloaded to the cache, of course).

With ViT-B-32 and buffalo_l each loaded with two threads and jobs rerunning: 2.7 GB RAM, peaking at 3.5 GB (effectively like running four models at once).
Update: these figures predate the change to load ONNX only when required; I'll update the memory usage when I have time.

@yoni13 yoni13 marked this pull request as ready for review January 14, 2025 12:20
Contributor

@mertalev mertalev left a comment


Nice work! This is a lot to get through and will need more testing, but the core prediction logic looks pretty simple.

Resolved review threads: machine-learning/app/models/base.py, machine-learning/app/sessions/rknn.py, docker/hwaccel.ml.yml, machine-learning/rknn/rknnpool.py
Comment on lines +41 to +61
```python
def _load_ort_session(self) -> None:
    # Load the ONNX model only to read its input/output metadata,
    # then delete the session immediately to free the memory.
    self.ort_session = ort.InferenceSession(
        self.ort_model_path.as_posix(),
    )
    self.inputs: list[SessionNode] = self.ort_session.get_inputs()
    self.outputs: list[SessionNode] = self.ort_session.get_outputs()
    del self.ort_session

def get_inputs(self) -> list[SessionNode]:
    # Lazily populate the metadata on first access.
    try:
        return self.inputs
    except AttributeError:
        self._load_ort_session()
        return self.inputs

def get_outputs(self) -> list[SessionNode]:
    try:
        return self.outputs
    except AttributeError:
        self._load_ort_session()
        return self.outputs
```
Contributor


Just raise NotImplementedError for now and change the recognition code to first check if ORT is being used before calling get_inputs.
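
A minimal sketch of what that suggestion might look like (the session and call-site names here are illustrative assumptions, not the PR's actual code):

```python
# In the RKNN session: metadata introspection is not supported yet.
def get_inputs(self) -> list[SessionNode]:
    raise NotImplementedError

def get_outputs(self) -> list[SessionNode]:
    raise NotImplementedError

# In the recognition code: only query metadata when running on ORT
# (hypothetical names; assumes `import onnxruntime as ort`).
if isinstance(session, ort.InferenceSession):
    input_name = session.get_inputs()[0].name
else:
    input_name = None  # RKNN models have their input shapes baked in
```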

Contributor Author


Just a quick question: are we not running recognition on the NPU?

Resolved review threads: machine-learning/app/config.py, machine-learning/Dockerfile
```python
    run_options: Any = None,
) -> list[NDArray[np.float32]]:
    input_data: list[NDArray[np.float32]] = [np.ascontiguousarray(v) for v in input_feed.values()]
    self.rknnpool.put(input_data)
```
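
For context, a complete `run()` built on the pool would presumably pair each `put()` with a matching blocking `get()`, roughly like this (the pool's `get()` method and its blocking semantics are assumptions about rknnpool.py, not confirmed from the diff):

```python
# Assumes: import numpy as np; from numpy.typing import NDArray; from typing import Any
def run(
    self,
    output_names: list[str] | None,
    input_feed: dict[str, NDArray[np.float32]],
    run_options: Any = None,
) -> list[NDArray[np.float32]]:
    # RKNN wants contiguous buffers; dict order stands in for named inputs.
    input_data = [np.ascontiguousarray(v) for v in input_feed.values()]
    self.rknnpool.put(input_data)
    # Hypothetical: block until a worker thread finishes this job.
    outputs: list[NDArray[np.float32]] = self.rknnpool.get()
    return outputs
```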
Contributor


Have you tested that pool with different job concurrency and TPE settings and verified that there are no errors and that the results don't change based on these settings?

Contributor Author

@yoni13 yoni13 Jan 15, 2025


I just hit one 😅, but I'm not sure where the limit is.
It should be added to the docs.

model: XLM-Roberta-Large-Vit-B-16Plus
visual TPE: 2, running
error while trying to load the textual model (1)
concurrency: all default

```
I RKNN: [09:50:24.865] RKNN Runtime Information, librknnrt version: 2.3.0 (c949ad889d@2024-11-07T11:35:33)
I RKNN: [09:50:24.867] RKNN Driver Information, version: 0.9.8
I RKNN: [09:50:24.872] RKNN Model Information, version: 6, toolkit version: 2.3.0(compiler version: 2.3.0 (c949ad889d@2024-11-07T11:39:30)), target: RKNPU lite, target platform: rk3566, framework name: ONNX, framework layout: NCHW, model inference type: static_shape
E RKNN: [09:50:25.353] failed to allocate handle, ret: -1, errno: 12, errstr: Cannot allocate memory
E RKNN: [09:50:25.353] failed to malloc npu memory, size: 1132926656, flags: 0x2
E RKNN: [09:50:25.353] Import rknn model failed!
E RKNN: [09:50:25.353] rknn_init, load model failed!
```


Contributor Author

@yoni13 yoni13 Jan 15, 2025


And it looks like RKNN doesn't support some operators:

```
E RKNN: [11:18:23.623] Unsupport CPU op: CumSum in this librknnrt.so, please try to register custom op by calling rknn_register_custom_ops or If using rknn, update to the latest toolkit2 and runtime from: https://console.zbox.filez.com/l/I00fc3 (PWD: rknn). If using rknn-llm, update from: /~https://github.com/airockchip/rknn-llm
```

We could register custom ops per RKNPU user guide §5.5.2.1, but probably not for now.

Contributor


> I just hit one 😅, but I'm not sure where the limit is. [quotes the XLM-Roberta-Large-Vit-B-16Plus memory-allocation failure above]

How much RAM do you have? It could be a simple case of not enough memory.

Contributor


> And it looks like RKNN doesn't support some operators... [quotes the unsupported CumSum op log above]

It's normal for it to not support every op. It can just fall back to ONNX for those models like ARM NN does.
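
As a rough sketch of what that fallback could look like (the file names, wrapper classes, and selection logic here are all illustrative assumptions, not the PR's actual code):

```python
from pathlib import Path

def pick_session(model_dir: Path):
    # Hypothetical: use the precompiled RKNN model when one exists for
    # this SoC; otherwise fall back to ONNX Runtime, the way models
    # with unsupported ops fall back under ARM NN.
    rknn_path = model_dir / "rk3566.rknn"
    if rknn_path.exists():
        return RknnSession(rknn_path)  # assumed wrapper from this PR
    return OrtSession(model_dir / "model.onnx")  # assumed ORT wrapper
```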

Contributor Author

@yoni13 yoni13 Jan 15, 2025


> How much RAM do you have? It could be a simple case of not enough memory.

I have the 8 GB model of the Orange Pi 3B.
And the memory usage:

[screenshot: memory usage graphs]

I still have some memory available, so I don't think that's the case(?)

The upper graph shows cache/buffers; the lower one shows usage.

Contributor Author


> It's normal for it to not support every op. It can just fall back to ONNX for those models like ARM NN does.

But there's an issue: converting the model doesn't throw an error for unsupported ops; it just prints a "not supported" warning in the logs.

Maybe I should write a bash script automating it?
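
Something along these lines could catch it automatically, sketched in Python rather than bash for brevity (the export command and the log pattern matched here are assumptions based on the "Unsupport CPU op" message above):

```python
import re
import subprocess
import sys

def convert_and_check(cmd: list[str]) -> None:
    # Run the export and fail loudly if the toolkit merely *warns*
    # about unsupported ops instead of raising an error.
    proc = subprocess.run(cmd, capture_output=True, text=True)
    output = proc.stdout + proc.stderr
    # Assumed pattern, based on the "Unsupport CPU op: CumSum" log above.
    unsupported = re.findall(r"Unsupport\w* (?:CPU )?op: (\w+)", output)
    if proc.returncode != 0 or unsupported:
        sys.exit(f"Conversion failed or hit unsupported ops: {unsupported}")

# Hypothetical invocation of this PR's export script:
convert_and_check([sys.executable, "rknn/export/build_rknn.py", "ViT-B-32__openai", "rk3566"])
```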

Resolved review thread: machine-learning/rknn/export/build_rknn.py
Labels: changelog:feature · documentation · 🧠machine-learning