feat(ml): ML on Rockchip NPUs #15241
base: main
Conversation
Docker launch command:
and it works (as long as you download the model to the cache first, of course)
Nice work! This is a lot to get through and will need to go through more testing, but the core prediction logic looks pretty simple.
def _load_ort_session(self) -> None:
    # Spin up an ORT session only to read its input/output metadata, then
    # drop the session itself to free the memory it holds.
    self.ort_session = ort.InferenceSession(
        self.ort_model_path.as_posix(),
    )
    self.inputs: list[SessionNode] = self.ort_session.get_inputs()
    self.outputs: list[SessionNode] = self.ort_session.get_outputs()
    del self.ort_session

def get_inputs(self) -> list[SessionNode]:
    # Metadata is populated lazily on first access.
    try:
        return self.inputs
    except AttributeError:
        self._load_ort_session()
        return self.inputs

def get_outputs(self) -> list[SessionNode]:
    try:
        return self.outputs
    except AttributeError:
        self._load_ort_session()
        return self.outputs
Just raise NotImplementedError for now and change the recognition code to first check if ORT is being used before calling get_inputs.
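A minimal sketch of that suggestion, with illustrative names only (the SessionNode alias, the class name, and the helper are stand-ins, not this PR's code):

from typing import Any

import onnxruntime as ort

# Stand-in for the project's SessionNode alias used in the diff above.
SessionNode = Any


class RknnSession:
    def get_inputs(self) -> list[SessionNode]:
        # Don't emulate ORT's metadata API on the RKNN path for now.
        raise NotImplementedError

    def get_outputs(self) -> list[SessionNode]:
        raise NotImplementedError


def input_names(session: Any) -> list[str] | None:
    # Recognition-side guard (sketch): only introspect when ORT is in use.
    if isinstance(session, ort.InferenceSession):
        return [node.name for node in session.get_inputs()]
    return None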
Just a quick question—are we not running recognition on the NPU?
    run_options: Any = None,
) -> list[NDArray[np.float32]]:
    input_data: list[NDArray[np.float32]] = [np.ascontiguousarray(v) for v in input_feed.values()]
    self.rknnpool.put(input_data)
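For context, put/get pools like this usually follow the common multi-threaded RKNN pattern: a fixed set of runtime instances shared by a thread pool, with futures queued so results come back in submission order. A rough sketch of that pattern, assuming rknn-toolkit-lite2's RKNNLite API (load_rknn / init_runtime / inference); the class and method names are illustrative, not the code in this PR:

from concurrent.futures import Future, ThreadPoolExecutor
from queue import Queue

import numpy as np
from rknnlite.api import RKNNLite  # assumption: rknn-toolkit-lite2 is installed


class RknnPoolSketch:
    """Illustrative pool: N RKNNLite runtimes shared by a thread pool."""

    def __init__(self, model_path: str, tpes: int) -> None:
        self.runtimes: Queue[RKNNLite] = Queue()
        for _ in range(tpes):
            rt = RKNNLite()
            rt.load_rknn(model_path)
            rt.init_runtime()
            self.runtimes.put(rt)
        self.executor = ThreadPoolExecutor(max_workers=tpes)
        self.pending: Queue[Future] = Queue()

    def _infer(self, inputs: list[np.ndarray]) -> list[np.ndarray]:
        rt = self.runtimes.get()      # borrow a runtime instance
        try:
            return rt.inference(inputs=inputs)
        finally:
            self.runtimes.put(rt)     # return it for the next job

    def put(self, inputs: list[np.ndarray]) -> None:
        self.pending.put(self.executor.submit(self._infer, inputs))

    def get(self) -> list[np.ndarray]:
        # Results are retrieved in submission order.
        return self.pending.get().result()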
Have you tested that pool with different job concurrency and TPE settings and verified that there are no errors and that the results don't change based on these settings?
I just ran into one 😅, but I'm not sure where the limit is.
It should be added to the docs.
model: XLM-Roberta-Large-Vit-B-16Plus
visual tpe:2 running
error while trying to load the textual model (1)
concurrency: all default
I RKNN: [09:50:24.865] RKNN Runtime Information, librknnrt version: 2.3.0 (c949ad889d@2024-11-07T11:35:33)
I RKNN: [09:50:24.867] RKNN Driver Information, version: 0.9.8
I RKNN: [09:50:24.872] RKNN Model Information, version: 6, toolkit version: 2.3.0(compiler version: 2.3.0 (c949ad889d@2024-11-07T11:39:30)), target: RKNPU lite, target platform: rk3566, framework name: ONNX, framework layout: NCHW, model inference type: static_shape
E RKNN: [09:50:25.353] failed to allocate handle, ret: -1, errno: 12, errstr: Cannot allocate memory
E RKNN: [09:50:25.353] failed to malloc npu memory, size: 1132926656, flags: 0x2
E RKNN: [09:50:25.353] Import rknn model failed!
E RKNN: [09:50:25.353] rknn_init, load model failed!
And it looks like rknn doesn't support some operators:
E RKNN: [11:18:23.623] Unsupport CPU op: CumSum in this librknnrt.so, please try to register custom op by calling rknn_register_custom_ops or If using rknn, update to the latest toolkit2 and runtime from: https://console.zbox.filez.com/l/I00fc3 (PWD: rknn). If using rknn-llm, update from: /~https://github.com/airockchip/rknn-llm
We can register custom ops according to the RKNPU user guide, section 5.5.2.1, but probably not for now.
How much RAM do you have? It could be a simple case of not enough memory.
It's normal for it to not support every op. It can just fall back to ONNX for those models like ARM NN does.
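A rough illustration of that fallback shape (the paths, the wrapper class, and the function are assumptions, not this PR's code): try the RKNN model first and fall back to the plain ONNX Runtime session when it fails to load, e.g. on an unsupported op or an allocation error.

import logging
from pathlib import Path
from typing import Any

import onnxruntime as ort

log = logging.getLogger(__name__)


def load_session(model_dir: Path, rknn_session_cls: Any) -> Any:
    """Prefer the RKNN model; fall back to ONNX Runtime if it fails to load."""
    rknn_path = model_dir / "rk3566.rknn"  # hypothetical file layout
    onnx_path = model_dir / "model.onnx"
    if rknn_path.exists():
        try:
            return rknn_session_cls(rknn_path)
        except Exception as exc:  # e.g. rknn_init failure, unsupported op
            log.warning("RKNN load failed (%s); falling back to ONNX Runtime", exc)
    return ort.InferenceSession(onnx_path.as_posix())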
I have the 8 GB model of the Orange Pi 3B.
And here's the memory usage: I still have some memory available, so I don't think that's the case(?)
The upper graph is cache/buffers, the lower one is usage.
But there's an issue: it won't throw an error while converting the model, it just says "ooo not supported blah blah blah" in the logs.
Maybe I should write a bash script automating it?
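Whatever form the automation takes, one way to surface unsupported ops at conversion time instead of only in the runtime logs is to scan the ONNX graph up front. A sketch; the unsupported-op set is purely illustrative and would have to come from the RKNPU op-support matrix for the librknnrt version in use:

import sys

import onnx

# Illustrative only: fill from the RKNPU op-support matrix for your
# librknnrt version (CumSum is the one seen in the log above).
UNSUPPORTED_OPS = {"CumSum"}


def check_rknn_op_support(onnx_path: str) -> set[str]:
    """Return the op types in the model that RKNN can't run on the NPU."""
    model = onnx.load(onnx_path)
    used_ops = {node.op_type for node in model.graph.node}
    return used_ops & UNSUPPORTED_OPS


if __name__ == "__main__":
    offending = check_rknn_op_support(sys.argv[1])
    if offending:
        # Fail loudly instead of letting the conversion "succeed" with CPU ops.
        sys.exit(f"Unsupported ops for RKNN: {sorted(offending)}")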
Goals: ML on Rockchip NPUs.
Testing on board: #13243 (reply in thread)
TODO:
immich-app/ViT-B-32__openai/textual/rk3566.rknn
Nice to have:
Issues:
#13243