
[Hexagon] Enable Hexagon User DMA bypass mode #13381

Merged — 5 commits merged into apache:main on Nov 16, 2022

Conversation

@adstraw (Contributor) commented Nov 14, 2022

Enables Hexagon User DMA bypass mode based on user-specified dma_bypass_cache option for DMA copies between DDR and VTCM.

The upside of this change is increased DMA bandwidth (up to 40 GB/s observed using test_vtcm_bandwidth.py) and increased compute throughput using a 3-stage pipeline of cache read, compute, and cache write (up to 38 Gops using test_parallel_hvx_load_vtcm.py).

The downside of this change is the potential for data coherency issues, since the cache must be managed in software when using DMA bypass; hence the user-facing dma_bypass_cache option to enable or disable bypass mode.

The strategy to manage the cache in software centers around the requirement for Hexagon to operate on HexagonBuffer objects regardless of scope (DDR or VTCM). When copying to or from a HexagonBuffer we aggressively invalidate the cache for both the source and destination, both before and after the copy. Note that the copy is now implemented with memcpy instead of DMA. With the cache clean after a copy to or from a HexagonBuffer, we can now use DMA bypass mode. However, this software cache management strategy is NOT infallible: if a HexagonBuffer becomes dirty in the cache prior to a DMA with bypass mode enabled, we may see data coherency issues.

Also simplifies Hexagon DMA flows by removing the unused mem_copy intrinsic and its lowering, as well as the hexagon_user_dma_1d_sync helper function, which is replaced by calls to HexagonUserDMA::Copy and HexagonUserDMA::Wait.

@tvm-bot (Collaborator) commented Nov 14, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@csullivan (Contributor) left a comment


🎉 I know this one looks like a small change but getting it right was a huge feat 💪. Can't wait to see the performance numbers in-situ. Many many thanks @adstraw! Just a couple small CRs.

// VTCM -> VTCM
// NOTE: Cache bypass is disabled for VTCM -> VTCM transfers
ret = user_dma->Copy(queue_id, vtcm2hb.GetPointer(), vtcm1hb.GetPointer(), length, DISABLE_BYPASS);
Collaborator:

Since this is the enabled test, should it be ENABLE_BYPASS?

Edit - Ah, I see the comment. But you should be able to say ENABLED and have it still work; it just won't actually enable anything. Or, should it throw if you attempt to set bypass when it is VTCM to VTCM?

@adstraw (Contributor, Author) commented Nov 15, 2022

User can "enable" or "disable" bypass and it will still work in all cases. Here is how the code operates:

  1. Sets bypass ON for VTCM -> DDR or DDR -> VTCM when the user enables bypass
  2. Sets bypass OFF for VTCM -> VTCM or DDR -> DDR regardless of whether the user enables bypass

Not sure if we should (a) assert in case (2) when the user enables bypass, or (b) just silently turn bypass OFF. Currently I opt for (b), but I am open to feedback.

@adstraw adstraw force-pushed the straw-hex-dma-bypass branch from bd75db2 to 21063ff Compare November 15, 2022 22:27
@csullivan (Contributor) left a comment


LGTM! Thank you @adstraw!

@csullivan csullivan merged commit 14342a3 into apache:main Nov 16, 2022
@adstraw adstraw deleted the straw-hex-dma-bypass branch November 17, 2022 00:33
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
* restore vtcm tests; add TODO for ION buffer; check IsVtcm pointers