Add CUDA acceleration #933

SamCarlberg · 2019-04-10T02:08:10Z

Overview

Adds CUDA support to the GRIP runtime. A CUDA-enabled build will set a flag in MANIFEST.MF to let the runtime know that it is using OpenCV builds with CUDA extensions; this is required since the CUDA-enabled OpenCV binaries cannot load if there is no compatible CUDA runtime installed on the system.
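
As a rough illustration of the manifest flag described above, the check could look something like the sketch below. The attribute name `Uses-Cuda` and the class `CudaManifestFlag` are assumptions made up for this example; the PR does not state the actual attribute name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

/** Sketch only: "Uses-Cuda" is a hypothetical attribute name, not the one GRIP writes. */
final class CudaManifestFlag {
  static final String ATTRIBUTE = "Uses-Cuda";

  /** Returns true only if the main section sets the attribute to "true". */
  static boolean usesCuda(Manifest manifest) {
    // getValue returns null when the attribute is absent; parseBoolean(null) is false.
    return Boolean.parseBoolean(manifest.getMainAttributes().getValue(ATTRIBUTE));
  }

  /** Parses manifest text held in memory, so the sketch runs without a JAR file. */
  static Manifest parse(String text) {
    try {
      return new Manifest(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A build without the attribute is treated as CPU-only here, which seems like the safer default when the flag is missing.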

Add a MatWrapper class for semi-transparently working with an image on CPU and GPU memory. Data is lazily copied between host and device when needed.
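
To make the lazy-copy behaviour concrete, here is a minimal sketch that does not use OpenCV at all: two `int[]` arrays stand in for the CPU `Mat` and the `GpuMat`, and a copy happens only when the side being read is out of date. The class `LazyBuffer` and all of its method names are hypothetical and are not the real MatWrapper API.

```java
import java.util.function.Function;

/**
 * Stand-in for the lazy-copy idea: int[] buffers play the CPU Mat and the GpuMat.
 * Every name in this class is hypothetical.
 */
final class LazyBuffer {
  private final int[] cpu;
  private final int[] gpu;
  private boolean cpuStale = false; // CPU copy is current after construction
  private boolean gpuStale = true;  // nothing has been uploaded yet
  private int copies = 0;           // number of simulated host/device transfers

  LazyBuffer(int[] initial) {
    cpu = initial.clone();
    gpu = new int[initial.length];
  }

  /** Returns the host buffer, "downloading" first only if the device wrote last. */
  int[] readCpu() {
    if (cpuStale) {
      System.arraycopy(gpu, 0, cpu, 0, cpu.length);
      cpuStale = false;
      copies++;
    }
    return cpu;
  }

  /** Returns the device buffer, "uploading" first only if the host wrote last. */
  int[] readGpu() {
    if (gpuStale) {
      System.arraycopy(cpu, 0, gpu, 0, gpu.length);
      gpuStale = false;
      copies++;
    }
    return gpu;
  }

  /** Overwrites the host data and marks the device copy as out of date. */
  void writeCpu(int[] data) {
    System.arraycopy(data, 0, cpu, 0, cpu.length);
    cpuStale = false;
    gpuStale = true;
  }

  /** Overwrites the device data and marks the host copy as out of date. */
  void writeGpu(int[] data) {
    System.arraycopy(data, 0, gpu, 0, gpu.length);
    gpuStale = false;
    cpuStale = true;
  }

  /** Applies one of two functions to whichever side is current, without copying. */
  <T> T map(Function<int[], T> ifCpu, Function<int[], T> ifGpu) {
    return cpuStale ? ifGpu.apply(gpu) : ifCpu.apply(cpu);
  }

  int copyCount() {
    return copies;
  }
}
```

Repeated reads from the same side cost nothing; a transfer is only paid when an operation on the other side needs the data.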

Add fixed exit codes for use with SafeShutdown, to keep the codes organized. Add an exit code for a missing CUDA runtime in CUDA-enabled builds.

Add a flagChanged method to OutputSocket to avoid hacky calls to socket.setValue(socket.getValue().get())
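
A minimal sketch of what flagChanged buys, assuming a much simplified socket: when the held value is a mutable object that was modified in place, listeners can be notified without re-setting the same reference. `SketchSocket` and its listener signature are hypothetical and are not GRIP's socket classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Consumer;

/** Simplified, hypothetical socket; GRIP's real sockets have more moving parts. */
final class SketchSocket<T> {
  private final List<Consumer<Optional<T>>> listeners = new ArrayList<>();
  private T value;

  void addListener(Consumer<Optional<T>> listener) {
    listeners.add(listener);
  }

  Optional<T> getValue() {
    return Optional.ofNullable(value);
  }

  /** Replaces the value and notifies listeners. */
  void setValue(T newValue) {
    value = newValue;
    flagChanged();
  }

  /** Notifies listeners that the current value was mutated in place. */
  void flagChanged() {
    Optional<T> current = getValue();
    for (Consumer<Optional<T>> listener : listeners) {
      listener.accept(current);
    }
  }
}
```

With this shape, an operation that writes into an existing output object calls `socket.flagChanged()` instead of `socket.setValue(socket.getValue().get())`.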

Add a partial InputSocket implementation, CudaSocket, which contains a boolean value to flag whether CUDA acceleration is available. It will always contain false if CUDA is not available, whether because the runtime is missing or because a CPU-only build of OpenCV is used. It is displayed as a regular checkbox, which is disabled if CUDA is unavailable.
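
The "always false when CUDA is unavailable" rule can be sketched in a few lines. `CudaToggle` is a hypothetical stand-in, not the actual CudaSocket, and how availability is detected is left to the caller.

```java
/** Hypothetical stand-in for the CUDA flag socket described above. */
final class CudaToggle {
  private final boolean cudaAvailable;
  private boolean value = false;

  CudaToggle(boolean cudaAvailable) {
    this.cudaAvailable = cudaAvailable;
  }

  /** The UI checkbox would be enabled only when this returns true. */
  boolean isEditable() {
    return cudaAvailable;
  }

  /** Requests to enable CUDA are ignored when no usable runtime exists. */
  void setValue(boolean requested) {
    value = cudaAvailable && requested;
  }

  boolean getValue() {
    return value;
  }
}
```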

TODO

Windows CUDA runtime detection
Mac CUDA runtime detection
Linux CUDA runtime detection
Confirm expected behavior of CUDA-enabled builds on systems without CUDA
Investigate more use cases for CUDA acceleration (e.g. cascade classifiers)

Updated Operations with CUDA Support

BlurOperation: Can CUDA-accelerate Gaussian blurs and bilateral filters. Box and median blurs are still CPU-only. The bilateral filter is 10x-100x faster with CUDA, depending on blur radius.
CannyEdgeOperation
DesaturateOperation
NormalizeOperation (single-channel inputs only)
CV Absdiff
CV add
CV addWeighted
CV bitwise_and
CV bitwise_not
CV bitwise_or
CV bitwise_xor
CV compare
CV subtract
CV cvtColor
CV Sobel (~15x speedup versus CPU)
CV Threshold

Note: speedups of CUDA versus CPU are as measured on my desktop computer:

Component	Name	Speed
CPU	i7-7700k	4800MHz
RAM	2x16GB DDR4	2133MHz
GPU	GTX 980Ti	1450MHz core/3750MHz mem


JLLeitschuh · 2019-04-10T14:41:07Z

core/core.gradle.kts

+
+if (withCuda) {
+    version = "$version-cuda"
+}


Is the plan to just build a different version of GRIP with CUDA support?

Yeah. CUDA-enabled OpenCV will fail to load if there's no CUDA runtime available, and I'm not sure if it's possible to trick JavaCPP into loading the non-CUDA version at runtime

Classloader hackiness? A thought you could experiment with after this PR is merged.

JLLeitschuh · 2019-04-10T14:42:35Z

core/src/main/java/edu/wpi/grip/core/MatWrapper.java

+ * accessed from CPU land when they have been most recently used from CUDA code.
+ */
+@SuppressWarnings("PMD.GodClass")
+public final class MatWrapper {


Is this class thread safe? Does it need to be?

No, and I don't think it needs to be. Mat is not thread safe and we've been fine so far with our existing synchronization constructs.

JLLeitschuh · 2019-04-10T14:43:57Z

core/src/main/java/edu/wpi/grip/core/MatWrapper.java

+    } else {
+      return ifGpu.apply(gpuMat);
+    }
+  }


👍 Clean & straightforward!

JLLeitschuh · 2019-04-10T17:23:36Z

core/src/main/java/edu/wpi/grip/core/sockets/CudaSocket.java

+ * CUDA-accelerated code path. If no compatible CUDA runtime is available, sockets of this type
+ * will <i>always</i> have a value of {@code false} and cannot be changed.
+ */
+public class CudaSocket extends InputSocketImpl<Boolean> {


CudaInputSocket?

JLLeitschuh · 2019-04-10T17:25:48Z

core/src/main/java/edu/wpi/grip/core/util/SafeShutdown.java

+     * CUDA is required by OpenCV but no compatible runtime is available on the system.
+     */
+    public static final int CUDA_UNAVAILABLE = 0x04;
+  }


Prefer an enum with code field on it over this.
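
For illustration, an enum with a code field might look like the sketch below. Only `CUDA_UNAVAILABLE` and its value `0x04` come from the diff above; the other constants and the accessor name are placeholders invented for this example.

```java
/** Sketch of exit codes as an enum; only CUDA_UNAVAILABLE (0x04) is from the diff. */
enum ExitCode {
  OK(0x00),               // placeholder, not from the PR
  GENERAL_ERROR(0x01),    // placeholder, not from the PR
  CUDA_UNAVAILABLE(0x04); // value taken from the diff above

  private final int code;

  ExitCode(int code) {
    this.code = code;
  }

  /** The raw process exit status for this code. */
  int getCode() {
    return code;
  }
}
```

A shutdown helper could then take the enum and call `System.exit(exitCode.getCode())`, so raw integers never appear at call sites.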

core/src/test/java/edu/wpi/grip/core/serialization/ProjectTest.java

JLLeitschuh · 2019-04-10T17:29:21Z

So far, everything looks good. Poke me when you want me to take another pass.


JLLeitschuh · 2019-04-11T13:34:14Z

core/src/main/java/edu/wpi/grip/core/Loggers.java

+import java.util.logging.SimpleFormatter;
+import java.util.logging.StreamHandler;
+
+public final class Loggers {


Solid change.

JLLeitschuh · 2019-04-11T13:36:36Z

core/src/main/java/edu/wpi/grip/core/GripCudaModule.java

+    }
+
+    bind(CudaVerifier.class).in(Scopes.SINGLETON);
+  }


Do you want to move this logic to a construction phase? For example, a GripCudaModule.create() method, so that there's no dynamic logic inside of configure?

Related: https://github.com/google/guice/wiki/AvoidConditionalLogicInModules

JLLeitschuh · 2019-04-11T13:37:29Z

core/src/test/java/edu/wpi/grip/core/cuda/CudaVerifierTest.java

+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertTrue;
+
+public class CudaVerifierTest {


IoC and composition make this a pretty trivial class 🙂


codecov-io · 2019-04-12T04:07:05Z

Codecov Report

Merging #933 into master will decrease coverage by 1.5%.
The diff coverage is 39.38%.

@@             Coverage Diff              @@
##             master     #933      +/-   ##
============================================
- Coverage     54.24%   52.73%   -1.51%     
  Complexity        1        1              
============================================
  Files           307      325      +18     
  Lines          8372     8853     +481     
  Branches        542      563      +21     
============================================
+ Hits           4541     4669     +128     
- Misses         3633     3979     +346     
- Partials        198      205       +7


JLLeitschuh

LGTM!

SamCarlberg added 21 commits April 8, 2019 16:10

Autodiscover operations at startup. Use annotations for descriptions

0868182

Re-enable CompatibilityTest. Fix some backwards compatibility issues. Not really a fan of injecting the Injector into the Operations class. Need to see if there's a better solution

Add CUDA support

bb7afe1

Currently requires custom CUDA bindings from javacpp-presets (see bytedeco/javacpp-presets#416). Trying to use CUDA-accelerated operations without an NVIDIA GPU and running NVIDIA drivers will probably crash the app

Fix tests

60c5fe5

ProjectTest.testPerformSerializedPipelineWithMats is broken... don't know exactly why, but opencv_core.compare is segfaulting

Add OutputSocket.flagChanged()

df1b684

No more socket.setValue(socket.getValue()) hacks

Leftover fixes

98fac34

Allow CUDA versions to be optionally specified with -Pcuda

bc6a2be

Or -PWITH_CUDA

Add detection of CUDA runtime

4619781

Add create() and put() overloads to MatWrapper

62e25d7

Add special CUDA socket

988d886

Socket is disabled when CUDA is unavailable

Exit when CUDA runtime is required but not present

57f948f

Add enumeration of exit codes to keep things organized

Minor cleanup

bca1c31

Remove CUDA-accelerated median blur

7c10d4f

CUDA median filter is broken on 3.4.3, so keep it running on CPU only

Make normalize operation CUDA-accelerated

df4df79

Slight speedup versus CPU

Add CUDA acceleration to basic CV operations

6a67f5f

Make wrappers use passed Mat objects instead of copying

c23a132

Makes CUDA operation cleanups actually free the used memory

MatWrapper docs

163b0f9

Remove synchronization from MatWrapper

68828cd

Fail fast if CUDA version.txt does not exist

87fddb9

Delete unused MatWrapperTest

d798f64

Clean up flagChanged in OutputSocketImpl

b652fda

Use try-with-resources in CUDA CannyEdgeOperation

0df4e26

SamCarlberg added type: enhancement wip labels Apr 10, 2019

SamCarlberg added 4 commits April 9, 2019 22:09

Append 'cuda' to versions of GRIP with CUDA acceleration

ce0ba87

Fix code generation test compile error

5ac6da7

Ignore CudaSockets when generating code

a825a6d

Include non-cv operations in OperationsUtil

e6c647b

JLLeitschuh reviewed Apr 10, 2019


JLLeitschuh reviewed Apr 10, 2019


SamCarlberg added 4 commits April 10, 2019 14:18

Add CUDA classes to dedicated cuda module

fdac7b4

This lets us create CudaDetectors etc. before loading OpenCV in the core module. Move logger setup to its own class, since the core module may not load before the app exits

Move flagChanged to Socket from OutputSocket

004df45

Call flagChanged in ProjectTest

Make SafeShutdown take an ExitCode enum instead of raw int

e67d362

Allow wrapping of existing CPU and GPU mats

0860629

Fixes issue on Windows when using non-CUDA OpenCV

JLLeitschuh reviewed Apr 11, 2019


SamCarlberg added 3 commits April 11, 2019 18:49

Fix CUDA detection on Windows

ad071fa

Generate a resource file at build time to flag CUDA usage. May also fix Mac, but that remains untested

Check if CUDA JNI can be loaded to check for CUDA installs on Windows

319b4cc

Since there's no definite location to check, and because linking is funky on Windows (only seems to want CUDA 10.0; 10.1 cannot be used). Linux, of course, will work fine with 10.1

Checkstyle

5c218b2

JLLeitschuh reviewed Apr 12, 2019


JLLeitschuh reviewed Apr 12, 2019


SamCarlberg added 3 commits April 12, 2019 12:24

Use JNI loading to locate CUDA install instead of OS-specific files

001ea72

Is platform-agnostic and doesn't need to hardcode install locations - CUDA just needs to be on the PATH
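
The general shape of detection by loading can be sketched with the standard `System.loadLibrary` call, which throws `UnsatisfiedLinkError` when the library cannot be found on the library path. This is an assumption about the approach rather than the code in this commit (which may go through JavaCPP's loader instead); `NativeLibraryProbe` and the library name passed in are hypothetical.

```java
/** Hypothetical probe: treats "the JVM can load it" as "it is installed". */
final class NativeLibraryProbe {
  /**
   * Tries to load a native library by its platform-independent name.
   * Returns false instead of propagating the link error, so callers can
   * fall back or exit cleanly.
   */
  static boolean canLoad(String libraryName) {
    try {
      System.loadLibrary(libraryName);
      return true;
    } catch (UnsatisfiedLinkError | SecurityException e) {
      return false;
    }
  }
}
```

Because the lookup is delegated to the JVM and the operating system's loader, no install directories need to be hardcoded per platform.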

Use WriteProperties task to save CUDA runtime properties

f7ec114

Remove unnecessary MatWrapper constructors

4e389d2

Inline requireNonNull

SamCarlberg marked this pull request as ready for review April 16, 2019 20:21

JLLeitschuh removed the wip label Apr 17, 2019

JLLeitschuh approved these changes Apr 17, 2019


SamCarlberg merged commit 34afc3d into WPIRoboticsProjects:master Apr 17, 2019

SamCarlberg deleted the experiemental/cuda branch April 17, 2019 15:41
