MxNet R, CNN, VRAM consumption explodes dramatically in dependence of number of filters #10721
Comments
@thomasmooon maybe you can test whether #11374 effectively solves this RAM consumption issue?
@thomasmooon I just ran your example. @nswamy Can you close this issue?
Thanks @jeremiedb
@jeremiedb I was on vacation leave and have only just read your posts. Thanks for your suggestion. But in the meantime, a few weeks after I opened the issue, I switched to another DL framework for several reasons.
@thomasmooon Sure, I understand, as the support for the R package hasn't been great. May I ask whether there were other specific features you found lacking? Thanks!
@jeremiedb Well, in general my experience is that better documentation is desirable, especially minimal, runnable, reproducible examples in R for each layer/method. Hence, if I were to restart with MXNet, I would first learn Python and then use the MXNet Python API. This doesn't answer your "specific feature" question; there were/are a lot of small things in my use cases that required a lot of hacking around in MXNet, whereas in my current framework of choice this is not the case.
Description
I have a toy dataset of 360 samples with 4096 data points each, leading to a tensor of shape (4096, 1, 360). Hence, each observation has a size of ~4 kB. The CNN is very simple: Conv -> flatten -> fully connected -> fully connected -> softmax.
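For illustration, here is a minimal sketch of such a symbol graph in the mxnet R package; the kernel shape and the hidden-layer sizes are placeholders, not the exact values from the original script:

```r
library(mxnet)

# Conv -> flatten -> fully connected -> fully connected -> softmax
data <- mx.symbol.Variable("data")
conv <- mx.symbol.Convolution(data = data, kernel = c(64, 1),
                              num_filter = 8, name = "conv1")
flat <- mx.symbol.Flatten(data = conv)
fc1  <- mx.symbol.FullyConnected(data = flat, num_hidden = 64, name = "fc1")
fc2  <- mx.symbol.FullyConnected(data = fc1, num_hidden = 2, name = "fc2")
net  <- mx.symbol.SoftmaxOutput(data = fc2, name = "softmax")
```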
The VRAM consumption explodes in dependence of the number of filters: please see the table and the related picture below. Kernel size and batch size, by contrast, have only a very small influence; I have tested several combinations but omit those details for now. The table measures a setting using 2 of the GPUs from my environment (described below). As expected, the VRAM demand of each card increases linearly with the number of convolution filters, but as soon as it exceeds 10, the GPUs run out of their 8 GB VRAM. What the hell...?
It is also remarkable that a setting with 1 GPU and 8 kernels is not possible: it exhausts the 8 GB VRAM of the single card. But using 2 GPUs with everything else unchanged, each GPU consumes only 0.477 GB, so 2 × 0.477 = 0.95 GB in total. This is far below what is consumed when using only one card. How can this be?
Other things I tested, without any effect: the workspace argument of the mx.symbol.Convolution() function. I tried several values: 1, 64, 128, 512 MB. This had absolutely no effect, regardless of the number of filters. Per the MXNet documentation, workspace is the maximum temporary workspace (in MB) that a convolution is allowed to use.
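For completeness, this is how the argument is passed per convolution layer (values are in MB; the 512 shown here is one of the tested values, and the kernel shape is again a placeholder):

```r
# workspace caps the temporary buffer (in MB) a convolution may allocate
conv <- mx.symbol.Convolution(data = data, kernel = c(64, 1),
                              num_filter = 8, workspace = 512)
```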
Table: VRAM consumption in dependence of the number of filters, using 2 GPUs
In addition, I measured the RAM consumption when the device is the CPU, i.e. without any GPUs, trying 10, 11 and 20 filters. The RAM consumption increases linearly, in particular when going from 10 to 11 filters, rather than exploding as it does when the devices are GPUs. This is confusing. Moreover, the RAM consumption with 10 filters is 9 GB, in line with the observation that the 8 GB VRAM of a single GPU is insufficient, but again in contradiction to the 0.95 GB when 2 GPUs are used.
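Switching between these measurements is only a change of the ctx argument in the training call. A sketch, reusing the net symbol from above and assuming mx.model.FeedForward.create is used for fitting (train_x, train_y, and the batch size are placeholders):

```r
devices <- list(mx.gpu(0), mx.gpu(1))  # 2 GPUs: ~0.477 GB VRAM each
# devices <- mx.gpu(0)                 # 1 GPU: exhausts 8 GB VRAM at 8 kernels
# devices <- mx.cpu()                  # CPU only: ~9 GB RAM at 10 filters

model <- mx.model.FeedForward.create(
  net, X = train_x, y = train_y, ctx = devices,
  num.round = 10, array.batch.size = 32)
```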
For R users, please provide R sessionInfo():

R version 3.4.3 (2017-11-30)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux
Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] bindrcpp_0.2 mxnet_0.10.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.12 compiler_3.4.3 RColorBrewer_1.1-2 influenceR_0.1.0
[5] plyr_1.8.4 bindr_0.1 viridis_0.4.0 tools_3.4.3
[9] digest_0.6.12 jsonlite_1.5 tibble_1.3.3 gtable_0.2.0
[13] viridisLite_0.2.0 rgexf_0.15.3 pkgconfig_2.0.1 rlang_0.1.1
[17] igraph_1.1.2 rstudioapi_0.6 yaml_2.1.14 gridExtra_2.2.1
[21] DiagrammeR_0.9.0 dplyr_0.7.2 stringr_1.2.0 htmlwidgets_0.9
[25] grid_3.4.3 glue_1.1.1 R6_2.2.2 Rook_1.1-1
[29] XML_3.98-1.9 ggplot2_2.2.1 magrittr_1.5 codetools_0.2-15
[33] scales_0.4.1 htmltools_0.3.6 assertthat_0.2.0 colorspace_1.3-2
[37] brew_1.0-6 stringi_1.1.5 visNetwork_2.0.0 lazyeval_0.2.0
[41] munsell_0.4.3
Hardware
8 x 1080 TI
60 GB RAM
12 Cores
CUDA version
Minimum reproducible example
Steps to reproduce
Comment/uncomment the lines in the section and use nvidia-smi -l 3 to monitor memory consumption. I recommend running the script from the shell rather than inside an interactive R session (R will crash when the VRAM is exceeded). To measure the RAM consumption on the CPU, comment out the content of this section and monitor it e.g. with htop.
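A sketch of that shell workflow; the script file name is a placeholder:

```sh
Rscript reproduce_vram.R &   # run the example outside an interactive R session
nvidia-smi -l 3              # refresh GPU memory statistics every 3 seconds
```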
What have you tried to solve it?
Varied these parameters:
workspace: 1, 64, 128, 512, 1024 MB