CUDA providers failed to build against 12.6 with error error #221-D #22728

egortech · 2024-11-05T08:04:15Z

Describe the issue

The CUDA providers fail to build against CUDA 12.6 with error #221-D.

Urgency

No response

Target platform

Windows 11

Build script

./build.bat --config RelWithDebInfo --use_openvino AUTO:GPU,CPU --use_cuda --cudnn_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6" --use_tensorrt --tensorrt_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\TensorRT-10.6.0.26" --build_shared_lib --cuda_version "12.6" --cmake_generator "Visual Studio 17 2022" --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES='50;52;61;70;72;75;80;86;87;89;90' --cudnn_home "C:\Program Files\NVIDIA\CUDNN" --parallel --cmake_extra_defines CUDNN_INCLUDE_DIR="C:\Program Files\NVIDIA\CUDNN\include" CUDNN_LIBRARY="C:\Program Files\NVIDIA\CUDNN\lib\x64\cudnn.lib"

Error / output

E:\src\Microsoft\onnxruntime\onnxruntime\core/providers/cuda/shared_inc/cuda_utils.h(159): error #221-D: floating-point value does not fit in required floating-point type [E:\src\Microsoft\onnxruntime\build\Windows\RelWithDebInfo\onnxruntime_providers_cuda.vcxproj]
        return ((float)(1e+300));
                ^

E:\src\Microsoft\onnxruntime\onnxruntime\core/providers/cuda/shared_inc/cuda_utils.h(166): error #221-D: floating-point value does not fit in required floating-point type [E:\src\Microsoft\onnxruntime\build\Windows\RelWithDebInfo\onnxruntime_providers_cuda.vcxproj]
        return -((double)((float)(1e+300)));
                          ^

E:\src\Microsoft\onnxruntime\onnxruntime\core/providers/cuda/shared_inc/cuda_utils.h(169): error #221-D: floating-point value does not fit in required floating-point type [E:\src\Microsoft\onnxruntime\build\Windows\RelWithDebInfo\onnxruntime_providers_cuda.vcxproj]
        return ((double)((float)(1e+300)));
                         ^

  4 errors detected in the compilation of "E:/src/Microsoft/onnxruntime/onnxruntime/contrib_ops/cuda/sparse/sparse_attention_impl.cu".
  sparse_attention_impl.cu

Visual Studio Version

Visual Studio 2022 v17.12 Preview 5

GCC / Compiler Version

No response


snnn · 2024-11-05T15:24:50Z

I can confirm the problem exists. We should use std::numeric_limits directly instead. I tried to update the code, but I cannot fully understand it: it mixes uses of INFINITY and max, and it also uses -INFINITY, which is usually not meaningful. @tianleiwu / @yufenglee, do you have time to take a look?
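
For readers following the thread, here is a minimal sketch of why mixing infinity and max is confusing: `std::numeric_limits<float>` exposes `max()`, `lowest()`, `min()`, and `infinity()` as four distinct values. The constant names and the `CheckLimits` helper are hypothetical and are not taken from the onnxruntime sources.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Largest finite float (about 3.4e+38); this is what max() returns.
constexpr float kMax = std::numeric_limits<float>::max();
// Most negative finite float; equals -max() for IEEE 754 floats.
constexpr float kLowest = std::numeric_limits<float>::lowest();
// Smallest positive normal float, NOT the most negative value.
constexpr float kMin = std::numeric_limits<float>::min();
// Positive infinity, which compares greater than every finite float.
constexpr float kInf = std::numeric_limits<float>::infinity();

static_assert(kLowest == -kMax, "lowest() mirrors max()");
static_assert(kMin > 0.0f, "min() is a tiny positive number");
static_assert(kInf > kMax, "infinity is not the same value as max()");

// Hypothetical self-check that exercises the constants at run time.
inline void CheckLimits() {
  assert(std::isfinite(kMax) && std::isfinite(kLowest));
  assert(std::isinf(kInf) && std::isinf(-kInf));
}
```

A helper that returns infinity from a function named like `Max` therefore disagrees with the standard library, which is the inconsistency raised above.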

snnn · 2024-11-05T15:25:23Z

PR #13594

tianleiwu · 2024-11-05T18:38:15Z

Thanks for reporting. Let me create a PR to fix it.

snnn · 2024-11-05T19:24:08Z

Thank you @tianleiwu

### Description

* Fix `NumericLimits<float>`, which used infinity as max; that is not consistent with `std::numeric_limits<float>::max()`. On Windows, `(float)(1e+300)` is used for INFINITY, which causes a compiler error in Visual Studio 2022 v17.12 Preview 5.
* Rename `NumericLimits<T>::Min` to `Lowest` to be consistent with `std::numeric_limits`.
* Fix the topk implementation: use `NumericLimits<CudaT>` instead of `NumericLimits<T>` in the kernel. That avoids a confusing definition of `NumericLimits<MLFloat16>` that returns half instead of MLFloat16.
* Use `CUDART_MAX_NORMAL_FP16` if possible. It sets the bit value directly, which is faster than converting float to half.

Note that `NumericLimits` does not support `__nv_bfloat16`, `__nv_fp8_e4m3`, and `__nv_fp8_e5m2` right now.

### Motivation and Context

#22728
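
The shape of the change described above can be sketched in plain C++. This is an illustration under stated assumptions, not the code from the PR: `SketchLimits`, `kHalfMaxBits`, `DecodeNormalHalf`, and `CheckSketch` are hypothetical names. The bit pattern `0x7BFF` is the IEEE 754 binary16 encoding of the largest finite half value (65504), which is my understanding of the value `CUDART_MAX_NORMAL_FP16` stands for.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a NumericLimits-style helper; not the ORT code.
// Lowest/Max return finite values, matching std::numeric_limits naming.
template <typename T>
struct SketchLimits {
  static constexpr T Lowest() { return std::numeric_limits<T>::lowest(); }
  static constexpr T Max() { return std::numeric_limits<T>::max(); }
  static constexpr T Infinity() { return std::numeric_limits<T>::infinity(); }
};

// binary16 pattern of the largest finite half:
// sign 0, exponent 11110b, mantissa all ones.
constexpr std::uint16_t kHalfMaxBits = 0x7BFF;

// Decode a normal (finite, non-zero, non-subnormal) binary16 pattern.
inline float DecodeNormalHalf(std::uint16_t bits) {
  const int sign = (bits >> 15) & 0x1;
  const int exponent = (bits >> 10) & 0x1F;  // biased by 15
  const int mantissa = bits & 0x3FF;         // 10 fraction bits
  const float value = std::ldexp(1.0f + mantissa / 1024.0f, exponent - 15);
  return sign ? -value : value;
}

// Hypothetical self-check for the sketch.
inline void CheckSketch() {
  assert(SketchLimits<float>::Lowest() == -SketchLimits<float>::Max());
  assert(std::isinf(SketchLimits<float>::Infinity()));
  assert(DecodeNormalHalf(kHalfMaxBits) == 65504.0f);
}
```

Storing the bit pattern directly avoids a float-to-half conversion, which is the speed argument made in the last bullet.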

alando46 · 2024-11-16T22:58:22Z

I am still experiencing this on main:latest with the following build parameters:
'.\\build.bat', '--config', 'Release', '--skip_submodule_sync', '--update', '--build', '--build_shared_lib', '--skip_tests', '--parallel', '--use_cuda', '--cuda_version', '12.6', '--cuda_home', 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.6', '--cmake_extra_defines', 'CMAKE_CUDA_ARCHITECTURES=89', '--cudnn_home', 'C:\\Program Files\\NVIDIA\\CUDNN\\v9.5'

Oddly, I have the same issues with older versions of CUDA and CUDNN. I get identical failures with the command:
'.\\build.bat', '--config', 'Release', '--skip_submodule_sync', '--update', '--build', '--build_shared_lib', '--skip_tests', '--parallel', '16', '--nvcc_threads', '8', '--use_cuda', '--cuda_version', '12.2', '--cuda_home', 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.2', '--cmake_extra_defines', 'CMAKE_CUDA_ARCHITECTURES=89', '--cudnn_home', 'C:\\Program Files\\NVIDIA\\CUDNN\\v9.3'

With any CUDA version from 12.2 up and any cuDNN version from 9.3 up, I get the same errors. Per the documentation, I moved all newer CUDA X.Y files out of C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations so that only the 12.2 version remains. As a side note, it might make sense to add a second path to the documentation for Community edition users.

Across all builds

MSVC 14.38.33130
Selecting Windows SDK version 10.0.26100.0 to target Windows 10.0.22631.
cmake version 3.29.3

It's all error #221-D. These are the first few of many:

C:\....\ort\onnxruntime\contrib_ops\cuda\bert\ngram_repeat_block_impl.cu(51): error #221-D: floating-point value does not fit in required floating-point type [C:\....\ort\build\Windows\Release\onnxruntime_providers_cuda.vcxproj]

C:\....\ort\onnxruntime\contrib_ops/cuda/bert/flash_attention/softmax.h(102): error #221-D: floating-point value does not fit in required floating-point type [C:\....\ort\build\Windows\Release\onnxruntime_providers_cuda.vcxproj]

C:\....\ort\onnxruntime\contrib_ops/cuda/bert/flash_attention/mask.h(184): error #221-D: floating-point value does not fit in required floating-point type [C:\....\ort\build\Windows\Release\onnxruntime_providers_cuda.vcxproj]

C:\....\ort\onnxruntime\contrib_ops/cuda/bert/flash_attention/softmax.h(146): error #221-D: floating-point value does not fit in required floating-point type [C:\....\ort\build\Windows\Release\onnxruntime_providers_cuda.vcxproj]

I am trying to think through what the issue could be. I remove the build folder before every new build. Thanks @tianleiwu and @snnn for your support on this bug!

tianleiwu · 2024-11-17T23:29:56Z

@alando46, it is a known issue that Visual Studio 2022 v17.12 Preview 5 uses (float)(1e+300) as INFINITY, which causes the compiler error. The workaround is to use the v17.11.x stable version. I could open a pull request later to replace INFINITY in the kernel code.

Replace INFINITY by `std::numeric_limits<float>::infinity()` to avoid build errors with Visual Studio 2022 v17.12 Preview 5.

### Motivation and Context

#22728
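
As a rough sketch of what such a replacement looks like in practice, the example below masks attention-style scores with negative infinity without touching the `INFINITY` macro. It is written as host-side C++ for brevity. `MaskScores` and `DemoMask` are hypothetical names and do not come from the onnxruntime kernels.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: overwrite masked-out scores with -infinity so that a
// later softmax gives them zero weight.
inline void MaskScores(std::vector<float>& scores,
                       const std::vector<bool>& keep) {
  // Before: scores[i] = -INFINITY;  (macro supplied by the C math header)
  // After: the value no longer depends on how the toolchain defines the macro.
  constexpr float kNegInf = -std::numeric_limits<float>::infinity();
  for (std::size_t i = 0; i < scores.size() && i < keep.size(); ++i) {
    if (!keep[i]) scores[i] = kNegInf;
  }
}

// Hypothetical usage: the masked entry contributes exp(-inf) == 0.
inline void DemoMask() {
  std::vector<float> scores{1.0f, 2.0f, 3.0f};
  MaskScores(scores, {true, false, true});
  assert(std::isinf(scores[1]) && scores[1] < 0.0f);
  assert(std::exp(scores[1]) == 0.0f);
}
```

Because `std::numeric_limits<float>::infinity()` is a `constexpr` library function rather than a macro that expands to an out-of-range literal, the front end has no oversized constant to reject.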

alando46 · 2024-11-18T20:18:34Z

Thanks @tianleiwu, I can confirm that I can now build with both CUDA 12.2/cuDNN 9.3 and CUDA 12.6/cuDNN 9.5, using MSVC 14.38.33130 (a project requirement). Much appreciated!

egortech added the build build issues; typically submitted using template label Nov 5, 2024

egortech changed the title ~~[Build]~~ CUDA providers failed to build against 12.6 with error error #221-D. [Build] Nov 5, 2024

egortech changed the title ~~CUDA providers failed to build against 12.6 with error error #221-D. [Build]~~ CUDA providers failed to build against 12.6 with error error #221-D Nov 5, 2024

github-actions bot added the ep:CUDA issues related to the CUDA execution provider label Nov 5, 2024

tianleiwu mentioned this issue Nov 5, 2024

[CUDA] Fix NumericLimits #22738

Merged

tianleiwu mentioned this issue Nov 18, 2024

Replace INFINITY by std::numeric_limits<float>::infinity() #22868

Merged
