Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Curand switch #873

Merged
merged 3 commits into from
Oct 26, 2022
Merged

Curand switch #873

merged 3 commits into from
Oct 26, 2022

Conversation

Robadob
Copy link
Member

@Robadob Robadob commented Jun 17, 2022

  • Move curand behind a typedef so it can be swapped at compile time (and forward the same to RTC builds)
  • Run the test suite for the two new curand options
  • Add a CMake option to swap curand
  • Change the default to Philox (Counter-based PRNGs #362 ?)

Also fixed the warnings in test suite which @MILeach said they'd do weeks ago. (Removed in rebase, as It had been resolved elsewhere)
Also added USE_GLM status to the RTC short reference as this had previously been missed.

Not assigning this to myself, someone else may pick it up to finish before me.

@Robadob Robadob linked an issue Jun 17, 2022 that may be closed by this pull request
2 tasks
@Robadob

This comment was marked as outdated.

@ptheywood
Copy link
Member

I'm not super keen on adding more library macros, which will make installing flamegpu system wide / providing pre-compiled binaries less flexible, but the required templating will probably be more painful than not.

Though potentially as there are only 3 accepted curand types this could just be explicitly instantiated, with the macro required / used at model compilation instead?

With the philox 128byte versions, I'd probably rather make the api methods available for the other generators too so they models do not need specialising based on the value of the compile time macro (i.e. if someone needs 4 floats at once, they can just use the float4 version and philox will just run faster, where as the other generators will just be called 4 times. This will probably eat a few more registers, but shouldn't be too expensive)

@Robadob
Copy link
Member Author

Robadob commented Jun 20, 2022

I'm not super keen on adding more library macros, which will make installing flamegpu system wide / providing pre-compiled binaries less flexible, but the required templating will probably be more painful than not.

This is more a case of a hyper-advanced setting where we choose a default and don't expect anyone to ever have a reason to change it. Happy to leave it undocumented tbh.

Though potentially as there are only 3 accepted curand types this could just be explicitly instantiated, with the macro required / used at model compilation instead?

This just doesn't feel viable, same way we haven't included agent out state as a template arg to DeviceAPI.

With the philox 128byte versions, I'd probably rather make the api methods available for the other generators too so they models do not need specialising based on the value of the compile time macro (i.e. if someone needs 4 floats at once, they can just use the float4 version and philox will just run faster, where as the other generators will just be called 4 times. This will probably eat a few more registers, but shouldn't be too expensive)

This seems fair.

@Robadob
Copy link
Member Author

Robadob commented Jun 20, 2022

Now getting Link errors when I try to build with Philox. They don't make much sense, and I cba to dig into it right now. Pete says build's fine for him on Linux.

This allows the backing engine to be swapped with a preprocessor definition

Have ran full test suite with Philox, and AgentRandom suite with MRG32
@Robadob Robadob marked this pull request as ready for review October 20, 2022 09:28
@Robadob Robadob force-pushed the curand_switch branch 2 times, most recently from c023495 to 4f358e8 Compare October 20, 2022 11:20
@Robadob
Copy link
Member Author

Robadob commented Oct 20, 2022

Have now built and ran test suite for each of the 3 curand engine's consecutively. All worked as expected.

When running pyflamegpu test suite, it crashes with a fatal error (see end of this comment). However, that error is also currently present in the master pyflamegpu (windows/release) suite too. Hence, I consider this PR ready.

This error only occurs running full suite, the individual test_logging_exceptions.py set of tests run fine in isolation.

io\test_logging_exceptions.py ..Fatal Python error: Aborted

Current thread 0x00007aa8 (most recent call first):
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\_pytest\capture.py", line 780 in pytest_runtest_setup
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\pluggy\_callers.py", line 34 in _multicall
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\pluggy\_manager.py", line 80 in _hookexec
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\pluggy\_hooks.py", line 265 in __call__
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\_pytest\runner.py", line 259 in <lambda>
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\_pytest\runner.py", line 338 in from_call
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\_pytest\runner.py", line 258 in call_runtest_hook
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\_pytest\runner.py", line 219 in call_and_report
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\_pytest\runner.py", line 124 in runtestprotocol
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\_pytest\runner.py", line 111 in pytest_runtest_protocol
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\pluggy\_callers.py", line 39 in _multicall
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\pluggy\_manager.py", line 80 in _hookexec
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\pluggy\_hooks.py", line 265 in __call__
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\_pytest\main.py", line 347 in pytest_runtestloop
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\pluggy\_callers.py", line 39 in _multicall
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\pluggy\_manager.py", line 80 in _hookexec
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\pluggy\_hooks.py", line 265 in __call__
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\_pytest\main.py", line 322 in _main
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\_pytest\main.py", line 268 in wrap_session
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\_pytest\main.py", line 315 in pytest_cmdline_main
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\pluggy\_callers.py", line 39 in _multicall
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\pluggy\_manager.py", line 80 in _hookexec
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\pluggy\_hooks.py", line 265 in __call__
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\_pytest\config\__init__.py", line 164 in main
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\lib\site-packages\_pytest\config\__init__.py", line 187 in console_main
  File "C:\Users\Robadob\fgpu2\build2\lib\Release\python\venv\Scripts\py.test.exe\__main__.py", line 7 in <module>
  File "C:\ProgramData\Miniconda3\lib\runpy.py", line 87 in _run_code
  File "C:\ProgramData\Miniconda3\lib\runpy.py", line 197 in _run_module_as_main
Warning: Input file 'test.json' refers to second input file 'invalid', this will not be loaded.
RTC Initialisation Processing time: 0.000000 s
Warning: Input file 'test.json' refers to second input file 'invalid', this will not be loaded.
Step 1 Processing time: 0.000000 s
Warning: Input file 'test.xml' refers to second input file 'invalid', this will not be loaded.
RTC Initialisation Processing time: 0.000000 s
Warning: Input file 'test.xml' refers to second input file 'invalid', this will not be loaded.
Step 1 Processing time: 0.000000 s

(venv) (base) C:\Users\Robadob\fgpu2\tests\swig\python>

@ptheywood
Copy link
Member

I've opened #938 to move the philox 128 bytes optimisation to a future API extension.

@ptheywood
Copy link
Member

I've reproduced the vis pyfalmegpu test suite failure in a release build under linux using current master (a76da8c)

../tests/swig/python/io/test_logging_exceptions.py ...Fatal Python error: Aborted

Will promote that to a separate issue.

Copy link
Member

@ptheywood ptheywood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the one highglited issue of the new typedefs being in the global namespace to sort (and potentially a slight rename).

Otherwise c++ tests all pass fine under linux with philox and mrg (assuming xorwow is still fine), and the CMake option works nicely / gives very useful errror message.

Will be a big improvement to the submodel benchmark (when using CUDA <= 11.2, and should still be an iprovement since then).

include/flamegpu/util/detail/curand.cuh Show resolved Hide resolved
String with suitable values MRG, PHILOX and XORWOW (case-insensitive), other values will produce an error at configure time.
…XORWOW)

Previous behaviour can be retained by setting CURAND_ENGINE to XORWOW at CMake configuration time.
@ptheywood ptheywood self-requested a review October 21, 2022 15:35
@mondus mondus merged commit 6c131ef into master Oct 26, 2022
@mondus mondus deleted the curand_switch branch October 26, 2022 14:06
@ptheywood ptheywood mentioned this pull request Oct 26, 2022
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cuRAND initialisation time
3 participants