Different Convergence Behavior Between GPU and CPU Runs #520
Unanswered
tan-nguyenxuan
asked this question in
Q&A
-
Hello @tan-nguyenxuan,

Thank you for contacting us! The behaviour you describe is unexpected. To help us identify the root cause, please share the details of the GPU you are using and the version of TensorFlow installed in your environment.

Please also check our documentation on debugging MCMC convergence issues, which lists some debugging steps that may help resolve this problem.

Feel free to reach out if you need any further assistance.

Thank you,
Google Meridian Support Team
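A minimal sketch for collecting the requested details, assuming a Python environment in which TensorFlow may or may not be importable. The helper name `describe_environment` is hypothetical and is not part of Meridian; it only reports the TensorFlow version and the GPUs that TensorFlow can see.

```python
import platform
import sys


def describe_environment():
    """Collect basic environment details to attach to a bug report.

    Hypothetical helper, not part of Meridian.
    """
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "tensorflow": None,  # stays None if TensorFlow is not installed
        "gpus": [],          # stays empty if TensorFlow sees no GPU
    }
    try:
        import tensorflow as tf
    except ImportError:
        return info
    info["tensorflow"] = tf.__version__
    # Lists the GPUs visible to TensorFlow, e.g. "/physical_device:GPU:0".
    info["gpus"] = [d.name for d in tf.config.list_physical_devices("GPU")]
    return info


env = describe_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```

Running this once in the GPU environment and once in the CPU environment makes it easy to spot version differences between the two. On NVIDIA hardware, the GPU model and driver version are typically reported by the `nvidia-smi` command.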
-
Hi,
I ran Meridian on a dataset with 177 weeks, 12 channels, and 1 control, and encountered an issue when sampling with the following parameters, as recommended in the guidelines:
%%time
mmm.sample_prior(500)
mmm.sample_posterior(n_chains=7, n_adapt=500, n_burnin=500, n_keep=1000)
I found that the run on a GPU did not converge, whereas the run on a CPU did. The parameter settings were exactly the same in both cases. I checked convergence with the R-hat boxplot:
model_diagnostics = visualizer.ModelDiagnostics(mmm)
model_diagnostics.plot_rhat_boxplot()
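For context, the boxplot summarizes R-hat values, which compare the variance between chains with the variance within chains. The sketch below implements the classic Gelman-Rubin formula for a single scalar parameter on toy data. It is an illustration only: the function name and the toy chains are assumptions, and Meridian's own diagnostic may use a different R-hat variant.

```python
import random
from statistics import mean, variance


def gelman_rubin_rhat(chains):
    """Classic (non-split) Gelman-Rubin R-hat for one scalar parameter.

    chains: list of equal-length lists of draws, one list per chain.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2:
        raise ValueError("need at least 2 chains with at least 2 draws each")
    if any(len(chain) != n for chain in chains):
        raise ValueError("all chains must have the same number of draws")
    chain_means = [mean(chain) for chain in chains]
    # W: average of the within-chain sample variances.
    within = mean([variance(chain) for chain in chains])
    # B / n: sample variance of the chain means.
    between_over_n = variance(chain_means)
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5


# Toy data: 4 chains of 1000 draws from the same distribution ...
rng = random.Random(0)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# ... and a copy in which one chain sits in a different region.
stuck = [list(chain) for chain in mixed]
stuck[0] = [x + 3.0 for x in stuck[0]]

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
print(f"well-mixed chains: R-hat = {rhat_mixed:.3f}")
print(f"one shifted chain: R-hat = {rhat_stuck:.3f}")
```

When all chains explore the same region, R-hat stays close to 1. A single chain that settles elsewhere pushes it well above 1, so one poorly started chain among the seven can be enough to make a whole run look non-converged.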
I suspect this happens because the MCMC chains are initialized with different starting parameter values in the GPU and CPU runs.
I would appreciate an explanation for this behavior. Can I trust the results from the CPU run, which did converge?
Thanks in advance for your help!