Add zero-checks to axpy-like operations #1573
Conversation
LGTM
LGTM in general. Maybe there are still some `static_cast`s missing to make it compile.
[&beta_val](const type& x) {
    return is_zero(beta_val) ? zero(beta_val) : beta_val * x;
});
Minor performance comment: you could try switching the lambda and the zero check, i.e.

is_zero(beta_val)
    ? [&beta_val](const type& x) { return zero(beta_val); }
    : [&beta_val](const type& x) { return beta_val * x; }

But this might not work, since the two branches of the ?: operator have different types. And it might increase compile times, since it might compile the kernel two times.
reference/matrix/ell_kernels.cpp
Outdated
-    arithmetic_type result = c->at(row, j);
-    result *= beta_val;
+    arithmetic_type result =
+        is_zero(beta_val) ? zero(beta_val) : beta_val * c->at(row, j);
Suggested change:

-    is_zero(beta_val) ? zero(beta_val) : beta_val * c->at(row, j);
+    is_zero(beta_val) ? zero(beta_val) : beta_val * static_cast<arithmetic_type>(c->at(row, j));

to make it compile
Is there any reason to avoid propagation of NaN? If there's no performance penalty, I think propagating NaN makes it easier to see that the algorithm did not work out due to some arithmetic error.
@yhmtsai if you are computing for example
No, 0 * NaN should be NaN, not zero, so it is not a mathematical equality to just leave them out.
I know current vendor libraries usually treat 0 as "do not touch", following BLAS.
That makes calculations more fragile; we already do similar things (special-casing zeros) inside our solver kernels.
@yhmtsai we treat
25975e4 to f732df9
Just a mental note for me: the load-balanced SpMV is also covered, since it uses the `Dense::scale` function if `beta != nullptr`.
It is inconsistent in SpMV, which should be the same up to the rounding error IMO
This prevents NaNs from polluting the output
In AXPBY SpMV, it takes (nnz + 3n) ValueType + (nnz + n + 1) IndexType
The behavior seems well-documented, e.g. with GEMV
This is also not meant as a performance optimization, rather as a safety feature. The argument is mainly that this doesn't have a negative performance impact. What sources of NaNs can we imagine? Either they come from uninitialized memory, or they come from divergent operations; any others I am missing? Regardless of the source, in an application, if I multiply something with an explicit zero, I as a programmer am saying that I don't care what these values were. I'm having a hard time imagining where inside Ginkgo's solvers we would end up with a zero scalar that could cause problematic NaNs to disappear. As soon as a NaN appears in a vector, it will infect scalars through reductions and other vectors via updates; I really don't see how a scalar could become zero after a NaN was introduced in a vector, which is where this issue would pop up. Can you think of a case? I have the impression that we are discussing something that will not happen in practice.
Here is the paper Pratik referred to in today's meeting: Proposed Consistent Exception Handling for the BLAS and LAPACK |
We seem to take opposite views of this problem: user-specified zeros vs. numerical errors during the application.
Just a note, in some cases users don't have any other option. For example, if you want to store the result of
That's the reason why I said we should provide
Operations like $\alpha A x + 0 y$ may propagate NaNs from $y$ to the output despite the 0 coefficient. This can be avoided by explicitly checking the beta scaling factors for zero.
TODO: