Date before 1678 causes panic #4875
Comments
I tried to run this query on datafusion-cli from the
cd datafusion-cli
cargo run
However, it seems to work correctly on DataFusion 16:
DataFusion CLI v16.0.0
❯ select '1677-06-14T07:29:01.256'::timestamp
;
+---------------------------------+
| Utf8("1677-06-14T07:29:01.256") |
+---------------------------------+
| 2262-01-03T07:03:34.965551616 |
+---------------------------------+
1 row in set. Query took 0.032 seconds.
❯ select '1677-06-14T07:29:01.256'::date;
ArrowError(CastError("Cannot cast string '1677-06-14T07:29:01.256' to value of Date32 type"))
@DDtKey I wonder if you can share more of the data / query / query plan? |
I think this means that one of the tasks running the plan panicked. Perhaps you can run your test case using setting |
@alamb But in your example
Well, I test it against
let path = "my/path/test.csv";
let ctx = SessionContext::new();
ctx.register_csv("inventions", path, CsvReadOptions::default())
.await?;
let data_frame = ctx
.sql("SELECT d.invented_at as year FROM inventions d ORDER BY d.invented_at")
.await?;
data_frame.show().await?;
And here is the log with `RUST_BACKTRACE` (spoiler)
So it panics on overflow:
And it works with
|
I bet you are right that the calculation silently overflows, but only on release builds. I previously tested using a release build (where the overflow is not checked) for 16 but a debug build for 15. When I tested both on debug builds:
# on master at 315f5b9848d3a1218ded9a1113ac4e8e8a90272f
$ cargo run
Running `/Users/alamb/Software/target-df2/debug/datafusion-cli`
DataFusion CLI v16.0.0
❯ select '1677-06-14T07:29:01.256'::timestamp;
thread 'main' panicked at 'attempt to multiply with overflow', /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.23/src/naive/datetime/mod.rs:426:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Then I checked out 15.0.0:
$ git checkout 15.0.0
$ cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.28s
Running `/Users/alamb/Software/target-df2/debug/datafusion-cli`
DataFusion CLI v15.0.0
❯ select '1677-06-14T07:29:01.256'::timestamp
;
thread 'main' panicked at 'attempt to multiply with overflow', /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.23/src/naive/datetime/mod.rs:426:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
So my conclusion is:
|
Upstream issue: apache/arrow-rs#3512 |
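The overflow discussed above can be illustrated with plain integer arithmetic. This is only a sketch using the standard library, not the actual chrono or arrow-rs code: it assumes the timestamp is stored as signed 64-bit nanoseconds since the Unix epoch and is obtained by multiplying whole seconds by 1,000,000,000. The helper names (`days_from_civil`, `unix_seconds`, `checked_nanos`, `wrapping_nanos`) are hypothetical and exist only for this example.

```rust
/// Days between 1970-01-01 and a proleptic Gregorian date.
/// Standard "days from civil" arithmetic; the helper name is illustrative.
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    // Treat January and February as months 11 and 12 of the previous year.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let year_of_era = y.rem_euclid(400);
    let shifted_month = if month > 2 { month - 3 } else { month + 9 };
    let day_of_year = (153 * shifted_month + 2) / 5 + day - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    era * 146_097 + day_of_era - 719_468
}

/// Whole seconds since the Unix epoch for a UTC date and time.
fn unix_seconds(year: i64, month: i64, day: i64, hour: i64, minute: i64, second: i64) -> i64 {
    days_from_civil(year, month, day) * 86_400 + hour * 3_600 + minute * 60 + second
}

const NANOS_PER_SEC: i64 = 1_000_000_000;

/// Overflow-aware conversion: returns None when the value does not fit in an i64.
fn checked_nanos(secs: i64, subsec_nanos: i64) -> Option<i64> {
    secs.checked_mul(NANOS_PER_SEC)?.checked_add(subsec_nanos)
}

/// Wrap-around conversion, mimicking unchecked arithmetic in a release build.
fn wrapping_nanos(secs: i64, subsec_nanos: i64) -> i64 {
    secs.wrapping_mul(NANOS_PER_SEC).wrapping_add(subsec_nanos)
}

fn main() {
    // The timestamp from the query in this issue: 1677-06-14T07:29:01.256
    let secs = unix_seconds(1677, 6, 14, 7, 29, 1);
    let subsec_nanos = 256_000_000;

    println!("seconds since epoch: {secs}");
    println!("checked conversion:  {:?}", checked_nanos(secs, subsec_nanos));
    println!("wrapping conversion: {}", wrapping_nanos(secs, subsec_nanos));
}
```

Under these assumptions the checked conversion yields `None`, because an i64 nanosecond count cannot go back further than 1677-09-21. The wrapped value is a large positive number of nanoseconds whose sub-second part is .965551616, which is consistent with the 2262-01-03T07:03:34.965551616 result shown in the release-build output earlier in the thread.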
I can't reproduce it even with cargo run --release
+-------------------------+
| year |
+-------------------------+
| 1677-06-14T07:29:01.256 |
+-------------------------+
There is no overflow at all; the year is correct, and it works in both release and debug mode 🤔 Is there any difference between your |
@DDtKey please check the comment apache/arrow-rs#3512 (comment) |
🤔 Maybe it is because you are in a different timezone than I was trying it in |
Well, the reason is actually type inference: with an explicit schema I was able to repro the panic on
So yeah, it isn't a regression and I'll change the title |
Thank you @DDtKey |
I believe this has been fixed now:
Please let me know if you disagree |
Describe the bug
Attempting to select a date-time < 1678 causes a panic against the current main branch:
Probably it's more of an Arrow issue, however Join Error looks weird to me 🤔
To Reproduce
File inventions.csv:
SQL:
Expected behavior
Should be an error instead of a panic