Date before 1678 causes panic #4875

Closed
DDtKey opened this issue Jan 11, 2023 · 11 comments
Labels
bug Something isn't working

Comments

@DDtKey
Contributor

DDtKey commented Jan 11, 2023

Describe the bug
Attempting to select a date-time earlier than 1678 causes a panic on the current main branch:

Error: Arrow error: External error: Arrow error: External error: Join Error
caused by
External error: task 46 panicked

It's probably more of an Arrow issue, however the Join Error looks weird to me 🤔

To Reproduce
File inventions.csv:

invented_at
1677-06-14T07:29:01.256

SQL:

SELECT d.invented_at as year FROM inventions d ORDER BY d.invented_at

Expected behavior
This should return an error instead of panicking.


@DDtKey DDtKey added the bug Something isn't working label Jan 11, 2023
@DDtKey DDtKey changed the title Date before 1678 causes panic Regression: date before 1678 causes panic against Jan 11, 2023
@DDtKey DDtKey mentioned this issue Jan 11, 2023
4 tasks
@DDtKey DDtKey changed the title Regression: date before 1678 causes panic against Regression: date before 1678 causes panic Jan 11, 2023
@alamb
Contributor

alamb commented Jan 11, 2023

I tried to run this query with datafusion-cli built from the 15.0.0 tag:

cd datafusion-cli
cargo run
DataFusion CLI v15.0.0
❯ select '1677-06-14T07:29:01.256'::timestamp;
thread 'main' panicked at 'attempt to multiply with overflow', /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.23/src/naive/datetime/mod.rs:426:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
(arrow_dev) alamb@MacBook-Pro-8:~/Software/arrow-datafusion/datafusion-cli$ 
(arrow_dev) alamb@MacBook-Pro-8:~/Software/arrow-datafusion/datafusion-cli$ CARGO_TARGET_DIR=/Users/alamb/Software/target-df cargo run -- 
    Finished dev [unoptimized + debuginfo] target(s) in 0.32s
     Running `/Users/alamb/Software/target-df/debug/datafusion-cli`
DataFusion CLI v15.0.0
❯ select '1677-06-14T07:29:01.256'::date;
ArrowError(CastError("Cannot cast string '1677-06-14T07:29:01.256' to value of Date32 type"))

However, it seems to work correctly on datafusion 16

DataFusion CLI v16.0.0
❯ select '1677-06-14T07:29:01.256'::timestamp
;
+---------------------------------+
| Utf8("1677-06-14T07:29:01.256") |
+---------------------------------+
| 2262-01-03T07:03:34.965551616   |
+---------------------------------+
1 row in set. Query took 0.032 seconds.
❯ select '1677-06-14T07:29:01.256'::date;
ArrowError(CastError("Cannot cast string '1677-06-14T07:29:01.256' to value of Date32 type"))

@DDtKey I wonder if you can share more of the data / query / query plan?

@alamb
Contributor

alamb commented Jan 11, 2023

> Error: Arrow error: External error: Arrow error: External error: Join Error

> It's probably more of an Arrow issue, however the Join Error looks weird to me 🤔

I think this means that one of the tasks running the plan panicked.

Perhaps you can run your test case with RUST_BACKTRACE=1 set? It might dump the whole stack trace for the panic.
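To illustrate why a panic shows up as a "Join Error": when work runs on a separate task or thread, the panic is caught at the task boundary and handed to whoever joins on it as an error value. The sketch below uses plain `std::thread` as a stand-in for the tokio tasks DataFusion spawns (an assumption made to keep it dependency-free); `run_isolated` is a hypothetical helper, not part of DataFusion or tokio.

```rust
use std::thread;

/// Hypothetical helper: runs `work` on its own thread and turns a panic
/// inside it into an `Err`, much like a join handle reports a panicked task.
fn run_isolated<F>(work: F) -> Result<i64, String>
where
    F: FnOnce() -> i64 + Send + 'static,
{
    thread::spawn(work).join().map_err(|payload| {
        // The panic payload is usually a &str or a String.
        let msg = payload
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .or_else(|| payload.downcast_ref::<String>().cloned())
            .unwrap_or_else(|| "unknown panic".to_string());
        format!("task panicked: {msg}")
    })
}

fn main() {
    // A task that finishes normally yields its value.
    println!("{:?}", run_isolated(|| 42));
    // A task that panics surfaces as an error at the join point.
    println!(
        "{:?}",
        run_isolated(|| -> i64 { panic!("attempt to multiply with overflow") })
    );
}
```

The caller only sees that the join failed, which is why the original panic message is visible in the backtrace but not in the top-level error.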

@DDtKey
Contributor Author

DDtKey commented Jan 11, 2023

> | Utf8("1677-06-14T07:29:01.256") |
> +---------------------------------+
> | 2262-01-03T07:03:34.965551616   |

@alamb But in your example the year 2262 doesn't look correct 🤔 It's probably even worse: a silent error.

Well, I tested it against commit 39d98f8f4528f408c3cc8a03ee1fe7ecd990a35f (master branch) with the following code:

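use datafusion::error::Result;
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> Result<()> {
    let path = "my/path/test.csv";
    let ctx = SessionContext::new();
    // Register the CSV file; the column types are inferred from the data.
    ctx.register_csv("inventions", path, CsvReadOptions::default())
        .await?;

    let data_frame = ctx
        .sql("SELECT d.invented_at as year FROM inventions d ORDER BY d.invented_at")
        .await?;

    data_frame.show().await?;
    Ok(())
}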

And here is the log with `RUST_BACKTRACE` set (spoiler):

thread 'tokio-runtime-worker' panicked at 'attempt to multiply with overflow', /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.23/src/naive/datetime/mod.rs:426:21
stack backtrace:
   0: rust_begin_unwind
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:575:5
   1: core::panicking::panic_fmt
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:65:14
   2: core::panicking::panic
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panicking.rs:115:5
   3: chrono::naive::datetime::NaiveDateTime::timestamp_nanos
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.23/src/naive/datetime/mod.rs:426:21
   4: arrow_cast::parse::string_to_timestamp_nanos
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-cast-29.0.0/src/parse.rs:101:19
   5: <arrow_array::types::TimestampNanosecondType as arrow_cast::parse::Parser>::parse
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-cast-29.0.0/src/parse.rs:268:9
   6: arrow_csv::reader::parse_item
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-csv-29.0.0/src/reader.rs:706:5
   7: arrow_csv::reader::build_primitive_array::{{closure}}
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-csv-29.0.0/src/reader.rs:896:30
   8: core::iter::adapters::map::map_try_fold::{{closure}}
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/adapters/map.rs:91:28
   9: <core::iter::adapters::enumerate::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold::enumerate::{{closure}}
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/adapters/enumerate.rs:85:27
  10: core::iter::traits::iterator::Iterator::try_fold
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/traits/iterator.rs:2238:21
  11: <core::iter::adapters::enumerate::Enumerate<I> as core::iter::traits::iterator::Iterator>::try_fold
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/adapters/enumerate.rs:91:9
  12: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/adapters/map.rs:117:9
  13: <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::try_fold
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/adapters/mod.rs:195:9
  14: core::iter::traits::iterator::Iterator::try_for_each
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/traits/iterator.rs:2299:9
  15: <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::next
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/adapters/mod.rs:178:9
  16: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::next
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/adapters/map.rs:103:9
  17: <arrow_buffer::buffer::immutable::Buffer as core::iter::traits::collect::FromIterator<T>>::from_iter
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-buffer-29.0.0/src/buffer/immutable.rs:338:32
  18: core::iter::traits::iterator::Iterator::collect
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/traits/iterator.rs:1836:9
  19: <arrow_array::array::primitive_array::PrimitiveArray<T> as core::iter::traits::collect::FromIterator<Ptr>>::from_iter
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-array-29.0.0/src/array/primitive_array.rs:881:30
  20: core::iter::traits::iterator::Iterator::collect
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/traits/iterator.rs:1836:9
  21: <core::result::Result<V,E> as core::iter::traits::collect::FromIterator<core::result::Result<A,E>>>::from_iter::{{closure}}
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:2075:49
  22: core::iter::adapters::try_process
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/adapters/mod.rs:164:17
  23: <core::result::Result<V,E> as core::iter::traits::collect::FromIterator<core::result::Result<A,E>>>::from_iter
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:2075:9
  24: core::iter::traits::iterator::Iterator::collect
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/traits/iterator.rs:1836:9
  25: arrow_csv::reader::build_primitive_array
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-csv-29.0.0/src/reader.rs:885:5
  26: arrow_csv::reader::parse::{{closure}}
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-csv-29.0.0/src/reader.rs:619:21
  27: core::iter::adapters::map::map_try_fold::{{closure}}
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/adapters/map.rs:91:28
  28: core::iter::traits::iterator::Iterator::try_fold
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/traits/iterator.rs:2238:21
  29: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/adapters/map.rs:117:9
  30: <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::try_fold
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/adapters/mod.rs:195:9
  31: core::iter::traits::iterator::Iterator::try_for_each
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/traits/iterator.rs:2299:9
  32: <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::next
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/adapters/mod.rs:178:9
  33: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/vec/spec_from_iter_nested.rs:26:32
  34: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/vec/spec_from_iter.rs:33:9
  35: <alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/alloc/src/vec/mod.rs:2757:9
  36: core::iter::traits::iterator::Iterator::collect
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/traits/iterator.rs:1836:9
  37: <core::result::Result<V,E> as core::iter::traits::collect::FromIterator<core::result::Result<A,E>>>::from_iter::{{closure}}
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:2075:49
  38: core::iter::adapters::try_process
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/adapters/mod.rs:164:17
  39: <core::result::Result<V,E> as core::iter::traits::collect::FromIterator<core::result::Result<A,E>>>::from_iter
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/result.rs:2075:9
  40: core::iter::traits::iterator::Iterator::collect
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/iter/traits/iterator.rs:1836:9
  41: arrow_csv::reader::parse
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-csv-29.0.0/src/reader.rs:543:44
  42: <arrow_csv::reader::Reader<R> as core::iter::traits::iterator::Iterator>::next
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/arrow-csv-29.0.0/src/reader.rs:513:22
  43: <futures_util::stream::iter::Iter<I> as futures_core::stream::Stream>::poll_next
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.25/src/stream/iter.rs:43:21
  44: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-core-0.3.25/src/stream.rs:120:9
  45: futures_util::stream::stream::StreamExt::poll_next_unpin
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.25/src/stream/stream/mod.rs:1626:9
  46: datafusion::physical_plan::file_format::file_stream::FileStream<F>::poll_inner
             at /Users/ddtkey/.cargo/git/checkouts/arrow-datafusion-71ae82d9dec9a01c/39d98f8/datafusion/core/src/physical_plan/file_format/file_stream.rs:248:35
  47: <datafusion::physical_plan::file_format::file_stream::FileStream<F> as futures_core::stream::Stream>::poll_next
             at /Users/ddtkey/.cargo/git/checkouts/arrow-datafusion-71ae82d9dec9a01c/39d98f8/datafusion/core/src/physical_plan/file_format/file_stream.rs:295:22
  48: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-core-0.3.25/src/stream.rs:120:9
  49: futures_util::stream::stream::StreamExt::poll_next_unpin
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.25/src/stream/stream/mod.rs:1626:9
  50: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.25/src/stream/stream/next.rs:32:9
  51: datafusion::physical_plan::repartition::RepartitionExec::pull_from_input::{{closure}}
             at /Users/ddtkey/.cargo/git/checkouts/arrow-datafusion-71ae82d9dec9a01c/39d98f8/datafusion/core/src/physical_plan/repartition.rs:481:39
  52: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/future/mod.rs:91:19
  53: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/core.rs:223:17
  54: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/loom/std/unsafe_cell.rs:14:9
  55: tokio::runtime::task::core::Core<T,S>::poll
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/core.rs:212:13
  56: tokio::runtime::task::harness::poll_future::{{closure}}
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/harness.rs:476:19
  57: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panic/unwind_safe.rs:271:9
  58: std::panicking::try::do_call
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:483:40
  59: ___rust_try
  60: std::panicking::try
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:447:19
  61: std::panic::catch_unwind
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panic.rs:137:14
  62: tokio::runtime::task::harness::poll_future
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/harness.rs:464:18
  63: tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/harness.rs:198:27
  64: tokio::runtime::task::harness::Harness<T,S>::poll
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/harness.rs:152:15
  65: tokio::runtime::task::raw::poll
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/raw.rs:255:5
  66: tokio::runtime::task::raw::RawTask::poll
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/raw.rs:200:18
  67: tokio::runtime::task::LocalNotified<S>::run
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/mod.rs:394:9
  68: tokio::runtime::scheduler::multi_thread::worker::Context::run_task::{{closure}}
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/scheduler/multi_thread/worker.rs:464:13
  69: tokio::runtime::coop::with_budget
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/coop.rs:102:5
  70: tokio::runtime::coop::budget
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/coop.rs:68:5
  71: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/scheduler/multi_thread/worker.rs:463:9
  72: tokio::runtime::scheduler::multi_thread::worker::Context::run
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/scheduler/multi_thread/worker.rs:426:24
  73: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/scheduler/multi_thread/worker.rs:406:17
  74: tokio::macros::scoped_tls::ScopedKey<T>::set
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/macros/scoped_tls.rs:61:9
  75: tokio::runtime::scheduler::multi_thread::worker::run
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/scheduler/multi_thread/worker.rs:403:5
  76: tokio::runtime::scheduler::multi_thread::worker::Launch::launch::{{closure}}
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/scheduler/multi_thread/worker.rs:365:45
  77: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/blocking/task.rs:42:21
  78: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/core.rs:223:17
  79: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/loom/std/unsafe_cell.rs:14:9
  80: tokio::runtime::task::core::Core<T,S>::poll
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/core.rs:212:13
  81: tokio::runtime::task::harness::poll_future::{{closure}}
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/harness.rs:476:19
  82: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/core/src/panic/unwind_safe.rs:271:9
  83: std::panicking::try::do_call
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:483:40
  84: ___rust_try
  85: std::panicking::try
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panicking.rs:447:19
  86: std::panic::catch_unwind
             at /rustc/69f9c33d71c871fc16ac445211281c6e7a340943/library/std/src/panic.rs:137:14
  87: tokio::runtime::task::harness::poll_future
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/harness.rs:464:18
  88: tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/harness.rs:198:27
  89: tokio::runtime::task::harness::Harness<T,S>::poll
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/harness.rs:152:15
  90: tokio::runtime::task::raw::poll
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/raw.rs:255:5
  91: tokio::runtime::task::raw::RawTask::poll
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/raw.rs:200:18
  92: tokio::runtime::task::UnownedTask<S>::run
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/task/mod.rs:431:9
  93: tokio::runtime::blocking::pool::Task::run
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/blocking/pool.rs:159:9
  94: tokio::runtime::blocking::pool::Inner::run
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/blocking/pool.rs:511:17
  95: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
             at /Users/ddtkey/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.24.1/src/runtime/blocking/pool.rs:469:13
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Error: Arrow error: External error: Arrow error: External error: Join Error
caused by
External error: task 46 panicked

So it panics on overflow: attempt to multiply with overflow

And it works with datafusion = 15.0.0 for me:

+-------------------------+
| year                    |
+-------------------------+
| 1677-06-14T07:29:01.256 |
+-------------------------+

@alamb
Contributor

alamb commented Jan 11, 2023

> @alamb But in your example the year 2262 doesn't look correct 🤔 It's probably even worse: a silent error.

I bet you are right that the calculation silently overflows, but only on release builds.

I previously tested using a release build (where the overflow is not checked) for 16 but a debug build for 15.
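For illustration, here is a minimal sketch of that arithmetic using only the standard library. It is not the chrono source; `naive_timestamp_nanos` is a hypothetical name, and the seconds value for 1677-06-14T07:29:01 (treated as UTC) is my own calculation. `overflowing_mul` returns the wrapped product that a release build would silently produce, together with a flag marking the overflow that a debug build turns into a panic.

```rust
/// Hypothetical stand-in for a "seconds * 1e9 + sub-second nanos" conversion.
/// Returns the wrapped nanosecond value and whether the multiply overflowed.
fn naive_timestamp_nanos(secs: i64, subsec_nanos: i64) -> (i64, bool) {
    let (whole, overflowed) = secs.overflowing_mul(1_000_000_000);
    (whole.wrapping_add(subsec_nanos), overflowed)
}

fn main() {
    // 1677-06-14T07:29:01.256, assuming UTC: whole seconds relative to the
    // Unix epoch (rounded down) plus the non-negative sub-second part.
    let secs: i64 = -9_231_899_459;
    let subsec_nanos: i64 = 256_000_000;

    let (nanos, overflowed) = naive_timestamp_nanos(secs, subsec_nanos);
    println!("overflowed = {overflowed}, wrapped nanos = {nanos}");
}
```

The wrapped value is 9214844614965551616 ns after the epoch, which is 2262-01-03T07:03:34.965551616, the same value the release build printed above.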

When I tested both on debug builds:

# on master at 315f5b9848d3a1218ded9a1113ac4e8e8a90272f
$ cargo run
     Running `/Users/alamb/Software/target-df2/debug/datafusion-cli`
DataFusion CLI v16.0.0
❯ select '1677-06-14T07:29:01.256'::timestamp;
thread 'main' panicked at 'attempt to multiply with overflow', /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.23/src/naive/datetime/mod.rs:426:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Then I checked out 15.0.0:

$ git checkout  15.0.0
$ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.28s
     Running `/Users/alamb/Software/target-df2/debug/datafusion-cli`
DataFusion CLI v15.0.0
❯ select '1677-06-14T07:29:01.256'::timestamp
;
thread 'main' panicked at 'attempt to multiply with overflow', /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/chrono-0.4.23/src/naive/datetime/mod.rs:426:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

So my conclusion is:

  1. This is definitely a bug
  2. The same problem existed in 15.0.0 (so it is not a regression)
  3. The issue is upstream in arrow; I will file a ticket shortly

@alamb
Contributor

alamb commented Jan 11, 2023

Upstream issue: apache/arrow-rs#3512

@DDtKey
Contributor Author

DDtKey commented Jan 11, 2023

> The same problem existed in 15.0.0 (so it is not a regression)

I can't reproduce it even with the --release flag. It works totally fine with the version from crates.io (datafusion = "15.0.0") and the code provided above.

cargo run --release

+-------------------------+
| year                    |
+-------------------------+
| 1677-06-14T07:29:01.256 |
+-------------------------+

There is no overflow at all; the year is correct, and it works in both release and debug mode 🤔

Is there any difference between your git checkout of 15.0.0 and the version published on crates.io?

@comphead
Contributor

@DDtKey please check the comment apache/arrow-rs#3512 (comment)

@alamb
Contributor

alamb commented Jan 12, 2023

> I can't reproduce it even with the --release flag. It works totally fine with the version from crates.io (datafusion = "15.0.0") and the code provided above.

🤔 Maybe it is because you are in a different timezone than I was trying it in

@DDtKey
Contributor Author

DDtKey commented Jan 12, 2023

> I can't reproduce it even with the --release flag. It works totally fine with the version from crates.io (datafusion = "15.0.0") and the code provided above.

> 🤔 Maybe it is because you are in a different timezone than I was trying it in

Well, the reason is actually type inference: with an explicit schema I was able to reproduce the panic on 15.0.0.
Sorry for the misleading info.

So yeah, it isn't a regression, and I'll change the title.

@DDtKey DDtKey changed the title Regression: date before 1678 causes panic Date before 1678 causes panic Jan 12, 2023
@alamb
Contributor

alamb commented Jan 12, 2023

Thank you @DDtKey

@alamb
Contributor

alamb commented Feb 2, 2023

I believe this has been fixed now:

DataFusion CLI v17.0.0
❯  select '1677-06-14T07:29:01.256'::timestamp;
ArrowError(ParseError("The dates that can be represented as nanoseconds have to be between 1677-09-21T00:12:44.0 and 2262-04-11T23:47:16.854775804"))
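As a rough sketch of how a conversion can report an out-of-range timestamp as an error rather than panicking or wrapping, checked arithmetic is enough. This is not the arrow-rs implementation; `timestamp_nanos` and its error text are hypothetical, and the input seconds value is my own calculation for the timestamp in this issue (treated as UTC).

```rust
/// Hypothetical checked conversion from (seconds, sub-second nanos) to
/// nanoseconds since the Unix epoch. Returns an error when the value does
/// not fit in an i64.
fn timestamp_nanos(secs: i64, subsec_nanos: u32) -> Result<i64, String> {
    secs.checked_mul(1_000_000_000)
        .and_then(|ns| ns.checked_add(i64::from(subsec_nanos)))
        .ok_or_else(|| {
            format!("{secs} s + {subsec_nanos} ns cannot be represented as i64 nanoseconds")
        })
}

fn main() {
    // 1677-06-14T07:29:01.256 (assumed UTC) is out of range: Err, no panic.
    println!("{:?}", timestamp_nanos(-9_231_899_459, 256_000_000));
    // A value near the epoch converts normally.
    println!("{:?}", timestamp_nanos(1, 5));
}
```

Because the multiplication is checked before the sub-second part is added, this sketch is slightly conservative at the lower bound: it rejects the final fraction of a second just above `i64::MIN` nanoseconds even though those instants are technically representable.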

Please let me know if you disagree

@alamb alamb closed this as completed Feb 2, 2023