crash in parquet read #459
Comments
The application crashes on some parquet files. I use polars, but judging from the stack trace it seems to make sense to discuss it here. |
pyspark and pandas read that parquet file without noticeable problems. |
A minimal working example would be really valuable here. Have you got a code snippet and the parquet file? |
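One way to narrow a report like this down to a single file is to run the reader over each candidate and record which calls panic. The sketch below uses only the standard library; `find_panicking_files` and the `read` callback are hypothetical names, and the callback stands in for whatever function actually reads a parquet file. It assumes the binary is built with the default unwinding panic strategy (with `panic = "abort"` nothing can be caught).

```rust
use std::panic;

/// Calls `read` on every path and returns the paths for which it panicked.
/// `read` is a stand-in for the real reader call (hypothetical).
fn find_panicking_files<F>(paths: &[&str], read: F) -> Vec<String>
where
    F: Fn(&str) + panic::RefUnwindSafe,
{
    let mut panicking = Vec::new();
    for &path in paths {
        // catch_unwind turns a panic inside `read` into an Err value,
        // so one bad file does not take down the whole run.
        if panic::catch_unwind(|| read(path)).is_err() {
            panicking.push(path.to_string());
        }
    }
    panicking
}
```

The panic message and backtrace are still printed to stderr for each failing file, so the output can be attached to the report together with the offending file.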
` fn read_parquet(fname: &str) -> std::result::Result<DataFrame, PolarsError> { fn main() {
} |
@ritchie46 here is a link to the file https://drive.google.com/file/d/1pNxrGcErwKCx3wG8Nb-Yns8faHis_BW_/view?usp=sharing |
Thanks! I am taking a look |
Found the root cause and fixed it upstream: jorgecarleitao/parquet2#53. I will try to release a patch soon so that we can benefit from it here and in polars. |
I am closing this, as its root was in |
For reference, to check, I used
(column 11, row group 2) |
15: 0x5595b6619e4d - core::panicking::panic::h344f23ad26057b48
at /rustc/c8dfcfe046a7680554bf4eb612bad840e7631c4b/library/core/src/panicking.rs:50:5
16: 0x5595b6e21f57 - parquet2::encoding::hybrid_rle::read_next::heffbce2cbf271355
17: 0x5595b6e22002 - parquet2::encoding::hybrid_rle::HybridRleDecoder::new::h4bed53f1cf574f45
18: 0x5595b66bc788 - arrow2::io::parquet::read::binary::basic::iter_to_array::h904f7ce8e6ca0d40
19: 0x5595b667374a - arrow2::io::parquet::read::page_iter_to_array::hf6ec1a66c97269d8
20: 0x5595b667ebc4 - <core::iter::adapters::enumerate::Enumerate as core::iter::traits::iterator::Iterator>::try_fold::hc0a8230807788873
21: 0x5595b667f320 - <arrow2::io::parquet::read::record_batch::RecordReader as core::iter::traits::iterator::Iterator>::next::hde2d9e82a688b0f8
22: 0x5595b666f160 - polars_io::finish_reader::h3777ac1c190f89b8
23: 0x5595b667dc93 - <polars_io::parquet::ParquetReader as polars_io::SerReader>::finish::hc1e08eb39eb006fc
24: 0x5595b6669511 - core::ops::function::impls::<impl core::ops::function::FnMut for &F>::call_mut::h9969795e932f825a
25: 0x5595b6681e6e - rayon::iter::plumbing::Folder::consume_iter::h92fcd41992ba09a0
26: 0x5595b6685edf - rayon::iter::plumbing::bridge_producer_consumer::helper::h8ef2ea33830b94cf
27: 0x5595b666937f - std::panicking::try::hc5f2657442e028eb
28: 0x5595b66a710d - rayon_core::registry::in_worker::ha0a928d462944c85
29: 0x5595b6685fde - rayon::iter::plumbing::bridge_producer_consumer::helper::h8ef2ea33830b94cf
30: 0x5595b666937f - std::panicking::try::hc5f2657442e028eb
31: 0x5595b66a710d - rayon_core::registry::in_worker::ha0a928d462944c85
32: 0x5595b6685fde - rayon::iter::plumbing::bridge_producer_consumer::helper::h8ef2ea33830b94cf
33: 0x5595b667d91b - <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute::hff5168451b9879a8
34: 0x5595b6611821 - rayon_core::registry::WorkerThread::wait_until_cold::he518e83f6505630f
35: 0x5595b6bbafbd - rayon_core::registry::ThreadBuilder::run::h5d56f209b153a9bd
36: 0x5595b6bbc8e5 - std::sys_common::backtrace::__rust_begin_short_backtrace::h69cdd24543799db2
37: 0x5595b6bb8a9b - core::ops::function::FnOnce::call_once{{vtable.shim}}::hd2e6efae7943bf64
38: 0x5595b6fd5207 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce>::call_once::h6bff7798948b1075
at /rustc/c8dfcfe046a7680554bf4eb612bad840e7631c4b/library/alloc/src/boxed.rs:1572:9
39: 0x5595b6fd5207 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce>::call_once::hc2d25ac38f6b2342
at /rustc/c8dfcfe046a7680554bf4eb612bad840e7631c4b/library/alloc/src/boxed.rs:1572:9
40: 0x5595b6fd5207 - std::sys::unix::thread::Thread::new::thread_start::hbba5bc368baac205
at /rustc/c8dfcfe046a7680554bf4eb612bad840e7631c4b/library/std/src/sys/unix/thread.rs:74:17
41: 0x7f48289bb609 - start_thread
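The panic in the trace above is raised inside `parquet2::encoding::hybrid_rle::read_next`. For readers unfamiliar with that encoding, here is a minimal sketch of how a run header of the Parquet RLE/bit-packing hybrid is laid out: a ULEB128 varint whose lowest bit selects the run type. This is not the parquet2 code and not the upstream fix; `Run`, `read_uleb128` and `read_run_header` are hypothetical names, and the sketch only shows a header reader that returns `None` on empty or truncated input instead of panicking.

```rust
/// Kind of run announced by a hybrid RLE header (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Run {
    /// `groups` groups of 8 bit-packed values follow.
    BitPacked { groups: u64 },
    /// A single value repeated `len` times follows.
    Rle { len: u64 },
}

/// Reads a ULEB128 varint from the start of `buf`.
/// Returns the value and the number of bytes consumed,
/// or `None` if the buffer is empty, truncated, or longer than 10 bytes.
fn read_uleb128(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Decodes one run header: lowest bit 1 means bit-packed, 0 means RLE;
/// the remaining bits carry the group count or the run length.
fn read_run_header(buf: &[u8]) -> Option<(Run, usize)> {
    let (header, used) = read_uleb128(buf)?;
    let run = if header & 1 == 1 {
        Run::BitPacked { groups: header >> 1 }
    } else {
        Run::Rle { len: header >> 1 }
    };
    Some((run, used))
}
```

For example, the single byte `0x03` announces one bit-packed group of 8 values, while `0x08` announces an RLE run of length 4. An empty slice yields `None`, which the caller can turn into an error rather than a panic.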