Skip to content
This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Made IPC ZSTD-compressed consumable by pyarrow #675

Merged
merged 1 commit into from
Dec 12, 2021
Merged

Conversation

jorgecarleitao
Copy link
Owner

I do not understand why the level must be zstd's default to work, but it works. This is either a bug on zstd's crate or on pyarrow's ZSTD's compressor :/

@jorgecarleitao jorgecarleitao added the bug Something isn't working label Dec 11, 2021
@codecov
Copy link

codecov bot commented Dec 11, 2021

Codecov Report

Merging #675 (e50f12b) into main (2b86bc9) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #675   +/-   ##
=======================================
  Coverage   69.60%   69.60%           
=======================================
  Files         301      301           
  Lines       16765    16765           
=======================================
  Hits        11670    11670           
  Misses       5095     5095           
Impacted Files Coverage Δ
src/io/ipc/compression.rs 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2b86bc9...e50f12b. Read the comment docs.

@ritchie46
Copy link
Collaborator

That magic number goes somewhere in metadata I assume? Now we aligned with pyarrows 'level'?

@jorgecarleitao
Copy link
Owner Author

Yes, the thing is that the "level" is just a number that tradeoffs performance for compression, so the fact that only one level works is a bit worrisome imo - the ability to read and write ZSTD-compressed data should be independent of the level. This PR just matches the level, but why other numbers do not work is unclear to me.

@ritchie46
Copy link
Collaborator

Hmm.. a bit hard to test whats at fault here. As there are not many ipc implementations. xD

@jorgecarleitao
Copy link
Owner Author

jorgecarleitao commented Dec 12, 2021

exactly! xD The evidence (prior to this PR) is:

  • pyarrow reads what pyarrow writes
  • pyarrow does not read what arrow2 writes
  • arrow2 reads what pyarrow writes
  • arrow2 reads what arrow2 writes

since this only changes the arrow2 writer (and we continue to be able to read it), I am not very worried.

@jorgecarleitao jorgecarleitao merged commit 89921d3 into main Dec 12, 2021
@jorgecarleitao jorgecarleitao deleted the fix_zstd branch December 12, 2021 11:43
@jorgecarleitao jorgecarleitao added enhancement An improvement to an existing feature and removed bug Something isn't working labels Dec 12, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement An improvement to an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants