Created by `brew bump`
Created with `brew bump-formula-pr`.

Release notes
- `pro`: new command to allow qsv to interact with the qsv pro API to tap into qsv pro exclusive features.
- `lens`: new command to interactively view CSVs using the csvlens crate.
- `diff` command is now easier to use with its `--drop-equal-fields` option. @janriemer continues to work on his `csv-diff` crate, and there are more `diff` UX improvements coming soon!
- `stats` adds `sum_length` and `avg_length` "streaming" statistics in addition to the existing `min_length` and `max_length` metrics. These are especially useful for datasets with a lot of "free text" columns.
- `stats` also got "smarter" and "faster" by dog-fooding its own statistics to make it run faster!

It's a little complicated, but the way `stats` works is that it compiles the "streaming" statistics on the fly first, and the more expensive advanced statistics are "lazily" computed at the end.

Since we now compile "sort order" in a streaming manner, we use this info when deriving cardinality at the end to see if we can skip sorting, an otherwise necessary step to get cardinality, which is done by "scanning" all the sorted values of a column. Every time two neighboring values differ in a sorted column, the scan increments the cardinality count.
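The flow described above can be illustrated with a small sketch. This is not qsv's code (qsv is written in Rust); `ColumnStats` and its fields are hypothetical names chosen for illustration, and lengths are counted in characters here, which may differ from how qsv measures them. It only shows the idea: cheap statistics and the sort order are updated per value, and the cardinality scan at the end sorts only when the stream was not already in order.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ColumnStats:
    """Hypothetical per-column accumulator; not qsv's actual data structure."""
    count: int = 0
    min_length: Optional[int] = None
    max_length: int = 0
    sum_length: int = 0
    is_sorted: bool = True  # "sort order", tracked while streaming
    values: List[str] = field(default_factory=list)  # kept for the lazy stats
    _prev: Optional[str] = None

    def add(self, value: str) -> None:
        """Update the streaming statistics with one value."""
        n = len(value)
        self.count += 1
        self.sum_length += n
        self.max_length = max(self.max_length, n)
        self.min_length = n if self.min_length is None else min(self.min_length, n)
        if self._prev is not None and value < self._prev:
            self.is_sorted = False  # one out-of-order pair is enough
        self._prev = value
        self.values.append(value)

    @property
    def avg_length(self) -> float:
        return self.sum_length / self.count if self.count else 0.0

    def cardinality(self) -> int:
        """Computed lazily at the end: sort only if the stream was unsorted."""
        if not self.values:
            return 0
        ordered = self.values if self.is_sorted else sorted(self.values)
        distinct = 1
        for prev, cur in zip(ordered, ordered[1:]):
            if cur != prev:  # neighboring values differ
                distinct += 1
        return distinct
```

Feeding the values `apple`, `apple`, `fig`, `kiwi` through `add` leaves `is_sorted` true, so `cardinality()` scans the values as they arrived and never calls `sorted`.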
Apart from this "sort order" optimization, we also improved the "cardinality scan" algorithm - halving its memory footprint and making it faster still for larger datasets by parallelizing the computation!
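One way a cardinality scan over sorted values can be split across workers is sketched below. The release notes do not describe the actual algorithm, so this is an assumption for illustration only: `count_boundaries`, `parallel_cardinality`, and the `chunk_size` default are hypothetical, and Python threads stand in for whatever parallelism qsv really uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def count_boundaries(chunk: Sequence[str]) -> int:
    """Count positions inside one chunk where neighboring values differ."""
    return sum(1 for prev, cur in zip(chunk, chunk[1:]) if cur != prev)


def parallel_cardinality(sorted_values: List[str], chunk_size: int = 1_000) -> int:
    """Distinct count of an already sorted list, scanned chunk by chunk.

    chunk_size is an arbitrary illustrative default, not a qsv setting.
    """
    if not sorted_values:
        return 0
    chunks = [sorted_values[i:i + chunk_size]
              for i in range(0, len(sorted_values), chunk_size)]
    with ThreadPoolExecutor() as pool:
        inner = sum(pool.map(count_boundaries, chunks))
    # Changes that fall exactly on a chunk border are not seen by any worker,
    # so compare the last value of each chunk with the first of the next.
    seams = sum(1 for left, right in zip(chunks, chunks[1:])
                if left[-1] != right[0])
    return 1 + inner + seams
```

The result equals one plus the number of places where neighboring values differ, whether the difference falls inside a chunk or on the seam between two chunks. In CPython the threads mostly illustrate the decomposition rather than deliver a real speedup.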
This, in turn, makes the `frequency` command faster and more memory efficient!
- We now use our own optimized fork of the `csv` crate, featuring SIMD-accelerated UTF-8 validation and other minor perf tweaks, making the entire qsv suite faster still!

Added
- `pro`: add `qsv pro` command to interact with qsv pro API by @rzmk in dathere/qsv#2039
- `lens`: new command to interactively view CSVs using the csvlens crate dathere/qsv#2117
- `apply`: add crc32 operation dathere/qsv#2121
- `count`: add `--delimiter` option dathere/qsv#2120
- `diff`: add flag `--drop-equal-fields` by @janriemer in dathere/qsv#2114
- `stats`: add `sum_length` and `avg_length` columns dathere/qsv#2113
- `stats`: smarter cardinality computation - added new parallel algorithm for large datasets (10,000+ rows) and updated sequential algorithm for smaller datasets dathere/qsv@4e63fec

Changed
- `count`: added comment to justify magic number dathere/qsv@5241e39
- `stats`: use simdjson for faster JSONL parsing; micro-optimize `compute` hot loop dathere/qsv@0e8b734
- `stats`: standardized OVERFLOW and UNDERFLOW messages dathere/qsv@38c6128
- `sort`: renamed symbol to eliminate devskim lint false positive warning dathere/qsv@12db739
- enable `lens` feature in GH workflows dathere/qsv#2122
- `deps`: bump polars 0.42.0 to latest upstream at time of release dathere/qsv@3c17ed1
- `deps`: use our own optimized fork of csv crate, with simdutf8 validation and other minor perf tweaks dathere/qsv@e4bcd71

Fixed
- `schema`: Print an error if the `qsv stats` invocation fails by @abrauchli in dathere/qsv#2110

New Contributors
- `schema`: Print an error if the `qsv stats` invocation fails dathere/qsv#2110

Full Changelog: dathere/qsv@0.133.1...0.134.0