Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor ipc reading code into methods on ArrayReader #7006

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Jan 21, 2025

Note this PR includes a bunch of whitespace changes that make it look like a much larger change than it is

Viewing the diff by ignoring whitespace makes the structure more clear I think

Which issue does this PR close?

Rationale for this change

@totoroyyb and I are working on adding another flag to the ipc reader code that allows disabling validation during read (see #6938). If the flag is true it will be unsafe.

The current structure of the code as a bunch of free functions makes it impossible to pass the parameter down without having to mark all the inner functions unsafe, which is not correct.

What changes are included in this PR?

This PR refactors the code that currently takes an ArrayReader as a parameter
and makes it a method on the function. This has the nice property that
we don't have to pass require_alignment as a flag anymore and it sets us up very nicely
to be able to add disable_validation in a follow on PR

Are there any user-facing changes?

There are no intended changes in this PR -- all changes are to internal structures only.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jan 21, 2025
.map(|_| reader.next_buffer())
.collect::<Result<Vec<_>, _>>()?;
create_primitive_array(
impl ArrayReader<'_> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This illustrates the point of this PR: remove the need to pass require_alignment as an argument.

It does this by this code into a method on ArrayReader which means require_alignment is now passed as a field.

require_alignment is not a problem, but skip_validation is, for the reasons @tustvold states on #6938 (comment)

field: &Field,
variadic_counts: &mut VecDeque<i64>,
) -> Result<ArrayRef, ArrowError> {
let reader = self;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I aliased self --> reader to minimize the diff. I can remove this rename as a follow on PR

Copy link
Contributor Author

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still a few other methods that need to be moved into ArrayReader (e.g. anything that has a require_alignment flag

}

impl<'a> ArrayReader<'a> {
/// Create a reader for decoding arrays from an encoded [`RecordBatch`]
fn try_new(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is refactored code from read_record_batch_impl

metadata: &MetadataVersion,
require_alignment: bool,
) -> Result<RecordBatch, ArrowError> {
let buffers = batch.buffers().ok_or_else(|| {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code in this method was split across ArrayReader::try_new and ArrayReader::read_record_batch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant