Codecov Report
@@            Coverage Diff             @@
##             main      #56      +/-   ##
==========================================
- Coverage   76.81%   76.23%   -0.58%
==========================================
  Files         229      238       +9
  Lines       19617    19966     +349
==========================================
+ Hits        15068    15222     +154
- Misses       4549     4744     +195
|
@elferherrera what do you think about this? It is a bit different from DataFusion's current The API would be along the lines of
let scalar = max(array);
match scalar.data_type() {
    DataType::Float64 => scalar.as_any().downcast_ref::<PrimitiveScalar<f64>>(),
    DataType::Int64 | DataType::Date64 | DataType::Timestamp(_, _) | ... => scalar.as_any().downcast_ref::<PrimitiveScalar<i64>>(),
    ...
}
which is the same idiom that arrow2 uses for arrays, where each physical type has a unique in-memory representation. |
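To make the downcast idiom above concrete, here is a minimal, std-only sketch. The `DataType`, `Scalar`, and `PrimitiveScalar` definitions below are simplified, hypothetical stand-ins written for illustration; they are not the definitions in this PR or in arrow2, and `describe` is an invented helper.

```rust
use std::any::Any;
use std::fmt::Debug;

// Hypothetical, reduced set of logical types (assumption: the real enum is much larger).
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int64,
    Date64,
    Float64,
}

// Hypothetical trait object interface: a logical type plus a hook for downcasting.
pub trait Scalar: Debug {
    fn as_any(&self) -> &dyn Any;
    fn data_type(&self) -> &DataType;
}

// One physical representation shared by several logical types
// (for example, Int64 and Date64 are both stored as i64).
#[derive(Debug)]
pub struct PrimitiveScalar<T> {
    value: Option<T>,
    data_type: DataType,
}

impl<T> PrimitiveScalar<T> {
    pub fn new(data_type: DataType, value: Option<T>) -> Self {
        Self { value, data_type }
    }

    pub fn value(&self) -> Option<&T> {
        self.value.as_ref()
    }
}

impl<T: Debug + 'static> Scalar for PrimitiveScalar<T> {
    fn as_any(&self) -> &dyn Any {
        self
    }

    fn data_type(&self) -> &DataType {
        &self.data_type
    }
}

// Match on the logical type, then downcast to the physical type that backs it.
// `describe` is a hypothetical helper, used only to show the pattern.
pub fn describe(scalar: &dyn Scalar) -> String {
    match scalar.data_type() {
        DataType::Float64 => {
            let s = scalar
                .as_any()
                .downcast_ref::<PrimitiveScalar<f64>>()
                .expect("Float64 is backed by f64 in this sketch");
            format!("f64: {:?}", s.value())
        }
        DataType::Int64 | DataType::Date64 => {
            let s = scalar
                .as_any()
                .downcast_ref::<PrimitiveScalar<i64>>()
                .expect("Int64 and Date64 are backed by i64 in this sketch");
            format!("i64: {:?}", s.value())
        }
    }
}
```

In this sketch the `match` arms group logical types by their physical representation, so a consumer writes one downcast per physical type rather than one per logical type.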
I like this approach, and using a trait instead of an enum offers flexibility with implementing a possible |
@jorgecarleitao I also like this approach. It makes |
@elferherrera, I think that we have the exact same problem as |
@jorgecarleitao I'm lost on this one. Do you mean the same issue we had with |
In the columnar database system, But I think to take Thus the API could be much cleaner, and everything in compute kernels is |
Hey, thanks for the feedback. I understand the idea. NumPy uses the same approach with the I see three concerns with a "constant Array": The first is that we are effectively extending the core trait with an out-of-spec encoding: The second is that With scalars, we can write The third concern is that all our dynamic operators use The scalar API proposed here is mostly used to simplify consumers' code with an API like The second motivation is to make the implementation of the |
This PR adds support for scalar values, the dimension 0 version of an Array. See README with the design choices.
It does not enforce any in-memory specification, but their struct alignments are semantically equivalent to arrays.
The rationale for this scalar is that it allows some of our compute operators to be written as generic functions. The two most relevant use-cases:
Fn(Array) -> Scalar
Fn(Array, Scalar) -> Array
This PR demonstrates the usefulness on the aggregates.
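The two function shapes listed above can be sketched as follows. This is an illustrative, std-only example: `Array` and `Scalar` here are hypothetical type aliases over `i64`, and `max` and `add_scalar` are invented names, not the types or kernels added by this PR.

```rust
/// Hypothetical stand-in for a nullable primitive array.
pub type Array = Vec<Option<i64>>;

/// Hypothetical stand-in for the dimension 0 counterpart: one nullable value.
pub type Scalar = Option<i64>;

/// Shape `Fn(Array) -> Scalar`: an aggregate that reduces an array to one value.
/// Null slots are skipped; an array with no valid values yields a null scalar.
pub fn max(array: &Array) -> Scalar {
    array.iter().flatten().copied().max()
}

/// Shape `Fn(Array, Scalar) -> Array`: apply a scalar to every slot of an array.
/// A null on either side produces a null in the output slot.
pub fn add_scalar(array: &Array, scalar: Scalar) -> Array {
    array
        .iter()
        .map(|slot| match (slot, scalar) {
            (Some(a), Some(b)) => Some(a + b),
            _ => None,
        })
        .collect()
}
```

Because the aggregate returns a scalar and the binary kernel accepts one, the two compose directly, for example `add_scalar(&array, max(&array))`.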