-
Notifications
You must be signed in to change notification settings - Fork 311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resolve into_shape() limitations #390
Comments
The drawback 2 seems critically bad, but at the same time, it's in the interest of not biasing too much in favour of the default c order. |
The issue is that in a large program, it can be difficult to determine the memory layout of an array by just reading the source code, so the result of I really like the way NumPy's The challenge with directly applying NumPy's enum TmpViewRepr<'a, A> {
View(ViewRepr<&'a A>),
Temporary(OwnedRepr<A>),
}
impl<'a, A> Data for TmpViewRepr<'a, A> {
// ...
} which can represent a read-only view of the data or a temporary copy. (This could also be called enum TmpViewMutRepr<'a, A> {
View(ViewRepr<&'a mut A>),
Temporary {
tmp: OwnedRepr<A>,
drop_hook: Box<FnMut(&mut Self)>
},
}
impl<'a, A> Data for TmpViewMutRepr<'a, A> {
// ...
}
impl<'a, A> DataMut for TmpViewMutRepr<'a, A> {
// ...
} which can represent a mutable view of the data or a temporary copy. When this type gets dropped, if it's the These types are useful not only for improving What do you think? Edit: Yet another place where |
Just a brief note, I'm sorry; I've been swamped this week. RcArray is much more like numpy's array :-P But in the end we went for the tempting Rust model of Array vs views. So a solution RcArray could also be an option; note that your "Temp" should probably be called Cow. |
Now that #632 is merged, we can add an I think we should require a logical order parameter (similar to NumPy's // row-major order
arr.as_shape((3, 4), Order::RowMajor)?
// column-major order
arr.as_shape((3, 4), Order::ColMajor)? and the API would be /// This is analogous to the `order` parameter in
/// [`numpy.reshape()`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html#numpy.reshape).
pub enum Order {
/// C-like order
RowMajor,
/// Fortran-like order
ColMajor,
#[doc(hidden)]
__NonExhaustive,
}
impl<A, S, D> ArrayBase<S, D>
where
S: RawData<Elem = A>,
D: Dimension,
{
pub fn into_shape<E>(self, shape: E, order: Order) -> Result<ArrayBase<S, E::Dim>, ShapeError>
where
S: DataOwned,
E: IntoDimension,
{
}
pub fn as_shape<E>(&self, shape: E, order: Order) -> Result<CowArray<'_, A, E::Dim>, ShapeError>
where
S: Data,
E: IntoDimension,
{
}
} We could add other variants to Edit: Earlier, I included a second approach, but I later realized that it didn't actually work, so I've removed it from this comment. Thoughts? |
Yes, implementing @jturner314's plan here is important. We shouldn't just add a new method but need to rethink how to handle all of The |
I think an (Edit - is it possible to find a way for |
Regarding the name of Regarding I think I'd prefer to have two separate methods. // Clones elements if necessary.
fn into_shape<E>(self, shape: E, order: Order) -> Result<ArrayBase<S, E::Dim>, ShapeError>
where
A: Clone,
S: DataOwned,
E: IntoDimension,
{}
// Errors if cloning would be necessary.
fn reshape<E>(self, shape: E, order: Order) -> Result<ArrayBase<S, E::Dim>, ShapeError>
where
E: IntoDimension,
{} (Or the names could be swapped. I'm not sure which method is better to call which name.) Calls to the existing |
By implementing a by-value iterator for Array and inserting some "reflection" methods into I'm also looking at builder-type solutions because I think having no default value for the order is very noisy. Having a builder would make it possible to have RowMajor be the default value. A (less than?) nice trick is that the builder can have a method a.into_shape(x).unwrap(); // unwrap the builder to an array - default order is RowMajor
a.into_shape(y).order(Order::ColumnMajor).unwrap(); // override default order
a.into_shape(z).result(); // Get the Result<ArrayBase<_>, _> value |
I actually disagree with this. For most methods (e.g. the constructors) it makes sense to have a default layout/order because the layout doesn't affect the visible results (i.e. the element values). For reshaping though, the order does affect the visible results, so I think it's worthwhile to require an order parameter. Requiring it also encourages the user think carefully about what the reshaping means. (Reshaping can be a confusing operation.) I'm not strongly opposed to making it optional, but IMO the better design is to require it. |
Might be a bit irrelevant for the thread, but I didn't think this warranted it's own issue. With regards to limitation 3, I believe you can't "infer" the correct shape because dimensions are of type a.into_shape(a.raw_dim().reshape(1,2,3,-1)) Would this be an addition that would be approved in the ndarray api? Edit: I guess you are 50* of the way there with something like: a.into_shape(a.raw_dim().reshape(1,2,3,a.raw_dim().as_array_view().product()/(1*2*3))) ` |
and let new_shape = (1, 2, 3, a.len()/(1 * 2 * 3));
a.into_shape(new_shape) Edit: There are some in-band and quite wonky alternatives for a placeholder - instead of
That would look like this: a.into_shape((1, 2, 0, 3)).unwrap() // reshape to 1 x 2 x rest x 3
a.into_shape((1, 2, usize::MAX, 3)).unwrap() // reshape to 1 x 2 x rest x 3 |
@jturner314 It affects the visible results, but it is just a logical order default - and we have that same default across a lot of API, so it would not be out of place here. I would be happy to just have the parameter for an order - that makes the implementation job easier in our crate, less work and less complexity with a builder. But I think having a default and allowing it to be overridden is the best for the common case and is easier to use, and will cause less breakage for users. |
Instead of a builder, it's possible the argument should just be builder like instead. Just like it's done for constructors in fact - a shape and an optional order parameter. Potentially - but it would be messy compared with existing methods - using traits for "overloading" in a different way and accepting both |
the problem was the use of reversed_axes and into_shape as mentioned in rust-ndarray/ndarray#390
Thank you to everyone working on shape transformations! Here is an implementation of
The full code is here: Implemetation of `try_to_shape_mut`The unit tests below were written almost entirely by Copilot, which impressed me./// Helper trait for reshaping `ndarray` arrays mutably. See [issue #390][] for
/// more details on upstream support for this.
///
/// [issue #390]: /~https://github.com/rust-ndarray/ndarray/issues/390
pub trait TryToShapeMut<'a, Elem> {
/// Attempt to reshape a mutable `ndarray` array. This is basically
/// the same as `ArrayBase::to_shape` but for mutable arrays. It will fail
/// if the array is not contiguous in memory.
fn try_to_shape_mut<Shape>(
&'a mut self,
new_shape: Shape,
) -> Result<ArrayViewMut<'a, Elem, Shape::Dim>, ShapeError>
where
Shape: ShapeArg + Into<StrideShape<Shape::Dim>>;
}
impl<'a, Elem, A, D> TryToShapeMut<'a, Elem> for ArrayBase<A, D>
where
A: DataMut<Elem = Elem>,
D: Dimension,
{
fn try_to_shape_mut<Shape>(
&'a mut self,
new_shape: Shape,
) -> Result<ArrayViewMut<'a, Elem, Shape::Dim>, ShapeError>
where
Shape: ShapeArg + Into<StrideShape<Shape::Dim>>,
{
let slice_mut = self
.as_slice_mut()
.ok_or_else(|| ShapeError::from_kind(ErrorKind::IncompatibleLayout))?;
ArrayViewMut::from_shape(new_shape.into(), slice_mut)
}
}
#[cfg(test)]
mod tests {
use super::*;
use ndarray::{arr1, arr2, arr3};
#[test]
fn test_1d() {
let mut a = arr1(&[1, 2, 3, 4, 5, 6]);
let b = a.try_to_shape_mut((3, 2)).unwrap();
assert_eq!(b, arr2(&[[1, 2], [3, 4], [5, 6]]));
}
#[test]
fn test_2d() {
let mut a = arr2(&[[1, 2, 3], [4, 5, 6]]);
let b = a.try_to_shape_mut((3, 2)).unwrap();
assert_eq!(b, arr2(&[[1, 2], [3, 4], [5, 6]]));
}
#[test]
fn test_3d() {
let mut a = arr3(&[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]);
let b = a.try_to_shape_mut((4, 2)).unwrap();
assert_eq!(b, arr2(&[[1, 2], [3, 4], [5, 6], [7, 8]]));
}
#[test]
fn test_3d_fail() {
let mut a = arr3(&[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]);
let b = a.try_to_shape_mut((4, 3));
assert!(b.is_err());
}
#[test]
fn reshape_view_mut() {
let mut a = arr2(&[[1, 2, 3], [4, 5, 6]]);
let mut v = a.view_mut();
let b = v.try_to_shape_mut((3, 2)).unwrap();
assert_eq!(b, arr2(&[[1, 2], [3, 4], [5, 6]]));
}
} |
Fixed by #1310 |
The
.into_shape()
method works in some cases, but compared to NumPy'sreshape()
has the following limitations:.into_shape()
on an array with Fortran memory layout will give a different result from calling.into_shape()
on the logically equivalent array with C memory layout.)-1
to indicate that an axis length should be inferred.)I'd like to add a method or methods to resolve at least limitations 1 and 2.
In order to handle arrays without contiguous memory layout, the data needs to be cloned in some cases. If the input array owns its data, this is simple – just return a new owned array. However, if the input array is an
ArrayView
, then you need to return something likeCow
. If the input array is anArrayViewMut
, you need some way to return a mutable view with the correct shape but make sure that any changes to that view are copied back to the original underlying data.A prototype of one approach is at jturner314/ndarray@0157df8. This introduces a few new types, including
ArrayViewTemp
which handles the temporary array if the input is anArrayView
, andArrayViewMutTemp
, which handles the case where the input is anArrayViewMut
(usingDrop
to make sure the data is written from the temporary array back to the original underlying data).It's worth noting that this type of problem (needing a temporary array in some cases) also occurs for operations other than reshaping. For example, I'm working on a similar approach for
ndarray-linalg
to convert arrays/views to Fortran layout to pass to LAPACK. If something likeArrayViewTemp
andArrayViewMutTemp
get added tondarray
, I can reuse those types inndarray-linalg
for the changing-layout problem.What do you think?
Edit: After thinking about this a little more, I realized:
ArrayViewTemp
andArrayViewMutTemp
enums, it would be cleaner to create new reprs for theS
type parameter inArrayBase
that implementData
/DataMut
.drop_hook
inArrayViewMutTempRepr
to own the view of the original data so that theview
field andDo
type parameter could be removed.The text was updated successfully, but these errors were encountered: