Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating get_row functions to return MatrixSlice #99

Merged
merged 13 commits into from
Dec 11, 2016
Merged

Conversation

AtheMathmo
Copy link
Owner

@AtheMathmo AtheMathmo commented Dec 2, 2016

This is a very incomplete PR. I'm just putting in the request so I can track outstanding work more easily. Still to do:

  • Add column equivalents
  • Update row iterator
  • Rename functions to remove get_ prefix. (Does this make the unchecked functions look weird?)
  • Explore custom types for Row and Col.
  • Benchmarks to compare performance
  • Basic evaluation of vectorization issues. (Do we need to keep casting to contiguous slices?)
  • Indexing for Row and Column types

There is some discussion in issue #87 . There are other ideas here which I would like to explore if I have time.

With this PR I'll try to work towards an agreeable API for these changes.

Also resolved #85 with this PR.

@AtheMathmo
Copy link
Owner Author

A reasonable vectorization test would be:

// With `iter_rows_mut` returning `&mut [T]`
for row in mat.iter_rows_mut() {
    for v in row {
       *v += 2;
   }
}

// Versus

for row in mat.iter_rows_mut() {
    row += 2;
}

@AtheMathmo
Copy link
Owner Author

This PR is now ready for some initial reviews: @Andlon @andrewcsmith if either of you have time I would appreciate it!

I will update the top post above with more information soon. The gist is that I think I've settled on the API, but we still need some more quality of life changes. Some notable ones:

  • IntoIterator for Row and Column
  • Index for Row, etc.
  • Doc improvements
  • Some code reorganization, maybe another PR.
  • Performance specialisations of BaseMatrix for Row and Column. Maybe another PR.

Those are the biggest right now (I think). I'll try to write a complete list up later.

@@ -1,12 +1,11 @@
use std::iter::{ExactSizeIterator, FromIterator};
use std::mem;
use std::slice;
//use std::slice;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before final merge: Remove this?

@@ -89,7 +86,10 @@ impl<'a, T> Iterator for $rows<'a, T> {
unsafe {
// Get pointer and create a slice from raw parts
let ptr = self.slice_start.offset(self.row_pos as isize * self.row_stride);
row = slice::$slice_from_parts(ptr, self.slice_cols);
//row = slice::$slice_from_parts(ptr, self.slice_cols);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before final merge: remove this?

@@ -117,7 +120,10 @@ impl<'a, T> Iterator for $rows<'a, T> {
let row: $row_type;
unsafe {
let ptr = self.slice_start.offset((self.row_pos + n) as isize * self.row_stride);
row = slice::$slice_from_parts(ptr, self.slice_cols);
//row = slice::$slice_from_parts(ptr, self.slice_cols);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before final merge: remove this?

Copy link
Owner Author

@AtheMathmo AtheMathmo Dec 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left these here for easier benchmarking later. Just so I can quickly check if something is funky with this iterator.

Edit: I mean to say that I will indeed remove them, thank you for flagging.

theotherphil pushed a commit to theotherphil/rulinalg that referenced this pull request Dec 4, 2016
* Implement IntoIterator for MatrixSlice

* Add simple compile test for IntoIterator

* Replace new lifetime with explicit lifetime for return value

* Implement IntoIterator for MatrixSliceMut

* Move IntoIterator implementation to iter.rs

Since it is an iterator trait it fits better there

* Test that mutable iterators are mutable
Copy link
Collaborator

@Andlon Andlon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though you are not finished, I think it looks very good already! Will be a very welcome change to the API :)

I've looked really hard, but I can't find any changes to fn row(index) on BaseMatrix anywhere, yet you seem to have changed this, according to the examples... Am I blind here?

In any case, I get the impression from the examples that you still return an Option from the row/row_mut etc. functions. What's your thoughts on this...?

impl_iter_rows!(Rows, &'a [T], from_raw_parts);
impl_iter_rows!(RowsMut, &'a mut [T], from_raw_parts_mut);
// impl_iter_rows!(Rows, &'a [T], from_raw_parts);
// impl_iter_rows!(RowsMut, &'a mut [T], from_raw_parts_mut);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before final merge: remove this?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, will remove this soon!

let (lower_rows, _) = iterator.size_hint();
cols = row.row.cols();

mat_data = Vec::with_capacity(lower_rows.saturating_add(1).saturating_mul(cols));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why saturating arithmetic here? I'm a bit confused about this part.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good question. I'm not sure what my initial motivation was but there is no real reason.

}

for row in iterator {
assert!(row.row.cols() == cols, "Iterator slice length must be constant.");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message could perhaps be a bit clearer for the user. For example, "Iterator of rows must have rows of compatible dimensions." or so?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you're right. I overlooked this when I copied the existing FromIterator implementation. Also - do you think we should remove the FromIterator<&'a [T]> implementation? I think we probably should.

@@ -105,7 +105,10 @@ impl<'a, T> Iterator for $rows<'a, T> {
unsafe {
// Get pointer to last row and create a slice from raw parts
let ptr = self.slice_start.offset((self.slice_rows - 1) as isize * self.row_stride);
Some(slice::$slice_from_parts(ptr, self.slice_cols))
//Some(slice::$slice_from_parts(ptr, self.slice_cols))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before final merge: Remove this?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noting that I kept this here to make it a little easier to benchmark later. Will remove later.

/// use rulinalg::matrix::BaseMatrixMut;
///
/// let mut mat = matrix![1.0, 2.0;
/// 3.0, 4.0];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor detail: is the alignment off also in the rendered docs?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out. It probably is.

/// 3.0, 4.0];
///
/// {
/// let mut row = mat.row_mut(1).unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove the Option alltogether and just panic when index is out of bounds instead?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I had forgotten that we discussed this before and I think this would be better. I can't imagine a case where a user would benefit from having an Option.

@AtheMathmo
Copy link
Owner Author

I think I answered all your questions except for:

I've looked really hard, but I can't find any changes to fn row(index) on BaseMatrix anywhere, yet you seem to have changed this, according to the examples... Am I blind here?

This is because this function remained unchanged (except for the name and return type). The change was in row_unchecked which is called within this function.

@Andlon
Copy link
Collaborator

Andlon commented Dec 4, 2016

I just realized that GitHub hadn't shown me all the changes! I had to press "Load Diff" at the bottom of the screen. Never experienced that before... Will take a look at the rest too

Copy link
Collaborator

@Andlon Andlon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I looked at the rest too. Not much more to add to my previous review :)

/// # fn main() {
/// use rulinalg::matrix::{Matrix, BaseMatrix};
///
/// let mat = Matrix::new(3,3, (0..9).collect::<Vec<usize>>());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is practical, but I personally find it easier to immediately understand examples if the example matrices are actually typed out when they are so small. It leaves the user to focus on the logic that's being showcased rather than having to parse the matrix construction logic. Not a big deal though!

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I agree, I'll make this change too.

} else {
None
}
}

/// Returns a mutable reference to the row of a matrix at the given index
/// without doing unbounds checking
/// without doing a bounds check.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, glad you got this one too. Let's remember to close #85 when you're done here ;)

@AtheMathmo
Copy link
Owner Author

Benchcmp output:

 name                                           old ns/iter  new ns/iter  diff ns/iter  diff % 
 linalg::iter::empty                            0            0                       0    NaN% 
 linalg::iter::mat_diag_iter_100_500            220          221                     1   0.45% 
 linalg::iter::mat_diag_iter_10_50              73           76                      3   4.11% 
 linalg::iter::mat_diag_manual_100_500          304          303                    -1  -0.33% 
 linalg::iter::mat_diag_manual_10_50            66           66                      0   0.00% 
 linalg::matrix::empty                          0            0                       0    NaN% 
 linalg::matrix::mat_create_100_100             3,408        3,402                  -6  -0.18% 
 linalg::matrix::mat_create_add_100_100         10,122       10,168                 46   0.45% 
 linalg::matrix::mat_elediv_63_1000             83,484       83,765                281   0.34% 
 linalg::matrix::mat_elemul_63_1000             35,947       36,144                197   0.55% 
 linalg::matrix::mat_mat_elemul_63_1000         35,961       36,155                194   0.54% 
 linalg::matrix::mat_mul_10_10                  799          803                     4   0.50% 
 linalg::matrix::mat_mul_128_100                252,639      253,254               615   0.24% 
 linalg::matrix::mat_mul_128_1000               2,391,344    2,406,185          14,841   0.62% 
 linalg::matrix::mat_ref_add_100_100            8,564        8,520                 -44  -0.51% 
 linalg::matrix::mat_sum_rows_and_cols_128_100  12,245       12,286                 41   0.33% 
 linalg::matrix::mat_swap_cols_0_99             75           76                      1   1.33% 
 linalg::matrix::mat_swap_rows_0_99             18           18                      0   0.00% 
 linalg::svd::svd_100_100                       74,318,932   82,978,883      8,659,951  11.65% 
 linalg::svd::svd_10_10                         171,847      174,383             2,536   1.48%

The only non-noisy changes are SVD and diag iterator which should be unchanged with this. I'm not sure why they were so significantly affected but I'm not too concerned.

@Andlon
Copy link
Collaborator

Andlon commented Dec 8, 2016

Looks great to me :)

I've noticed that the output of benchmarks can change significantly unless I completely shut down all other programs running in the background (despite running on a quadcore i7 with low load). Guess it's due to the Linux scheduler and/or memory access patterns. Might account for the noise in the SVD benchmarks.

@AtheMathmo
Copy link
Owner Author

I've noticed that the output of benchmarks can change significantly unless I completely shut down all other programs running in the background (despite running on a quadcore i7 with low load).

I see similar things myself which is why I wasn't too concerned. I'll do some vectorization tests and if those look good then hopefully one more look through should make this merge-ready!

@AtheMathmo
Copy link
Owner Author

I did a little digging into vectorization - for the most part it looks good to me. All of the operations break down into per-row operations. We may incur some cost for creating the row iterator unnecessarily (instead of the user iterating through the slice) but I don't think it will be significant.

Otherwise I can't find any cases where treating the row as a matrix instead of a &[T] would break vectorization. It's likely I've missed something but given the gain in utility here I'm happy to tick that box too.

@AtheMathmo
Copy link
Owner Author

Ok I think I'm happy with this PR. Here are my thoughts on some outstanding things:

  • impl_ops and slice module are very bloated. I'd like to fix this in a future PR.
  • IntoIterator for Row and Column. I've decided against this as we get the expected behaviour by calling iter(). Additionally there are few cases where we can consume the row (for example we cannot in iter_rows()).
  • Performance specializations for Row and Col. I'm not sure if anything is needed here really - it would help strictly but would have some code bloat so I think we should wait until this is an issue.

I'd love to hear from @Andlon , @andrewcsmith and anyone else to see if they spot anything else.

Copy link
Collaborator

@Andlon Andlon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! (just slightly confused by a TODO that didn't make sense to me, assume it is out-of-date?)

@@ -326,7 +326,8 @@ mod tests {
assert!(!row.iter().take(idx).any(|&x| x > 1e-10));
assert!(!row.iter().skip(idx + 1).any(|&x| x > 1e-10));
// Assert non-negativity of diagonal elements
assert!(row.raw_slice()[idx] >= 0.0);
// TODO: Fix when indexing fixed
assert!(row[idx] >= 0.0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the TODO is out of date, and that this is actually fixed now?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes! Forgot to remove it.. thanks

@Andlon
Copy link
Collaborator

Andlon commented Dec 9, 2016

impl_ops and slice module are very bloated. I'd like to fix this in a future PR.

Good idea! I've also had some unpleasant run-ins with those modules.

IntoIterator for Row and Column. I've decided against this as we get the expected behaviour by calling iter(). Additionally there are few cases where we can consume the row (for example we cannot in iter_rows()).

Makes sense.

Performance specializations for Row and Col. I'm not sure if anything is needed here really - it would help strictly but would have some code bloat so I think we should wait until this is an issue.

Agree with your assessment. There are more important things (and possibly more interesting?) for us to spend time on, I think :)

@AtheMathmo
Copy link
Owner Author

I'm going to give this 12 hours in case @andrewcsmith wants to take a look then I'll be looking to merge. As there are breaking changes here I'm going to try to pick out the issues we should get in as part of 0.4 release. Ideally we wont have to wait too long before our next release.

@AtheMathmo
Copy link
Owner Author

Merging now - mostly to force myself to finish up the other 0.4 issues!

@AtheMathmo AtheMathmo merged commit 6984a59 into master Dec 11, 2016
@AtheMathmo AtheMathmo deleted the rows-are-slices branch December 17, 2016 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Typo in get_row_unchecked docs
2 participants