Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Examples #54

Draft
wants to merge 30 commits into
base: master
Choose a base branch
from
Draft

Examples #54

wants to merge 30 commits into from

Conversation

ckaran
Copy link
Collaborator

@ckaran ckaran commented Feb 5, 2021

@RustyYato I had a few minutes so I continued working on example code. It all of a sudden hit me that for intrusive collections to work right, the nodes must be pinned, so I added a PhantomPinned field to the structs. The issue is the drop code for the RedBlackTreeNode; I'm not sure if I'm doing things right in there. Would you be willing to take a quick look when you get a chance? Thanks in advance!

ckaran and others added 29 commits December 23, 2020 10:11
This crate is very much a work in progress.  My goal is to build a
series of examples showing off how the 'generic-field-projection'
crate can be used to compose highly complex data structures from much
simpler ones, and to do so in an efficient and safe manner.

Because some of this is *highly* experimental, it **should NOT** be
merged into master until significant testing and thinking about it
have been done.
…xamples crate.

This is a first draft of what I want to say in the examples crate.  As I
gain experience in using the crate, I'll have to update it.
fix links
Fix(docs): Fix weird formatting in example docs
…ME.md`

No new information, just a lot of cleanup and a little wordsmithing
here and there.
I made a mistake and put the modules in the root of the crate instead of
the 'src' directory... 😅 Now fixed!
I made a mistake and put the modules in the root of the crate instead of
the 'src' directory... 😅 Now fixed!
I used an illegal name for a module ('01_doubly_linked_list.rs') and
didn't catch it because I forgot to update main to actually compile
the code properly.  Both errors are now fixed.
…estion.

Instead of implementing the Iterator trait on SinglyLinkedListNode directly,
I created a new type (SinglyLinkedListNodeIterator) that contains the iterator's
state.  Instances of SinglyLinkedListNodeIterator can only be created using the
unsafe method SinglyLinkedListNode::iter(), which should warn users that iteration
is unsafe.
* feature: offset-based field (#35)

* initial implementation of offset-based field
* expanded on docs
* replace `Field::Name` with `Field::range`
    * This allows `Dynamic` to be `Copy` and makes it easier to implement `Field`

* fix: Moved modules into the 'src' directory.

I made a mistake and put the modules in the root of the crate instead of
the 'src' directory... 😅 Now fixed!

* fix overlap detection

previously there were cases where overlap checks didn't work
in particular if one field completely covered another

* Refactor: Extract from `TypeSet` to a new crate `typsy` (#36)

* remove TypeSet
* extract TypeSet's core utilities to a new crate
* Add a link to #39 in  `Field::range` docs
* added tests for overlap
    * fixed zst handling in overlap

* Remove support for DST types and fields (#43)

Fixes #39

* remove support for DST types and fields
* remove duplicate `Invariant` type
remove public constructor for `Invariant` type, and replace with `const`
this reduces the number of things the `derive` macro needs to know about
* add a `pin_dynamic` method to `PinToPin`
provides a safe interface to get a dynamic field.
* Make `PinToPin::field` public (no reason to be private)
* Fix `impl Field for Dynamic`

* add unchecked projections trait

* add tests for nonnull and option

* removed `ProejctTo::Project: Deref`
added `impl ProejctTo for Option`

* add the `Identity` field

* fix: Used an illegal name for a module.

I used an illegal name for a module ('01_doubly_linked_list.rs') and
didn't catch it because I forgot to update main to actually compile
the code properly.  Both errors are now fixed.

Co-authored-by: Yato <rustyyato@gmail.com>
Co-authored-by: RustyYato <krishna.sd.2012@gmail.com>
I discovered after updating these docs that I need to merge in the
master branch before continuing as large amounts of stuff that should
be able to be compiled, can't be.
This is just the basic implementation that you find on wikipedia.
SinglyLinkedListNode instances contain raw pointers.  If the instance is moved,
that will invalidate the contents of the pointer, which CANNOT be permitted to
happen.  Adding a PhantomPinned member ensures that the struct is not UnPin, which
prevents it from being moved by the compiler.
This is incomplete, but shows the direction I'm heading in.  Significant
work remains.
@ckaran ckaran requested a review from RustyYato February 5, 2021 22:34
Copy link
Owner

@RustyYato RustyYato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pin doesn't really matter here, because this only exposes an unsafe API. Pin's entire purpose is to allow you to expose a safe API. However keeping PhantomPinned is a good default, because the nodes will require Pin to expose a safe API.

#[derive(Debug, Clone, Default, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct RedBlackTreeNode<T>
where
T: Debug + Clone + Default + PartialEq + Eq + PartialOrd + Ord + Hash,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. It's generally a bad idea to put bounds on type definitions, it makes them harder to use for little benefit.
  2. Ord implies the other comparison traits, so putting T: Ord also implies T: PartialEq + Eq + PartialOrd

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree that it's generally a bad idea, but there's a reason for it here; if the containing object wants to derive those methods as suggested by the Rust API Guidelines, then T has to implement them as well. Otherwise anyone that wants to use it would have to manually implement the traits...

I'll change to just using Ord though, that will definitely be more compact.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and the derive macro will handle that case. You don't need to restrict RedBlackTreeNode to just the types that are Debug + Clone + Ord + Hash.

For example, HashMap doesn't put any trait bounds on the type itself, only on the impl blocks that need them. i.e. HashMap::insert requires Hash + Eq but HashMap::new doesn't. This allows you to use HashMap more flexibly. (For example, the keys don't need Hash + Eq at all, so long as you provide a way to hash/compare it from outside using the nightly raw_entry API).

There is a trend in the std towards moving trait bounds as close to the actual use site as possible, and I tend to agree with this trend. Of course, there are reasons to put trait bounds in the type (for example, when that trait bound informs the layout of the type), but these cases are rare and I don't think they apply here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! OK, I'm logging another issue, and will think about how to move the trait bounds around.

That said, I think I will keep Debug as a bounds on the keys, not because they are required in general, but because these examples are intended as a learning tool, and Debug helps with debugging code. Hmmmm... that actually gives me an idea, I'll try to figure out how to pretty-print the debug output of trees.

Comment on lines +174 to +181
/// Since we're using raw pointers, we **cannot** allow these instances to
/// be moved by the compiler (dropping is OK, but moving will almost
/// certainly cause UB). For more information on why this is necessary,
/// start reading at
/// https://doc.rust-lang.org/std/pin/index.html#example-self-referential-struct
/// and be extra careful to read all the way through the explanations about
/// drop!
_pin: PhantomPinned,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since intrusive collections can't provide a safe interface, this doesn't matter as much here. But Putting PhantomPinned is a nice default.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never been clear on rust's move semantics. I keep worrying that the compiler will be able to copy an instance from one location to another, invalidating some raw pointers. Since dereferencing a raw pointer is unsafe, the compiler throws the onus of preventing this on the programmer. I'm just trying to prevent potential errors... but I honestly don't know if it's necessary here!

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, Pin doesn't actually have any special meaning to the compiler, so it's not going to change anything the compiler does 😄. It's a pure library construct that uses clever unsafe to prevent access to a &mut T (where T: !Unpin) with safe code. Nothing special. Rust will never move anything behind your back, but it is rather easy to move things so your concerns are well-founded.

I would stick the containing node into a box, then the only way to move it would be to dereference the box. (But this has other concerns, we may need an AliasableBox because currently Box behaves very similarly to &mut T, which we can't allow for intrusive collections).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, Pin doesn't actually have any special meaning to the compiler, so it's not going to change anything the compiler does 😄. It's a pure library construct that uses clever unsafe to prevent access to a &mut T (where T: !Unpin) with safe code. Nothing special. Rust will never move anything behind your back, but it is rather easy to move things so your concerns are well-founded.

Good to know, but I keep having these nightmarish visions of the stack getting reordered behind my back, or closures doing something weird... I know I shouldn't, but move just as a word terrifies me! 😨! Honestly, I know that it's 100% impossible in rust, I just can't shake the feeling that my raw pointers are going to stop pointing at what I think they're point to because something got moved that shouldn't...

I would stick the containing node into a box, then the only way to move it would be to dereference the box. (But this has other concerns, we may need an AliasableBox because currently Box behaves very similarly to &mut T, which we can't allow for intrusive collections).

You're right, but I'm going to defer working on that until after I have a complete red-black tree done, with tests in place.

#[derive(Debug, Clone, Default, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct RedBlackTreeNodeIterator<T>
where
T: Debug + Clone + Default + PartialEq + Eq + PartialOrd + Ord + Hash,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above, let's not add unnecessary bounds

/// itself.
pub unsafe fn insert(&mut self, previous: *mut SinglyLinkedListNode) {
self.next = (*previous).next;
let me: *const SinglyLinkedListNode = self;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't use &mut self here, because SinglyLinkedListNode is intrinsically shared. (By the node it's in, and by the previous node). So you have to use &self or raw pointers. I think you can't use &self because that would prevent going from self.next -> node, but check with MIRI after you complete your implementation.

I made some notes about this in my previous review: #47 (comment)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thank you!

/// This method is unsafe because we're modifying a raw pointer whose
/// provenance we know nothing about. If it is a bad pointer, then this will
/// lead to undefined behavior.
pub unsafe fn extract(&mut self, previous: *mut SinglyLinkedListNode) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, you need &self or raw pointers instead of &mut self

@RustyYato
Copy link
Owner

RustyYato commented Feb 6, 2021

I haven't reviewed all of this PR yet, just some of my initial thoughts.
rustfmt is still broken, so as long as it's fine locally (or if it get's fixed by the time this PR get's merged), it should be fine.

Copy link
Owner

@RustyYato RustyYato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed that you wanted me to comment specifically on the Drop impl 🙃, it looks good to me, but there's a lot of subtle invariants here that need to be looked at more closely once you have a more complete implementation.

Comment on lines +221 to +225
if let Some(parent) = (*neighbor).parent {
if parent == self_pointer {
(*neighbor).parent = None;
}
}
Copy link
Owner

@RustyYato RustyYato Feb 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using if let followed by an if, you can d

Suggested change
if let Some(parent) = (*neighbor).parent {
if parent == self_pointer {
(*neighbor).parent = None;
}
}
if Some(self_pointer) == (*neighbor).parent {
(*neighbor).parent = None;
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, now that I'm looking at this code, I can see that it's quite, quite broken. The reason for the guards was because fixup_neighbor() was called on all of self's neighbors, and that fixup_neighbor() had to figure out which neighbor was being fixed up. I'll rewrite it.

/// need to use `unsafe` block below. In addition, I'm casting away the
/// mutability/immutability constraints, which can trick the compiler into
/// doing very weird things.
fn drop(&mut self) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's difficult to tell if this unsafe code is actually correct, once you have a more complete implementation we'll need to run it through MIRI with some tests to give some basic assurances. Currently, MIRI can't catch aliasing violations if you stick to raw pointers, so it won't be a perfect test, but it should get us started.

Keys are required to have certain trait bounds.  In earlier code I
spelled out all of the bounds on every usage of the key.  That was
inefficient, so I've now defined KeyTrait which gives me one place
to deal with it all.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants