rust: add short style guide #4399

Merged · 3 commits · Dec 2, 2020
77 changes: 77 additions & 0 deletions tensorboard/data/server/DEVELOPMENT.md
@@ -135,3 +135,80 @@ You should know:
[`grpc_cli`]: /~https://github.com/grpc/grpc/blob/master/doc/command_line_tool.md
[grpc/grpc#24734]: /~https://github.com/grpc/grpc/issues/24374
[reflection]: /~https://github.com/hyperium/tonic/pull/340

## Style

Google has no official Rust style guide. In lieu of an exhaustive list of rules,
we offer the following advice:

- There is an official [Rust API Guidelines Checklist][cklist]. (Don’t miss
all the prose behind the items, and see also the “External links”
reference.) This can sometimes be helpful to answer questions like, “how
should I name this method?” or “should I validate this invariant, and, if
so, how?”. The standard library and toolchains are also great references for
questions of documentation, interfaces, or implementation: you can always
just “go see what `String` does”.

These sources provide _guidelines_. Our code can probably afford to be less
completely documented than the standard library.

- Listen to Clippy and bias toward taking its advice. It’s okay to disable
lints by using `#[allow(clippy::some_lint_name)]`: preferably on just a
single item (rather than a module), and preferably with a brief comment.
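For instance, a narrowly scoped suppression with a brief justification might look like this (the function and the choice of lint are hypothetical, for illustration only):

```rust
// Hypothetical example: the suppression is attached to one function, with a
// short note, so the lint stays active for the rest of the module.
#[allow(clippy::needless_range_loop)] // index mirrors the spec's pseudocode
fn weighted_sum(xs: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..xs.len() {
        // Weight each element by its 1-based position.
        total += (i as u64 + 1) * xs[i];
    }
    total
}
```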

- Standard correctness guidelines: prefer making it structurally impossible to
panic (avoid `unwrap`/`expect` when possible); use newtypes and visibility
to enforce invariants where appropriate; do not use `unsafe` unless you have
a good reason that you are prepared to justify; etc.
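As a sketch of the newtype-and-visibility idea (names hypothetical): validate an invariant once at the construction boundary and return `Result`, rather than risking a panic later:

```rust
// Hypothetical sketch: the tuple field is private, so code outside this
// module can only obtain an `ExperimentId` through `new`, which enforces
// the non-empty invariant up front instead of `unwrap`ping downstream.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ExperimentId(String);

impl ExperimentId {
    /// Validates once at the boundary; callers handle the `Err` case.
    pub fn new(raw: &str) -> Result<Self, &'static str> {
        if raw.is_empty() {
            Err("experiment ID must be non-empty")
        } else {
            Ok(ExperimentId(raw.to_string()))
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```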

- Write Rust code that will still compile if backward-compatible changes are
made to protobuf definitions whose source of truth is not in this repo
(i.e., those updated by `tensorboard/compat/proto/update.sh`). This means:

- When you construct a protobuf message, spread in `Default::default` even
@nfelt (Contributor), Nov 30, 2020:

I'm a little lukewarm about this rule, in particular when it comes to protobuf definitions that we clearly control ourselves (like the ones in `tensorboard.data`; and possibly, in the future, the event/summary protos, if we were to explicitly fork them and stop mirroring TF). Had responded on the thread, ICYMI: #4385 (comment)

I realize that it's a bit of a hassle to have to go and update the code everywhere if we decide to change the proto, but that work isn't any different from if we'd made the proto definition a Rust struct natively and had decided to add a field, and yet we don't write all our struct literals with a trailing `..Default::default()` in case they're expanded in the future. I think this is more or less the same argument as described in https://abseil.io/tips/147 about exhaustive switch cases in C++; it seems okay to me to say that we will sign up for either 1 or 2 (3 wouldn't apply unless we were to include the mirrored TF protobufs in this as well).

I'd be more sympathetic here if we could at least make this a compile-time check or a linter check (or if Prost generated a proto-creation API that was future-proof, like a withFoo() builder pattern, that we could use instead), but if we can't, it seems like there's a decent chance eventually at least one case will slip through, and since we can't prevent that possibility we'll want to be aware of the possibility that a protobuf schema addition will break the Rust build even if we try to adhere to this rule.

Would be curious to hear thoughts from others and whether you think there's a particularly ironclad reason we should strive to provide protobuf changes with this kind of guarantee. (FWIW, the only backwards-compatibility guarantee I'm aware of with protobuf is at the wire level; I am not aware of any general guarantees about backwards-compatibility for generated code when the definition changes, though I'll concede that in other generated code I've worked with, at least adding a field has always been backwards-compatible.)

@wchargin (Contributor, Author), Dec 1, 2020:

Ah, I had indeed missed that; thanks for surfacing it!

It’s fine with me to waive this suggestion for protos that we clearly
control ourselves: i.e., those whose source of truth is in this repo.

My main goal with this suggestion was that running update.sh to pull
in new TensorFlow protos should be a safe operation, because (a) it may
be run by people who aren’t familiar with this code, and (b) it tends to
happen around release time, so we would prefer not to rush any changes.
That’s why I see it differently than changing a normal Rust struct: in
that case, you’re deliberately making a change to the Rust code rather
than just making what should be an automatable change.

> it seems like there's a decent chance eventually at least one case will slip through, and since we can't prevent that possibility we'll want to be aware of the possibility

Agreed, hence my tips for handling such a situation in this document.

> [from link] The other alternative that came to mind would be to initialize structs to default-empty, then mutate to populate fields; needing mutation is a downside although I think in some of the more convoluted multi-layer struct cases this may be significantly more concise.

I don’t mind the mutation, but the problem that you run into here is the
loss of type information:

        let req = Request::new(data::ReadScalarsRequest {
            experiment_id: "123".to_string(),
            plugin_filter: Some(data::PluginFilter {
                plugin_name: "scalars".to_string(),
                ..Default::default()
            }),
            downsample: Some(data::Downsample {
                num_points: 1000,
                ..Default::default()
            }),
            ..Default::default()
        });

        let mut req = data::ReadScalarsRequest::default();
        req.experiment_id = "123".to_string();
        req.plugin_filter = Some(data::PluginFilter::default());
        req.plugin_filter.as_mut().unwrap().plugin_name = "scalars".to_string();
        //               ^^^^^^^^^^^^^^^^^^
        req.downsample = Some(data::Downsample::default());
        req.downsample.as_mut().unwrap().num_points = 1000;
        //            ^^^^^^^^^^^^^^^^^^
        let req = Request::new(req);

…which gets worse with more structure (deeper nesting, repeated fields).
I guess you can kind of roll your own nested builders thanks to Rust’s
expression blocks:

        let req = Request::new({
            let mut req = data::ReadScalarsRequest::default();
            req.experiment_id = "123".to_string();
            req.plugin_filter = Some({
                let mut pf = data::PluginFilter::default();
                pf.plugin_name = "scalars".to_string();
                pf
            });
            req.downsample = Some({
                let mut ds = data::Downsample::default();
                ds.num_points = 1000;
                ds
            });
            req
        });

…and while this does literally do away with the `..Default::default()` spreads, it doesn’t really seem to satisfy the spirit of your objection: that is, it’s still a bit noisy.

$0.02: I personally don’t find the `..Default::default()` spam that bad, but I accept that reasonable people may, and if there were shorter sugar then I would happily use it.

We could macro up some builders:

        let req = Request::new(proto!(data::ReadScalarsRequest {
            experiment_id: "123".to_string(),
            plugin_filter: proto!(data::PluginFilter {
                plugin_name: "scalars".to_string(),
            }),
            downsample: proto!(data::Downsample {
                num_points: 1000,
            }),
        }));

…and while I won’t run out and do this unilaterally, I also wouldn’t mind it
if it would make you happier and if it doesn’t break type-directed
autocompletion (which I think I can wrangle).

> I am not aware of any general guarantees about backwards-compatibility for generated code when the definition changes

You kind of implicitly have this on any proto_library with public
visibility in Google’s monorepo, though, right? I agree that they don’t
apply to all wire-compatible changes (renames, most obviously), but my
impression has been that people feel free to add fields.

Also, https://google.aip.dev/180#adding-components says that adding
“components” (incl. fields) is permitted within the same major version,
which implies that they are source-compatible, since “[code] written
against a previous version must compile against a newer version”.

What do you think about keeping the recommendation just for protos whose
source of truth is not in this repo? i.e., those not affected by the
update.sh script?

@wchargin (Contributor, Author):

I tried the macro thing for 15 minutes. A bit harder than I thought. Not
pursuing further at this time.

@nfelt (Contributor):

It's fine by me as you have it, and thanks for the pointer to https://google.aip.dev/180#adding-components - that does indeed formalize this rule. That being said, I think our problem is a bit more fundamental than one that could be patched up by an optional convention (whether macro-assisted or not): Prost-generated code, in combination with Rust struct literals having to be exhaustive, just fundamentally isn't compatible with that guarantee, since it's trivially possible to write code against the generated API that breaks when you add a field.

I think we would need rust-lang/rfcs#1806 (see also related rust-lang/rfcs#1458) to fix this, but it's been postponed :( Either that or some way that Prost structs could disallow use struct literal initialization entirely in favor of forcing you through some sort of builder API, but I'm guessing that eliminates a lot of the "idiomatic Rust flavor" that Prost was trying to offer in the first place.

@wchargin (Contributor, Author):

Okay; thanks! Agreed, I think; builders are fairly common in Rust
(stdlib examples: `Command`, `thread::Builder`, `DirBuilder`,
probably more), though certainly struct initialization is simpler.

I’ve opened a Prost feature request for builders. If maintainers
seem open to it, maybe I’ll take a stab at implementation.

if you populate all the fields, in case more fields are added:

```rust
let plugin_data = pb::summary_metadata::PluginData {
    plugin_name: String::from("scalars"),
    content: Vec::new(),
    ..Default::default()
};
```

This [contradicts `clippy::needless_update`][clippy-nu], so we disable
that lint at the crate level.

- When you consume a protobuf enum, include a default case:

```rust
let class_name = match data_class {
    DataClass::Scalar => "scalar",
    DataClass::Tensor => "tensor",
    DataClass::BlobSequence => "blob sequence",
    _ => "unknown", // matching against `_`, not `DataClass::Unknown`
};
```

This way, whoever is updating our copies of the protobuf definitions doesn’t
also have to context-switch into updating Rust code. This rule doesn’t
apply to `tensorboard.data` protos, since we own those.

We don’t have a compile-time check for this, so just try to be careful.

If you’re reading this because a protobuf change _has_ caused Rust failures,
here are some tips for fixing them:

- For struct initializers: add `..Default::default()` (no trailing comma),
as in the example above.
- For `match` expressions: add a `_ => unimplemented!()` branch, as in the
example above, which will panic at runtime if it’s hit.
- For other contexts: if there’s not an obvious fix, run `git blame` and
ask the original author, or ask another Rust developer.

[cklist]: https://rust-lang.github.io/api-guidelines/checklist.html
[clippy-nu]: /~https://github.com/rust-lang/rust-clippy/issues/6323

When in doubt, you can always ask a colleague or the internet. Stack Overflow
responses on Rust tend to be both fast and helpful, especially for carefully
posed questions.