Commit
Showing 6 changed files with 178 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# Metadata <img align="right" width="73" height="45" src="https://raw.githubusercontent.com/tskit-dev/administrative/main/logos/svg/tskit-rust/Tskit_rust_logo.eps.svg"> | ||
|
||
Tables may contain additional information about rows that is not part of the data model. | ||
This metadata is optional. | ||
Tables are not required to have metadata. | ||
Tables with metadata do not require that every row has metadata. | ||
|
||
The next sections showcase the metadata API. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# Defining metadata types in Rust | ||
|
||
A key feature of the API is that metadata is specified on a per-table basis. | ||
For example, a type to be used as node metadata must implement the `tskit::metadata::NodeMetadata` trait. | ||
|
||
Using the `tskit` cargo feature `derive`, we can use procedural macros to define metadata types. | ||
Here, we define a metadata type for a mutation table: | ||
|
||
```rust, noplayground, ignore | ||
{{#include ../../tests/book_metadata.rs:metadata_derive}} | ||
``` | ||
|
||
We require that you also manually specify the `serde` derive macros because the metadata API | ||
itself does not depend on `serde`. | ||
Rather, it expects raw bytes and `serde` happens to be a good way to get them from your data types. | ||
|
||
The derive macro also enforces some helpful behavior at compile time. | ||
You will get a compile-time error if you try to derive two different metadata types for the same Rust type. | ||
The error is due to conflicting implementations for a [supertrait](https://doc.rust-lang.org/rust-by-example/trait/supertraits.html). |
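
The per-table design and the supertrait conflict can be illustrated with a minimal, standard-library-only sketch. Every name below (`MetadataRoundtrip`, `NodeMetadata`, `MutationMetadata`, `MyMutation`, `roundtrip_mutation_metadata`) is a hypothetical stand-in defined in the example itself and is not the `tskit` API. The hand-written little-endian encoding takes the place of `serde` purely for illustration.

```rust
/// Hypothetical stand-in for the shared supertrait: convert a value
/// to raw bytes and back. This is NOT the real `tskit` trait.
trait MetadataRoundtrip: Sized {
    fn encode(&self) -> Vec<u8>;
    fn decode(bytes: &[u8]) -> Option<Self>;
}

/// Hypothetical per-table marker traits. Each one only tags a type as
/// usable with one table; the byte handling lives in the supertrait.
#[allow(dead_code)] // unused in this sketch, shown for contrast
trait NodeMetadata: MetadataRoundtrip {}
trait MutationMetadata: MetadataRoundtrip {}

#[derive(Debug, PartialEq)]
struct MyMutation {
    effect_size: f64,
    dominance: f64,
}

// Hand-written byte encoding: two little-endian f64 values.
impl MetadataRoundtrip for MyMutation {
    fn encode(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(16);
        out.extend_from_slice(&self.effect_size.to_le_bytes());
        out.extend_from_slice(&self.dominance.to_le_bytes());
        out
    }

    fn decode(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != 16 {
            return None;
        }
        let mut a = [0u8; 8];
        let mut b = [0u8; 8];
        a.copy_from_slice(&bytes[..8]);
        b.copy_from_slice(&bytes[8..]);
        Some(MyMutation {
            effect_size: f64::from_le_bytes(a),
            dominance: f64::from_le_bytes(b),
        })
    }
}

impl MutationMetadata for MyMutation {}

// If a macro generated the supertrait impl once per derived marker
// trait, deriving a second marker for `MyMutation` would emit a second
// `impl MetadataRoundtrip for MyMutation`, which the compiler rejects
// as conflicting implementations (E0119).

/// Accepts only types tagged as mutation metadata.
fn roundtrip_mutation_metadata<M: MutationMetadata>(md: &M) -> Option<M> {
    M::decode(&md.encode())
}
```

Passing a type that is tagged only as `NodeMetadata` to `roundtrip_mutation_metadata` would fail to compile, which is the kind of per-table check the marker traits make possible.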
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# Metadata schema | ||
|
||
For useful data interchange with `tskit-python`, we need to define [metadata schema](https://tskit.dev/tskit/docs/stable/metadata.html). | ||
|
||
Several open issues are currently slowing down a Rust API for schemas: | ||
|
||
* It is not clear which `serde` formats are compatible with metadata on the Python side. | ||
* Experiments have shown that `serde_json` works with `tskit-python`. | ||
* Ideally, we would also like a binary format compatible with the Python `struct` | ||
module. | ||
* However, we have not found a solution eliminating the need to manually write the | ||
schema as a string and add it to the tables. | ||
Various crates that generate JSON schema from Rust structs return schemas that are over-specified | ||
and fail to validate in `tskit-python`. | ||
* We also have the problem that we will need to add some Python to our CI to prove to ourselves | ||
that some reasonable tests can pass. | ||
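
To make "manually write the schema as a string" concrete, here is a minimal sketch using only the standard library. It assumes the JSON codec form described in the `tskit` metadata documentation (a JSON schema object with a `"codec": "json"` key). The names `MUTATION_METADATA_SCHEMA`, `MutationEffect`, and `to_json_bytes` are hypothetical. Whether this exact schema validates against the bytes in `tskit-python` has not been checked here, and how the string is attached to a table is left out because the text above does not describe it.

```rust
/// Hand-written schema for a record with two numeric fields.
/// Assumption: the `"codec": "json"` form from the tskit metadata docs.
const MUTATION_METADATA_SCHEMA: &str = r#"{
  "codec": "json",
  "type": "object",
  "properties": {
    "effect_size": {"type": "number"},
    "dominance": {"type": "number"}
  }
}"#;

/// Hypothetical metadata record mirroring the fields in the schema.
struct MutationEffect {
    effect_size: f64,
    dominance: f64,
}

/// Hand-rolled JSON encoding, for illustration only. Real code would
/// more likely use `serde_json`. Non-finite values (NaN, infinity) are
/// not valid JSON and are not handled here.
fn to_json_bytes(md: &MutationEffect) -> Vec<u8> {
    // `{:?}` keeps the decimal point, so 1.0 is written as `1.0`.
    format!(
        "{{\"effect_size\":{:?},\"dominance\":{:?}}}",
        md.effect_size, md.dominance
    )
    .into_bytes()
}
```

The point of the sketch is that the schema and the encoder are maintained by hand and must be kept in agreement, which is the maintenance burden the list above describes.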
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
# Metadata and tables | ||
|
||
Let us create a table and add a row with our mutation metadata: | ||
|
||
```rust, noplayground, ignore | ||
{{#include ../../tests/book_metadata.rs:add_mutation_table_row_with_metadata}} | ||
``` | ||
|
||
Metadata is optional on a per-row basis: | ||
|
||
```rust, noplayground, ignore | ||
{{#include ../../tests/book_metadata.rs:add_mutation_table_row_without_metadata}} | ||
``` | ||
|
||
We can confirm that we have one row with, and one without, metadata: | ||
|
||
```rust, noplayground, ignore | ||
{{#include ../../tests/book_metadata.rs:validate_metadata_row_contents}} | ||
``` | ||
|
||
Fetching our metadata from the table requires specifying the metadata type. | ||
The result of a metadata retrieval is `Option<Result<T, TskitError>>`, where `T` is the requested metadata type. | ||
The `None` variant occurs if a row does not have metadata or if a row id does not exist. | ||
The error state occurs if decoding raw bytes into the metadata type fails. | ||
The details of the error variant are [here](https://docs.rs/tskit/latest/tskit/error/enum.TskitError.html#variant.MetadataError). | ||
The error variant holds `Box<dyn Error>` because the API is very general. | ||
We assume nothing about the API used to encode/decode metadata, | ||
so the underlying error could be anything. | ||
|
||
```rust, noplayground, ignore | ||
{{#include ../../tests/book_metadata.rs:metadata_retrieval}} | ||
``` | ||
|
||
```rust, noplayground, ignore | ||
{{#include ../../tests/book_metadata.rs:metadata_retrieval_none}} | ||
``` |
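
The three possible outcomes of a retrieval, including the decoding-error case that the test above does not exercise, can be sketched with the standard library alone. `MetadataColumn`, its `metadata` method, and `decode_f64` are hypothetical stand-ins, not `tskit` types. The sketch returns `Box<dyn Error>` directly rather than wrapping it in `TskitError`.

```rust
use std::error::Error;

/// Hypothetical stand-in for a table's metadata column:
/// one optional byte blob per row.
struct MetadataColumn {
    rows: Vec<Option<Vec<u8>>>,
}

impl MetadataColumn {
    /// Returns `None` if the row does not exist or has no metadata,
    /// `Some(Err(_))` if bytes are present but decoding fails,
    /// and `Some(Ok(value))` otherwise.
    fn metadata<T, F>(&self, row: usize, decode: F) -> Option<Result<T, Box<dyn Error>>>
    where
        F: Fn(&[u8]) -> Result<T, Box<dyn Error>>,
    {
        // First `?`: row index out of range. Second `?`: no metadata.
        let bytes = self.rows.get(row)?.as_ref()?;
        Some(decode(bytes.as_slice()))
    }
}

/// Example decoder: the bytes are UTF-8 text holding one float.
/// Two unrelated error types can occur, which is why the boxed
/// trait object is convenient.
fn decode_f64(bytes: &[u8]) -> Result<f64, Box<dyn Error>> {
    let text = std::str::from_utf8(bytes)?;
    Ok(text.trim().parse::<f64>()?)
}
```

Because the decoder is supplied by the caller, the column type cannot know in advance which concrete errors may occur, so it can only promise "some error".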
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,92 @@ | ||
#[cfg(feature = "derive")] | ||
#[test] | ||
fn book_mutation_metadata() { | ||
// ANCHOR: metadata_derive | ||
#[derive(serde::Serialize, serde::Deserialize, tskit::metadata::MutationMetadata)] | ||
#[serializer("serde_json")] | ||
struct MutationMetadata { | ||
effect_size: f64, | ||
dominance: f64, | ||
} | ||
// ANCHOR_END: metadata_derive | ||
|
||
// ANCHOR: add_mutation_table_row_with_metadata | ||
let mut tables = tskit::TableCollection::new(50.0).unwrap(); | ||
|
||
let md = MutationMetadata { | ||
effect_size: 1e-3, | ||
dominance: 1.0, | ||
}; | ||
|
||
let mut_id_0 = tables | ||
.add_mutation_with_metadata( | ||
0, // site id | ||
0, // node id | ||
-1, // mutation parent id | ||
0.0, // time | ||
None, // derived state is Option<&[u8]> | ||
&md, // metadata for this row | ||
) | ||
.unwrap(); | ||
// ANCHOR_END: add_mutation_table_row_with_metadata | ||
|
||
// ANCHOR: add_mutation_table_row_without_metadata | ||
let mut_id_1 = tables | ||
.add_mutation( | ||
0, // site id | ||
0, // node id | ||
-1, // mutation parent id | ||
0.0, // time | ||
None, // derived state is Option<&[u8]> | ||
) | ||
.unwrap(); | ||
// ANCHOR_END: add_mutation_table_row_without_metadata | ||
|
||
// ANCHOR: validate_metadata_row_contents | ||
assert_eq!( | ||
tables | ||
.mutations_iter() | ||
.filter(|m| m.metadata.is_some()) | ||
.count(), | ||
1 | ||
); | ||
assert_eq!( | ||
tables | ||
.mutations_iter() | ||
.filter(|m| m.metadata.is_none()) | ||
.count(), | ||
1 | ||
); | ||
// ANCHOR_END: validate_metadata_row_contents | ||
|
||
// ANCHOR: metadata_retrieval | ||
let fetched_md = match tables.mutations().metadata::<MutationMetadata>(mut_id_0) { | ||
Some(Ok(m)) => m, | ||
Some(Err(e)) => panic!("metadata decoding failed: {:?}", e), | ||
None => panic!( | ||
"hmmm...row {} should have been a valid row with metadata...", | ||
mut_id_0 | ||
), | ||
}; | ||
|
||
assert_eq!(md.effect_size, fetched_md.effect_size); | ||
assert_eq!(md.dominance, fetched_md.dominance); | ||
// ANCHOR_END: metadata_retrieval | ||
|
||
// ANCHOR: metadata_retrieval_none | ||
// There is no metadata at row 1, so | ||
// you get None back | ||
assert!(tables | ||
.mutations() | ||
.metadata::<MutationMetadata>(mut_id_1) | ||
.is_none()); | ||
|
||
// There is also no metadata at row 2, | ||
// because that row does not exist, so | ||
// you get None back | ||
assert!(tables | ||
.mutations() | ||
.metadata::<MutationMetadata>(2.into()) | ||
.is_none()); | ||
// ANCHOR_END: metadata_retrieval_none | ||
} |