Update documentation (#81)
* updated getting_started

* included parameter inheritance image

* included overwrite information
DSchreyer authored Jan 14, 2025
1 parent 0fbd745 commit 82b6890
Showing 2 changed files with 57 additions and 8 deletions.
65 changes: 57 additions & 8 deletions docs/getting_started.md
@@ -2,18 +2,17 @@

## `dso init` -- Initialize a project

`dso init` initializes a new project in your current directory.
`dso init` initializes a new project in your current directory. In the context of DSO, a project is a structured environment where data science workflows are organized and managed.

```{command-output} dso init test_project --description "This is a test project"
To initialize a project, use the following command:

```bash
# To initialize a project called "test_project" use the following command
dso init test_project --description "This is a test project"
```

It creates the root directory of your project with all the necessary configuration files for `git`, `dvc`, `uv` and
`dso` itself:

```{command-output} ls -a test_project
It creates the root directory of your project with all the necessary configuration files for `git`, `dvc`, `uv` and `dso` itself.

```

## `dso create` -- Add folders or stages to your project

@@ -46,8 +45,58 @@ stage
|-- report # contains HTML Report generated by Analysis Scripts
```

## Writing configuration files
## Configuration files

The config files in a _project_, _folder_, or _stage_ are the cornerstone of any reproducible analysis, serving as a single source of truth. Keeping parameters in config files also reduces the effort needed to make _project_- or _folder_-wide changes.

Config files are designed to contain all parameters, input files, and output files that should be consistent across analyses. To this end, configurations can be defined at each level of your project in a `params.in.yaml` file. Running `dso compile-config` then transfers these configurations into `params.yaml` files.

A `params.yaml` file consolidates the configurations from the `params.in.yaml` file in its own directory and from the `params.in.yaml` files in its parent directories. In your analysis, reading the `params.yaml` of the respective stage therefore gives you access to all of these configurations.

The following diagram displays the inheritance of configurations:

```{eval-rst}
.. image:: ../img/dso-yaml-inherit.png
   :width: 60%
```
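
As a rough illustration of the inheritance described above, the following Python sketch lists the `params.in.yaml` locations that could contribute to one stage's `params.yaml`, broadest level first. It is not DSO's implementation: the function `config_chain` and the directory names `01_preprocessing` and `01_qc` are hypothetical, and a given level may not contain a `params.in.yaml` at all.

```python
from pathlib import PurePosixPath


def config_chain(project_root: str, stage_dir: str) -> list[str]:
    """List candidate params.in.yaml files for a stage, broadest level first.

    Hypothetical helper for illustration only; not part of DSO.
    """
    root = PurePosixPath(project_root)
    stage = PurePosixPath(stage_dir)
    # relative_to() raises ValueError if the stage is not inside the project.
    parts = stage.relative_to(root).parts

    levels = [root]
    for part in parts:
        # Descend one directory at a time: project -> folder -> stage.
        levels.append(levels[-1] / part)
    return [str(level / "params.in.yaml") for level in levels]


# Hypothetical layout: a stage "01_qc" inside the folder "01_preprocessing".
chain = config_chain("test_project", "test_project/01_preprocessing/01_qc")
for path in chain:
    print(path)
```

Files later in the list belong to more specific levels of the project.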

### Writing configuration files

Define your configurations in the `params.in.yaml` files using YAML syntax. Because configurations are inherited, relative paths need to be resolved within each _folder_ or _stage_. Relative paths must therefore be tagged with `!path`.

An example `params.in.yaml` can look as follows:

```yaml
thresholds:
  fc: 2
  p_value: 0.05
  p_adjusted: 0.1

samplesheet: !path "01_preprocessing/input/samplesheet.txt"

metadata_file: !path "metadata/metadata.csv"

file_with_abs_path: "/data/home/user/typical_analysis_data_set.csv"

remove_outliers: true

exclude_samples:
  - sample_1
  - sample_2
  - sample_6
  - sample_42
```
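
To see why relative paths need special handling, consider what happens when a path written at the project level is used from inside a stage. The Python sketch below assumes that a `!path` value is interpreted relative to the directory of the `params.in.yaml` that defines it and is then re-expressed relative to the directory that uses it. The helper `rebase_path` and the directories under `/work` are hypothetical; this is not DSO's actual resolution code.

```python
import posixpath


def rebase_path(path: str, defined_in: str, used_in: str) -> str:
    """Re-express `path` (relative to `defined_in`) relative to `used_in`.

    Hypothetical helper for illustration only; not part of DSO.
    """
    if posixpath.isabs(path):
        # Absolute paths mean the same thing at every level.
        return path
    return posixpath.relpath(posixpath.join(defined_in, path), start=used_in)


# Hypothetical project root and stage directory.
project = "/work/test_project"
stage = "/work/test_project/01_preprocessing/01_qc"

metadata = rebase_path("metadata/metadata.csv", project, stage)
samplesheet = rebase_path("01_preprocessing/input/samplesheet.txt", project, stage)
print(metadata)     # ../../metadata/metadata.csv
print(samplesheet)  # ../input/samplesheet.txt
```

Without such rebasing, a stage would look for `metadata/metadata.csv` inside its own directory rather than at the project root.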

### Compiling `params.yaml` files

All `params.yaml` files are automatically generated using:

```bash
dso compile-config
```

### Overwriting parameters

When multiple `params.in.yaml` files (for example at the project, folder, and stage level) define the same parameter, the value from the more specific level (e.g., stage) takes precedence over the value from the broader level (e.g., project). This lets you set defaults once at the project level and adjust them for individual folders or stages.
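
The precedence rule can be sketched in a few lines of Python. The function `merge_params` is hypothetical and only illustrates the idea. In particular, merging nested mappings key by key is an assumption of this sketch; the text above only states that the more specific value wins.

```python
def merge_params(broad: dict, specific: dict) -> dict:
    """Combine two parameter levels; values from `specific` win on conflict.

    Hypothetical helper for illustration only; not part of DSO.
    """
    merged = dict(broad)
    for key, value in specific.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Assumption: nested mappings are merged recursively, key by key.
            merged[key] = merge_params(merged[key], value)
        else:
            merged[key] = value
    return merged


# Project-level defaults and a stage-level override (example values).
project_params = {"thresholds": {"fc": 2, "p_value": 0.05}, "remove_outliers": True}
stage_params = {"thresholds": {"fc": 1.5}}

compiled = merge_params(project_params, stage_params)
print(compiled)
# {'thresholds': {'fc': 1.5, 'p_value': 0.05}, 'remove_outliers': True}
```

Here the stage changes only `fc`; `p_value` and `remove_outliers` keep their project-level values.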

## Implementing a stage

### R
Binary file added img/dso-yaml-inherit.png