-
Notifications
You must be signed in to change notification settings - Fork 5
Add CI and ability to control simulation dates #35
Add CI and ability to control simulation dates #35
Conversation
… roms_marbl example case
include coverage badge
- Case now has a "start_date" and "end_date" attribute specified at initialisation, even when reading from a blueprint file (the user's desired run dates are not relevant to the blueprint) - Case now has "valid_start_date" and "valid_end_date" attributes which /are/ defined as part of the blueprint: this is the full range of dates supported by the blueprint - When instantiating a case, the user's start and end dates are checked against the valid start and end dates - InputDataset now has an optional "start_date" and "end_date" attribute/yaml entry to avoid downloading unnecessary files (e.g. forcing files outside the user's desired date range) during Case.setup Still to do: - Modify the runtime namelists using these start and end dates on call to .run() so the simulation actually behaves consistently with the Case it is represented by Other: - Bugfix where ruff was removing an "unused" import of cstar_local_config in environment.py: have added a noqa comment to prevent this for now
- Blueprint discretization entry now has a "time_step" field from which the number of time steps to run with is calculated - ROMSComponent.pre_run() now modifies a template namelist, replacing a series of placeholder strings with paths only to any files needed for the chosen simulation dates - ROMSComponent.run() is now called with an "n_time_steps" arg, based on which it modifies the "__NTIMES_PLACEHOLDER__" string in the namelist at runtime
README.md
Outdated
@@ -1,3 +1,4 @@ | |||
[![codecov](https://codecov.io/github/dafyddstephenson/C-Star/branch/automated_testing/graph/badge.svg?token=Z0L4U76WSG)](https://codecov.io/github/dafyddstephenson/C-Star) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we change the path so it reflects the coverage of the CWorthy-ocean/C-Star main branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch! I think it's done now but will have to check it's worked after the merge
Nice @dafyddstephenson ! Yes I would love to see some of these PRs broken up haha
This is interesting - is it the first entry in the blueprint that has anything to do with whether a blueprint is "validated"? Because if so then CWorthy-ocean/C-Star#8 is relevant. We should think about whether time domain validity is defined for a whole blueprint, or per-component/dataset or what. I'll suggest that only an entire blueprint can be meaningfully validated, and so we should have a separate top-level section in the blueprint that contains everything relevant to validation status, including
Also interesting. I think that yes VirtualiZarr could be useful as a way to make it easier to access important information about an input dataset without having to open all the files in it. However while it's tempting to imagine putting cstar-relevant data in the metadata of a virtual zarr store pointing at input netCDF files, I'm not sure that would be the correct separation of concerns (because that additional data is not really part of the dataset, it's our judgement about the dataset). |
😬 ! This was supposed to be at least two... I'll hold off on any further development until that repo cleaning issue is closed I think. sorry!
It doesn't strictly relate to validation status, though I would say that these entries should be a requirement for validation. In this case for example forcing files are only defined for the year of 2012, so even though this blueprint isn't "validated" per se, it has a natural set of time boundaries that set the valid date range.
Indeed, I wouldn't suggest we add C-Star-related metadata to input datasets, but I think things like just accessing the time bounds would mean we could set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, let's just separate out any important TODOs as issues then please merge!
…ction in place of copy-pasted code in AdditionalCode.get() and BaseModel.get()
…l_environmet() which is called by environment.py
…l_environmet() which is called by environment.py [typo fix]
Have now:
Have also added some more |
Sorry for another big PR, I was hoping to just work on this in the background thinking the repo cleanup (issue CWorthy-ocean/C-Star#2) would be done a lot more quickly, but I think it's now probably more appropriate to get this merged first.
Summary of changes
1. Introduction of CI:
A GitHub runner now configures the environment, installs C-Star, creates a
roms_marbl_example
case from a blueprint, and callssetup
,build
,pre_run
,run
, andpost_run
on it. The case is run for 30 model minutes.2. Add ability to control simulation dates
start_date
andend_date
attributes added toCase
andInputDataset
objects, the latter to determine whether, e.g. a specific forcing file is needed and proceed accordingly.valid_start_date
andvalid_end_date
attributes are also defined onCase
objects, to specify the valid date range supported and check the user's provided start and end dates are compatible.time_step
attribute added toROMSComponent
objects.Full description of changes:
1. Introduction of CI:
Changes:
New files
.github/workflows/tests.yaml
which runs...cstar_ocean/tests/test_roms_marbl_example.py
which runs theroms_marbl_example
blueprint for 30 model minutesci/environment.yml
to manage conda env on github runnercoverage.toml
file for code coverage2. Add ability to control simulation dates
Modifications:
cstar_case.py
:start_date
,end_date
,valid_start_date
, andvalid_end_date
attributes to Case class. These can be either datetime objects or strings, but strings are parsed bydateutil.parser
to create datetime objects. The former belong to the specific Case object created by the user and are not written to or read from a blueprint yaml file, whereas the latter belong to the blueprint. If no range of valid dates is provided when creating a Case object, the user is warned. If no start and end dates are provided, the default behaviour is to set them as the earliest and latest valid dates, respectively. If these are also not provided, an error is raised.start_date
andend_date
fit within the valid date rangestart_date
andend_date
args toCase.from_blueprint()
method signature as these are not present in the blueprint.Case.from_blueprint()
parsing the valid date range from the yaml file.start_date
andend_date
entries toinput_dataset
fields in the YAML file: as we only want to fetch netcdf files whose own date range overlaps the Case's date range (e.g. we only want the january surface forcing file if we're only running for the first week of the year), these attributes help determine that. NOTE: we are here adding more metadata to the blueprint that should really be read from the dataset itself. It seems like @TomNicholas VirtualiZarr work could be relevant here?Case.check_is_setup()
method to only check for the local presence of input datasets whose date range overlaps the Case's date rangeCase.setup()
method to only callInputDataset.get(self.caseroot)
if an input dataset's date range overlaps the Case's date rangeCase.run()
to provide then_time_steps
argument toROMSComponent.run()
based on the time step and date range.component.py
:time_step
attribute toROMSComponent
subclassROMSComponent.pre_run()
to only partition input datasets that exist locally (based on new behaviour ininput_dataset.py
where input datasets are only fetched locally if they overlap the user's chosen simulation dates)InputDataset.local_path
inROMSComponent.pre_run()
to reflect changes ininput_dataset
where the local path attribute now provides the full path to the file, not just its parent directoryROMSComponent.pre_run()
to modify the ROMS runtime namelist. As ROMS opens all input datasets specified in the namelist on init, we need to dynamically modify the namelist to only describe datasets for this case that exist locally (i.e. ones which overlap the simulation period). This is done by including a "template" namelist with placeholder strings, e.g.__INITIAL_CONDITION_FILE_PLACEHOLDER__
that are replaced duringpre_run()
with the correct path to the IC file by C-Star.n_time_steps
argument toROMSComponent.run()
. The argument is optional, and defaults to 1, but the user will be warned if this behaviour is invoked.ROMSComponent.run()
to modify the namelist, adding the number of timesteps by replacing the placeholder string__NTIMES_PLACEHOLDER__
input_dataset.py
:start_date
andend_date
attributes toInputDataset
class (describing for example the earliest and latest entries in a forcing file). Can bedatetime
orstr
, but strings are parsed intodatetimes
bydateutil.parser
InputDataset.local_path
attribute to reflect full path to dataset file, not just parent directory, consistently with other path-related attributes elsewhere in C-Star