Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Local additional code #55

Merged

Conversation

dafyddstephenson
Copy link
Contributor

@dafyddstephenson dafyddstephenson commented Aug 29, 2024

-> This PR branches from #52 , do not merge until merging #52 <-

Closes #31.

Summary

This PR introduces support for local additional code, as well as improving the handling of model namelists that are to be modified by C-Star (the latter changes being necessary to test the former). Local AdditionalCode instances should be created using a path to a directory, instead of a remote repository, but the structure of this directory should be the same:

    <additional_code_dir>                                                                                                                                                                                                                                                                          
       ├── namelists                                                                                                                                                                                                                                                                               
       |      └ <base_model_name>                                                                                                                                                                                                                                                                  
       |              ├ <namelist_file_1>                                                                                                                                                                                                                                                          
       |              |       ...                                                                                                                                                                                                                                                                  
       |              └ <namelist_file_N>                                                                                                                                                                                                                                                          
       └── source_mods                                                                                                                                                                                                                                                                             
              └ <base_model_name>                                                                                                                                                                                                                                                                  
                      ├ <source_code_file_1>                                                                                                                                                                                                                                                       
                      |       ...                                                                                                                                                                                                                                                                  
                      └ <source_code_file_N>      

Meanwhile, calls to AdditionalCode.get() now check whether namelist filenames have _TEMPLATE as a suffix. If so, a new copy without this suffix is created alongside the template file. The template is tracked with AdditionalCode.namelists as before, the copy is tracked using a new attribute AdditionalCode.modified_namelists. This is an improvement on previous handling but not a viable long term solution. I have raised Issue #56 in response.

List of changes

Changes directly required for local additional code support:

  • Move DataSource out of base/input_dataset.py and into dedicated module base/datasource.py
  • Expand DataSource to check if location is a directory or local repository
  • Rename AdditionalCode.source_repo:
    • source_repo attribute is renamed to AdditionalCode.source, type changed from str -> DataSource
    • source_repo argument to AdditionalCode.__init__() is renamed to location (str).
      This location is used by __init__ to create a DataSource instance and set the AdditionalCode.source attribute.
  • Similarly rename source and source_repo to location in:
    • Case.from_blueprint()
    • Case.persist()
    • cstar_blueprint_roms_marbl_example.yaml
  • Rename "local_path" to "local_dir" in AdditionalCode.get to avoid confusion with AdditionalCode.local_path
  • Make AdditionalCode.checkout_target optional (in case source is not a repository)
  • Update AdditionalCode.get(). This previously cloned a repository to a temporary location and then moved the files needed by the AdditionalCode instance. It now does this only if AdditionalCode.source.location_type is url and AdditionalCode.source.source_type is repository. If location_type is path then the files are simply copied from the provided path.
  • The test_roms_marbl_example.py script now also uses local additional code as well as local input datasets in its second (local) stage

Related changes associated with namelist handling:

These changes were necessary to implement the test described above, as previously the template namelist file was edited in place and so couldn't be re-used by a second case.

  • Add an attribute AdditionalCode.modified_namelists to keep track of edited namelists based on templates
  • Add logic in AdditionalCode.get() that makes copies of any namelist files with the suffix _TEMPLATE to a file without this suffix and adds it to AdditionalCode.modified_namelists, e.g.:
    AdditionalCode.namelists = ["roms.in_TEMPLATE",] -> AdditionalCode.modified_namelists = ["roms.in",]
  • Update ROMSComponent.pre_run() and ROMSComponent.run() to reflect this namelist handling

Other/bugfixes:

  • Update InputDataset.local_path to point to the symlink created by InputDataset.get() after call (if dataset is local, addressing an inconsistency in Better input dataset hash handling #52 )
  • update pre-commit config file to rename cstar_ocean -> cstar
  • swap out os for pathlib in additional_code.py and datasource.py
  • minor changes to case.py with additional type checking
  • Update docstrings, __str__s and example notebook to reflect above changes

“Dafydd added 4 commits August 22, 2024 17:25
- Move DataSource out of base/input_dataset.py and into dedicated module base/datasource.py
- Expand DataSource to check if location is a directory or local repository
- Rename AdditionalCode.source_repo to AdditionalCode.source and change type to DataSource
- Make checkout_target an optional attribute to AdditionalCode (default None)
- Wrap existing AdditionalCode.get() inside conditional check for 'url','repository'
- Rename 'source' and 'source_repo' to 'location' in Case.from_blueprint, Case.persist, and cstar_blueprint_roms_marbl_example.yaml
Still todo:
- update docstrings
- update example notebook
- update AdditionalCode.get() to support copying local files
- swap out os for pathlib in additional_code.py and datasource.py
- fix typo in input_dataset.py
- minor changes to case.py with additional type checking
…st and modify/run with the copy (roms/component.py)

- Update pre-commit file to replace remaining instance of cstar_ocean with cstar
- Update InputDataset.local_path to point to the symlink created by InputDataset.get() after call (if dataset is local)
- Update test_roms_marb_example.py to also include local paths to additional code in 2nd case
…ional code file support

- Rename "source" (DataSource) to "location" (str) in __init__ for AdditionalCode and InputDataset.
  The method uses this arg to create the "source" attribute using source=DataSource(location).
  This avoids requiring the user to create and supply a DataSource object which itself is just
  instantiated from a location string.
- Rename "local_path" to "local_dir" in AdditionalCode.get to avoid confusion with AdditionalCode.local_path
- Add an attribute AdditionalCode.modified_namelists to keep track of edited namelists based on templates
- Add logic in AdditionalCode.get() that makes copies of any namelist files with the suffix _TEMPLATE
  to a file without this suffix and adds it to AdditionalCode.modified_namelists, e.g.:
  AdditionalCode.namelists = ["roms.in_TEMPLATE",] -> AdditionalCode.modified_namelists = ["roms.in",]
- Update ROMSComponent.pre_run() and ROMSComponent.run() to reflect this namelist handling
- Minor changes to __str__ in component.py (addressing inconsistencies)
- Update cstar_example_notebook.ipynb to reflect all of the above changes
@dafyddstephenson dafyddstephenson added this to the alpha milestone Aug 29, 2024
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@dafyddstephenson dafyddstephenson added the enhancement New feature or request label Aug 29, 2024
Copy link
Member

@TomNicholas TomNicholas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm looking through this and so far it looks good. We should probably chat about #56 at some point though.

@dafyddstephenson dafyddstephenson merged commit 2ef1478 into CWorthy-ocean:main Sep 4, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support local paths to additional code
2 participants