-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added xarray dataset accessor to record axis validation #136
Added xarray dataset accessor to record axis validation #136
Conversation
I have in the back of my mind that we might provide a require_axes({"elevation": ('spatial',), "temperature": ('spatial', 'temporal')}) or even a decorator? @require_axes({"elevation": ('spatial',), "temperature": ('spatial', 'temporal')}) But I think worth seeing what the common use patterns are first! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it is, it is OK, but I'm not sure I see the need of it - even from the functionality point of view. What's wrong with:
value.attrs["core_axis"][axis_name] = ...
It is reusing an existing interface, one that will be known by xarray
users, one adequate to store the information you want, it is simple... I feel you are over-engineering something that is pretty simple, in nature.
That's true, but |
I still don't see the need, but it is you call. I'd suggest you gather the opinion from other members of the team. |
I was just playing around to check if I really object to noisy attributes in output NetCDF files or if I just like playing with the Accessor framework 😄. It turns out that you although you can create In [42]: ds = xr.DataArray([1,2,3,4], name='test')
In [43]: ds.attrs['core_axes'] = {'spatial': "Spat_CellId_Coord_Any", "temporal": "Temporal_Date", "soil_layers": None}
In [44]: ds.to_dataset().to_netcdf('Attrs_test.nc')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
...
TypeError: Invalid value for attr 'core_axes': {'spatial': 'Spat_CellId_Coord_Any', 'temporal': 'Temporal_Date', 'soil_layers': None}. For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple Changing to having a |
Ok, that is a strong reason for the accessor you created. So all good 👍 |
Damn. Updating the docs revealed that this is broken! The current tests only check that the output of I think the solution is to instead use a Actually, at that point, there is no need to store it in the |
That's annoying! Given that is no longer linked to the DataArray of interest, I would make all of these methods/attributes of the |
…n the validation classes applied
…test to check validation_dict return
Sorry @dalonsoa - yet another iteration... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just a couple of comments on doc strings.
Sorry it took me so long for so little input...
virtual_rainforest/core/axes.py
Outdated
The function returns the validated data array and a dictionary recording which | ||
AxisValidator classes were applied to each of the core axes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should go in the Returns
part of the docstring, right?
Args: | ||
var_name: The name of a variable | ||
axis_name: The core axis name | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need Return
and Raises
explanations in docstring.
Description
This PR extends
xarray.DataArray
to add acore_axes
property that is used to record the axis validation applied to that data array. It uses thexarray
Accessor framework.The
core_axis
property is basically a dictionary of which AxisValidator was applied to each core axis (orNone
if no validator was applied. The property is also made callable as a simple shorthand boolean test of whether a named axis has been validated. For example, a model__init__
might confirm:I worry a little that making the
dict
callable is a bit too clever. The main reason is to keep a short API - it would otherwise have to be something likeelevation.core_axis.has_axis('spatial')
, which admittedly isn't exactly verbose.Type of change
Key checklist
pre-commit
checks:$ pre-commit run -a
$ poetry run pytest
Further checks