Coordinate UX #11

lpilz · 2021-11-15T16:33:20Z

I think this is pretty straightforward as we just need the lat, lon and time coordinates, all other can be discarded. Unstaggering will be done in the variable initialization. However, we should be aware of moving-nest runs and keep the time-dependence of lat and lon for these occasions.

The text was updated successfully, but these errors were encountered:

jthielen · 2021-11-20T00:50:47Z

I think this is pretty straightforward as we just need the lat, lon and time coordinates, all other can be discarded.

I disagree with that list, as we also need the projection coordinates (south_north and west_east) and the vertical coordinate (bottom_top). However, it's still straightforward, as hopefully #14 indicates!

Unstaggering will be done in the variable initialization.

Sounds good to me to do the destaggering of coordinates at the same time as data variables.

However, we should be aware of moving-nest runs and keep the time-dependence of lat and lon for these occasions.

In what little xwrf currently has, it assumes that we don't have a moving-nest. This simplifies things greatly with respect to dimension coordinates, so that's what I've stuck with in #14. That being said, moving nests are definitely important to handle eventually, but I'd vote to not worry about that for an initial release milestone.

lpilz · 2021-11-23T18:09:23Z

You're right, I was maybe a bit rash. What I meant was: if we decide to destagger, we don't need to keep the staggered variables.

To me it is important to obtain clarity in the xarray API. The user should be able to see on a glance which coordinates the various diagnostics and data are on. For that we need the projection coordinates (if we're not in a lat/lon scenario). However, I think your PR makes good headway into this direction :)

That being said, moving nests are definitely important to handle eventually, but I'd vote to not worry about that for an initial release milestone.

Seconding the prioritization :)

lpilz · 2022-01-25T08:50:12Z

Maybe we should use #31 as a reason to discuss which coordinates we will have in the dataset at the end.

For the time coordinate, e.g., information in wrfout files is redundant. We have the Times coordinate, which contains timestamps (in string form) and we have XTIME, which contains timedeltas (minutes since simulation start). To avoid too much redundant data, I would propose to just drop XTIME and only keep Time.

The same goes for the spatial coordinates. I would suggest making dropping redundant information a rule. So if it were not too much hassle to compute on the fly, I would propose to keep south_north, west_east and bottom_top (+ staggered? #35) and just drop XLAT, XLONG etc.. This would simply entail adding tutorials on how to generate the lats and longs.

What do you say to these suggestions?

lpilz · 2022-01-25T08:55:18Z

I just remembered this might also apply to data variables. For use_theta_m = 0, the THM and the T field are identical.

kmpaul · 2022-01-26T19:24:51Z

@lpilz: This sounds reasonable to me.

jthielen · 2022-01-26T19:56:00Z

For the time coordinate, e.g., information in wrfout files is redundant. We have the Times coordinate, which contains timestamps (in string form) and we have XTIME, which contains timedeltas (minutes since simulation start). To avoid too much redundant data, I would propose to just drop XTIME and only keep Time.

That makes sense to me for this kind of one-dimension coordinate (so long as Times is parsed into a datetime dimension coordinate as previously intended)!

The same goes for the spatial coordinates. I would suggest making dropping redundant information a rule. So if it were not too much hassle to compute on the fly, I would propose to keep south_north, west_east and bottom_top (+ staggered? #35) and just drop XLAT, XLONG etc.. This would simply entail adding tutorials on how to generate the lats and longs.

I'd disagree with dropping latitude and longitude, however. While it is true that they can be re-generated (with careful handling of datums), it is not always trivial or cheap to compute, particularly with large/dense domains. Having both dimension coordinates and latitudes/longitudes is important for many workflows; a key example of this is projection-correct dynamical calculations (xref Unidata/MetPy#893), where you'll simultaneously need x/west_east, y/south_north, longitude, and latitude. I'd think we'd be imposing an unnecessary burden on users if we decided to always drop latitude and longitude, especially given there isn't a significant cost to keeping them around (as they aren't dimension coordinates, they can be automatically lazy-loaded just like any data variable).

Also, generalizing from that, I don't think always "dropping redundant information" is a good rule. For example, this rule would mean

drop/not include pressure, since it is redundant from surface pressure, model top pressure, and sigma coordinate
drop/not include geopotential height, since it is redundant from geopotential
drop all the map factors, since they are redundant with the horizontal coordinates

I think redundant data, so long as it is lazy-loaded, still has a place and would be appreciated by users. Perhaps a better rule would be "dropping redundant 0-D and 1-D information"?

kmpaul · 2022-01-26T22:49:11Z

@jthielen: I think you've hit the nail on the head by pointing out that the data is lazy-loaded. With that in mind, I'm more in favor of not dropping anything unless it's presence creates a problem. I'm not even sure it's necessary to remove the redundant time coordinates, now that I think about it.

This is one of those things that seems to be an issue of "trying to guess what the user wants." And my experience over the years is that you should always assume that the user wants everything. So, I'm now of the opinion that we shouldn't drop anything unless it becomes a problem for us to keep it.

lpilz · 2022-01-27T16:13:44Z

@jthielen you are completely right regarding data variables. My point was mainly on coordinates, though. Here, I feel that we should not include too much information and keep coordinates and dimensions as close to another as possible to avoid confusing the users who are mostly coming from earth and natural sciences. However, I do see your point regarding lat/long data. I didn't know there were applications which actually directly used this info.

My idea for how this could look when all features are implemented would be:

>>> ds = xr.open_dataset('wrfout_d03_2018-10-15_00:00:00').xwrf.process()
>>> list(ds.coords)
['XLAT',
 'XLONG',
 'south_north',
 'west_east',
 'bottom_top',
 'Time']

and

>>> ds = xr.open_dataset('wrfout_d03_2018-10-15_00:00:00').xwrf.process(destagger=False)
>>> list(ds.coords)
['XLAT',
 'XLONG',
 'south_north',
 'west_east',
 'south_north_stag',
 'west_east_stag',
 'bottom_top',
 'bottom_top_stag',
 'Time']

What do you say to these?

@kmpaul if you really want to retain XTIME, I would propose to parse it as a np.timedelta64 dtype. What do you think of this idea?

kmpaul · 2022-01-27T16:25:29Z

@lpilz: Yeah. If we can process the time-like dimensions into timedelta64 or datetime64 datatypes, that would be ideal. Just keep in mind that if any changes are made to the raw data (i.e., the DataArray data doesn't exactly match the corresponding data in the original file), then the modification should be stored in the encoding.

lpilz added the enhancement New feature or request label Nov 15, 2021

jthielen mentioned this issue Nov 20, 2021

Implementation of salem-style x, y, and z coordinates #14

Merged

3 tasks

jthielen mentioned this issue Nov 23, 2021

Division of Features in Top-Level API #20

Closed

kmpaul added this to xWRF Dec 16, 2021

kmpaul added this to the Project xWRF milestone Dec 21, 2021

lpilz changed the title ~~Coordinates~~ Coordinate UX Jan 25, 2022

lpilz mentioned this issue Jan 25, 2022

[Bug]: XTIME coordine not being parsed #31

Closed

andersy005 added this to Xdev Project Board Jan 31, 2022

andersy005 moved this to Todo in Xdev Project Board Jan 31, 2022

andersy005 moved this from 🌳 Todo to ▶ In Progress in Xdev Project Board Jan 31, 2022

andersy005 mentioned this issue Feb 3, 2022

Add xwrf/wrf DataArray and Dataset Accessors #44

Merged

3 tasks

jthielen mentioned this issue Feb 14, 2022

First draft "destagger" function #37

Closed

3 tasks

jthielen closed this as completed in #14 Feb 16, 2022

jthielen moved this to Done in xWRF Feb 16, 2022

Repository owner moved this from ▶ In Progress to ✅ Done in Xdev Project Board Feb 16, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Coordinate UX #11

Coordinate UX #11

lpilz commented Nov 15, 2021

jthielen commented Nov 20, 2021

lpilz commented Nov 23, 2021 •

edited

Loading

lpilz commented Jan 25, 2022

lpilz commented Jan 25, 2022

kmpaul commented Jan 26, 2022

jthielen commented Jan 26, 2022 •

edited

Loading

kmpaul commented Jan 26, 2022

lpilz commented Jan 27, 2022

kmpaul commented Jan 27, 2022

Coordinate UX #11

Coordinate UX #11

Comments

lpilz commented Nov 15, 2021

jthielen commented Nov 20, 2021

lpilz commented Nov 23, 2021 • edited Loading

lpilz commented Jan 25, 2022

lpilz commented Jan 25, 2022

kmpaul commented Jan 26, 2022

jthielen commented Jan 26, 2022 • edited Loading

kmpaul commented Jan 26, 2022

lpilz commented Jan 27, 2022

kmpaul commented Jan 27, 2022

lpilz commented Nov 23, 2021 •

edited

Loading

jthielen commented Jan 26, 2022 •

edited

Loading