Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add get_area_def to cf reader #1695

Merged
merged 18 commits into from
Sep 16, 2022

Conversation

BENR0
Copy link
Collaborator

@BENR0 BENR0 commented May 27, 2021

This adds add area definition support for the cf reader. I fixed the tests and added a test area in geos projection as well as some preliminary assertions for testing the area.
I think the tests should be improved and extended to cover other areas. Currently the extent assertion is failing because of pytroll/pyresample#355.

@BENR0 BENR0 requested review from djhoese and mraspaud as code owners May 27, 2021 14:44
@djhoese
Copy link
Member

djhoese commented Jun 4, 2021

Pyresample 1.20. 0 is being deployed right now. It looks like our test environment uses pyresample from conda-forge so we'll have to wait for that package to be pull requested, merged, released, and synced to the package repository.

@mraspaud
Copy link
Member

mraspaud commented Jun 4, 2021

This is failing, so it'll have to go to the next release

@djhoese
Copy link
Member

djhoese commented Jun 5, 2021

@zxdawn @BENR0 @mraspaud So I just ran this locally with pyresample main and it does actually fail even though the CI jobs here aren't pulling in the right version:

>           assert expected_area.area_extent == actual_area.area_extent
E           AssertionError: assert (339045.5577,... 4803645.4685) == (4803645.4685..., 339045.5577)
E             At index 0 diff: 339045.5577 != 4803645.468500001
E             Use -v to get the full diff

satpy/tests/reader_tests/test_satpy_cf_nc.py:144: AssertionError

Edit: Note this is way better than the results from the old pyresample version:

 >           assert expected_area.area_extent == actual_area.area_extent
E           AssertionError: assert (339045.5577,... 4803645.4685) == (171902444919...3027029152.95)
E             At index 0 diff: 339045.5577 != 171902444919656.84
E             Use -v to get the full diff

@BENR0
Copy link
Collaborator Author

BENR0 commented May 13, 2022

It seems this was due to a switch of x and y in the test setup. Now it passes. One caveat though is that I had to add pytest.approx to the assert statement because the upper right y coordinate was just a tiny bit off from the expected value.

@djhoese djhoese added enhancement code enhancements, features, improvements component:readers labels May 13, 2022
@djhoese
Copy link
Member

djhoese commented May 13, 2022

You may need to merge main into this branch. It seems github actions doesn't like the old workflow file (I hope that's the problem at least).

@BENR0
Copy link
Collaborator Author

BENR0 commented May 13, 2022

Indeed I totally forgot to do that.

@codecov
Copy link

codecov bot commented May 13, 2022

Codecov Report

Merging #1695 (2b2048b) into main (ff54a74) will increase coverage by 0.08%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #1695      +/-   ##
==========================================
+ Coverage   94.05%   94.13%   +0.08%     
==========================================
  Files         290      293       +3     
  Lines       44639    45079     +440     
==========================================
+ Hits        41987    42437     +450     
+ Misses       2652     2642      -10     
Flag Coverage Δ
behaviourtests 4.68% <0.00%> (-0.04%) ⬇️
unittests 94.79% <100.00%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
satpy/readers/satpy_cf_nc.py 97.97% <100.00%> (+0.17%) ⬆️
satpy/tests/reader_tests/test_satpy_cf_nc.py 100.00% <100.00%> (ø)
satpy/writers/geotiff.py 93.75% <0.00%> (ø)
satpy/tests/test_dataset.py 100.00% <0.00%> (ø)
satpy/readers/seviri_l1b_native_hdr.py 100.00% <0.00%> (ø)
satpy/tests/reader_tests/test_seviri_l1b_native.py 100.00% <0.00%> (ø)
satpy/readers/mws_l1b.py 98.51% <0.00%> (ø)
satpy/tests/reader_tests/test_mws_l1b_nc.py 100.00% <0.00%> (ø)
satpy/readers/pmw_channels_definitions.py 97.60% <0.00%> (ø)
satpy/tests/test_writers.py 98.99% <0.00%> (+0.02%) ⬆️
... and 3 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@coveralls
Copy link

coveralls commented May 13, 2022

Coverage Status

Coverage increased (+0.07%) to 94.74% when pulling 2b2048b on BENR0:feat_add_get_area_def_to_cf_reader into ff54a74 on pytroll:main.

@BENR0
Copy link
Collaborator Author

BENR0 commented May 24, 2022

@djhoese @mraspaud did you have time to look at this?

@mraspaud
Copy link
Member

@BENR0 sorry for the late reply.

It looks good for the area case, but what happens if we are working with swath definitions (so we just have lons and lats)? will the reading crash then?

@BENR0
Copy link
Collaborator Author

BENR0 commented May 24, 2022

I think that depends if this is covered by the implementations in pyresample. I can have a look at it even though I am not that familiar with the code. Maybe @TomLav can help because most of the code is from him.

@mraspaud
Copy link
Member

@BENR0 I just meant to verify that the changes you make do not change the behaviour of the lon/lat case, no need to implement it here. So if you can see that the lon/lat case still works as before, then this is good as it is now

@BENR0
Copy link
Collaborator Author

BENR0 commented May 24, 2022

@mraspaud do I understand correctly that you mean if a netcdf written from a Satpy Scene which only has a swath definition is also read correctly?

@mraspaud
Copy link
Member

I haven't checked recently, but I would expect so.

@mraspaud
Copy link
Member

Although it might be the same problem as you are fixing here that the geolocation isn't actually read.

@TomLav
Copy link

TomLav commented May 24, 2022

Thanks for tagging me. I am maybe wondering if cf_reader should offer a richer interface to pyresample's from_cf() routine? For example, from_cf() offers the option to request the area_def attached to a specific netCDF variable which is not available from the cf_reader() at present.
To specify variable= in from_cf() can useful both if there are several area_defs in a netCDF file (not illegal, but not often) and if the CF encoding is not fully standard (in which case specifying variable= helps from_cf() to locate the area_def).
But I do not know if the cf_reader workflow only looks at file-level area_defs or variable-level.

@BENR0
Copy link
Collaborator Author

BENR0 commented May 24, 2022

Indeed I think that would be something to think about to make it more generic in the future. Currently I think the reader is mainly for reading files written with Satpy which I think does not produce different area_defs in one file. So my opinion is to make this another PR (respectively create an feature issue for it).

@TomLav
Copy link

TomLav commented May 24, 2022

Thanks. Your approach is fine with me.

@BENR0
Copy link
Collaborator Author

BENR0 commented May 25, 2022

@mraspaud regarding the swath definition case: You were right implementing the get_area_def the way I did breaks this case. The reason is that the CF utils in pyresample implemented by @TomLav do not cover that case. Without the get_area_def method implemented in the specific reader Satpy "falls back" to creating the area from the lat/lon grids which is implemented in the FileYAMLReader. For a quick work around I added ValueError to the try except in order to fall back to the yaml reader creating the area def. While this works I think this is basically asking for trouble since ValueError is pretty generic and might lead to more problems that it solves.

My opinion about this is that handling 2D lat/lon coordinate grids should also be implemented in the CF utils in pyresample because CF allows this kind of coordinates and therefore AreaDefinition.from_cf should be able to handle this.
Apart from this obviously there is a unit test missing in the cf reader for the 2D lat/lon grid case. Are there some helper functions somewhere already to create a Satpy dataset with a swath definition which could be reused?

@TomLav
Copy link

TomLav commented May 25, 2022

Hello. I think it is correct of pyresample's from_cf() to not support lat/lon swath data (if this is what was failing now). I think AreaDefinition is only for grids based on Earth projections, while the geometry information of lat/lon swath would use SwathDefinition object.

I don't know satpy. Can you link me to the current implementation of this feature in satpy, so that I can see how it is working?

@mraspaud
Copy link
Member

sounds good, feel free to refactor!

@BENR0
Copy link
Collaborator Author

BENR0 commented Aug 29, 2022

@mraspaud I refactored the dataset generation to be in the set up for all tests. I left in the writing of the header attribute instrument because #2176 is not merged yet and thus tests would fail.

@djhoese djhoese mentioned this pull request Aug 29, 2022
2 tasks
@mraspaud
Copy link
Member

#2176 is now merged @BENR0

@BENR0
Copy link
Collaborator Author

BENR0 commented Aug 29, 2022

@mraspaud ok will update this pr later today or tomorrow morning.

@BENR0
Copy link
Collaborator Author

BENR0 commented Aug 30, 2022

@mraspaud I merged main and refactored out the instrument lines in the tests.

@mraspaud
Copy link
Member

@BENR0 regarding the failing tests on windows, I can recommend the usage of the tmp_path fixture, maybe that can help.

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@djhoese
Copy link
Member

djhoese commented Sep 6, 2022

@BENR0 Any time to look into the failing tests?

@BENR0
Copy link
Collaborator Author

BENR0 commented Sep 12, 2022

@djhoese yes I looked at it and it seems that the failing test on windows is due to the same filename being used in two tests. I don't know if @mraspaud hint with the tmp_path fixture can fix that. But it seems that in later tests file pathes are just counted up to make them different. I can just use that same technique here if that is ok with you.

@mraspaud
Copy link
Member

Yes the fixture seems to work better on windows than the tempdir functions

@djhoese
Copy link
Member

djhoese commented Sep 12, 2022

@BENR0 if a single test function is writing multiple files then counting and produce unique filenames seems reasonable. Although this is probably a sign that the function should be split into multiple tests.

Using tmp_path as the destination for the files you create in the tests should mean that each test gets its own unique directory and so one test shouldn't interfere with another.

@BENR0
Copy link
Collaborator Author

BENR0 commented Sep 14, 2022

@djhoese @mraspaud since the tests are using unittest.testcase the pytest tmp_path fixture does not work. I actually think it would be nice to use it but that would need some refactoring. Haven't used the tmp_path fixture before so could be I am missing something and there is a workaround somehow.

Maybe I will have some time next week to work on this. If this should move forward faster I could just do the hacky filename numbering in the mean time.

@djhoese
Copy link
Member

djhoese commented Sep 14, 2022

Give me 5 seconds. I'll fix it.

@djhoese
Copy link
Member

djhoese commented Sep 14, 2022

@BENR0 I've made a PR to your fork that should do the pytest and tmp_path changes. It cleans up the code a ton:

BENR0#1

@djhoese
Copy link
Member

djhoese commented Sep 14, 2022

Note I assume that if the filenames are in a separate tmp directory for each test that the filenames can be the same for each test. If that means the tests are no longer testing what they were before then we can make multiple fixtures for each filename format.

@BENR0
Copy link
Collaborator Author

BENR0 commented Sep 14, 2022

:-D after I wrote my comment I started the same changes you made and tested it for the first test. But didn't have the time to finish it. Thanks for the work.

@djhoese
Copy link
Member

djhoese commented Sep 14, 2022

Yeah, no problem. I'm avoiding my other work so this was a nice distraction and I've pytest'd quite a few test modules at this point. All you have to do now is merge the PR in your fork and the tests here should pass...I hope.

@mraspaud
Copy link
Member

@djhoese @BENR0 nice with the switch to pytest!

Rewrite CF reader tests to use pytest
@djhoese
Copy link
Member

djhoese commented Sep 16, 2022

The unstable test failure seems to be a problem in rasterio...I hope. I'm going to try restarting the test and see what happens. If it fails again I think I'll merge this anyway and then work on figuring out what broke in a separate PR.

@djhoese
Copy link
Member

djhoese commented Sep 16, 2022

Filed rasterio/rasterio#2591 with rasterio. Let's merge this and worry about the failing test somewhere else.

@djhoese djhoese merged commit df66bce into pytroll:main Sep 16, 2022
@BENR0 BENR0 deleted the feat_add_get_area_def_to_cf_reader branch October 10, 2022 09:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:readers enhancement code enhancements, features, improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add AreaDefinition support to the 'satpy_cf_nc' reader
5 participants