Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add update_platform method to EchoData #434

Merged
merged 15 commits into from
Sep 27, 2021

Conversation

imranmaj
Copy link
Contributor

Part of #201

Adds an EchoData.update_platform method which can be used to update the EchoData.platform group with additional external platform data.

Partially adapted from ngkavin#2, ngkavin#3

@imranmaj imranmaj requested a review from leewujung August 26, 2021 19:00
@emiliom
Copy link
Collaborator

emiliom commented Aug 26, 2021

Great to see this PR, @imranmaj! What's the reason that for update_platform, "Only sonar_models of "EK60", "EK80", and "EA640" are currently supported"? Are there core, hard challenges with adding AZFP support?

@imranmaj
Copy link
Contributor Author

Great to see this PR, @imranmaj! What's the reason that for update_platform, "Only sonar_models of "EK60", "EK80", and "EA640" are currently supported"? Are there core, hard challenges with adding AZFP support?

Ah that's right, I should add AZFP too; I took most of this from the pre-class-redesign code which didn't consider AZFP.

# only take data during ping times
start_time, end_time = min(self.beam["ping_time"]), max(self.beam["ping_time"])

# saildrone specific workaround
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This customization to the one instance of saildrone data that we have may be fragile. While that file follows a community convention for "trajectory" data, there are details that could change w/o breaking the convention. For example, I believe the coordinate variable doesn't have to be named trajectory. Also, a future netcdf file could contain more than one trajectory (deployment), in which case extra_platform_data["trajectory"][0] would no longer be correct.

I'll take a crack at generalizing this block.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed into your PR the code that generalizes the handling of "trajectory" data encoded using the convention used by the Saildrone sample file

@codecov-commenter
Copy link

codecov-commenter commented Aug 27, 2021

Codecov Report

Merging #434 (3af3f41) into dev (0a81a47) will increase coverage by 0.22%.
The diff coverage is 91.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##              dev     #434      +/-   ##
==========================================
+ Coverage   76.41%   76.64%   +0.22%     
==========================================
  Files          37        5      -32     
  Lines        3227      334    -2893     
==========================================
- Hits         2466      256    -2210     
+ Misses        761       78     -683     
Flag Coverage Δ
unittests 76.64% <91.66%> (+0.22%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
echopype/echodata/echodata.py 78.81% <91.66%> (-8.34%) ⬇️
echopype/echodata/combine.py 70.68% <0.00%> (-1.64%) ⬇️
echopype/convert/__init__.py
echopype/convert/set_groups_base.py
echopype/convert/utils/ek_date_conversion.py
echopype/convert/set_groups_azfp.py
echopype/convert/set_groups_ek80.py
echopype/calibrate/calibrate_ek.py
echopype/calibrate/__init__.py
... and 25 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0a81a47...3af3f41. Read the comment docs.

@leewujung
Copy link
Member

@emiliom : seems like this operation is something we should record in the Provenance group also?

@emiliom
Copy link
Collaborator

emiliom commented Sep 3, 2021

seems like this operation is something we should record in the Provenance group also?

Great suggestion!! I totally agree.

@imranmaj
Copy link
Contributor Author

imranmaj commented Sep 3, 2021

@emiliom I was running into issues with trajectory in the saildrone data before so I added that small section to fix it, I can investigate whether it's necessary at all

@emiliom
Copy link
Collaborator

emiliom commented Sep 4, 2021

@emiliom I was running into issues with trajectory in the saildrone data before so I added that small section to fix it, I can investigate whether it's necessary at all

I think it is necessary. I'm finally getting around to adding the generalized code (done). Now I just need to run the tests.

@emiliom
Copy link
Collaborator

emiliom commented Sep 4, 2021

@imranmaj I'm running into an error with the ~= operator in requirements.txt, again. When doing conda create -c conda-forge -n echopype_dev --yes python=3.9 --file requirements.txt --file requirements-dev.txt, I get this error:

InvalidVersionSpec: Invalid version '~3.10': invalid operator

I think the glitch is different from what I experienced earlier, a couple of weeks ago, but I don't remember the details of that instance. For now, I just need to build a conda env to run the echopype tests, so I changed ~= to >=. But this is starting to sound like a persistent problem. If no one else is running into this type of problem, could there be something about my local setup that needs to be updated or changed?

@leewujung
Copy link
Member

Hmm... I did not run into the ~= problem while creating a new environment with just python=3.8 and echopype. I used conda-forge as usual. @emiliom is the env you are working with a brand new one?

@emiliom
Copy link
Collaborator

emiliom commented Sep 4, 2021

is the env you are working with a brand new one?

Yeah. I'm creating a new env

- `"longitude"`
- `"water_level"`

Only `sonar_model`s of `"EK60"`, `"EK80"`, `"EA640"`, and `"AZFP"` are currently supported.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imranmaj @emiliom : Could you please remind me why we don't allow this for AD2CP files? I seem to remember that we discussed this but can't recall what the conversation was.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thinks that's for you and @imranmaj to answer. I don't remember being in a conversation about it, and wouldn't know if there's anything specific about AD2CP files that creates a special issue. I assume there isn't, and that its absence was just inertia.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inertia! 😅

@imranmaj : do you see problems with adding AD2CP support in this? The potential "problem" I can see is that AD2CP files comes with pitch, roll and heading but not lat lon, so it is a partially filled dataset. So suppose users then want to add lat lon, the operations would become reading in the existing Platform group and add the new variables. Is this accurate?

Perhaps more generally the approach would be to:

  • check if there is duplications in variables in extra_platform_data with the existing ed.platform dataset, warn user of overwritting if that's what they select
  • add an option to allow user to specify if they want to overwrite or use existing variables
    • if allow overwriting then replace everything
  • add new variables not already exist in ed.platform

What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth, the need you've outlined is more or less identical to the situation for the OOI AZFP data. The platform group is created with some variables, but empty lat & lon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed!

@imranmaj
Copy link
Contributor Author

External data is being indexed by a dimension named "time2" now based on 9/16/21 discussion:

  • The sonar netCDF convention says that time dimensions in the Platform group should be named time1, time2, etc
  • Indexing using a new time dimension will prevent existing time dimensions (such as location_time) from being overwritten, which in turn will prevent other variables indexed by the previous values of those time dimensions from being invalidated

Comment on lines +351 to +353
# Handle data stored as a CF Trajectory Discrete Sampling Geometry
# http://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#trajectory-data
# The Saildrone sample data file follows this convention
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emiliom: reading this section you pointed out makes me wonder whether or not it is worthwhile to point out that the lat/lon data in the SONAR-netCDF4 ver.1 is consistent with the H.4.2. Single trajectory specification. Not that this necessitates any actual changes, but there seems to be values to make this explicit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think once we get around to put this function into the documentation, we should link this explicitly!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emiliom: reading this section you pointed out makes me wonder whether or not it is worthwhile to point out that the lat/lon data in the SONAR-netCDF4 ver.1 is consistent with the H.4.2. Single trajectory specification. Not that this necessitates any actual changes, but there seems to be values to make this explicit.

Late reply ... Interesting observation. I agree, the structure is consistent. Maybe the right place to add this observation is the SONAR-netCDF4 version 2 draft text. From what Gavin said, it doesn't look like there are changes in the Platform group in v.2.

Also, I think once we get around to put this function into the documentation, we should link this explicitly!

Definitely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The consistency with the CF single trajectory specification was deliberate :) We tried to use CF conventions whenever it was sensible (and did attempt to meet CF-1.7, although it wasn't feasible for all of sonar-netcdf4 v1).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gavinmacaulay thanks for taking the time to chime in! I've gone ahead and created a new issue in the SONAR-netCDF4 repo. I'll try to handle it myself, hopefully next week.

@leewujung
Copy link
Member

Thanks for the changes @imranmaj ! We can now merge this 🙂

@leewujung leewujung merged commit f9e43f1 into OSOceanAcoustics:dev Sep 27, 2021
@leewujung leewujung added this to the 0.5.4 release milestone Sep 27, 2021
leewujung added a commit that referenced this pull request Oct 1, 2021
* re-attach sonar_model to combined echodata object (#437)

* Correct duplicate ping_times during EK60 conversion (fixes #235) (#433)

* Add fix for duplicate ping_time

* Remove dead code

* Increment by 1ns instead of 1ms

* Add duplicate ping time warning and store in provenance

* Autoformatter

* Add test for duplicate ping times file

* Add qc method to remove duplicate pings

* Drop entire pings with duplicate values

* Store entire original ping_time and add attribute when there are duplicate ping_times

* Remove duplicate ping_time qc method, it will be added back in future PR

* Change EchoData combine ping_time reversal attribute to netCDF encodable value

* Remove unused import

* Update tests

* Clarify duplicate ping_time removal and drop warnings

* Add removal note to deprecation warnings (#443)

* Add removal note to deprecation warnings

* Say removal will be specifically in next release

* AD2CP conversion bug fixes (#438)

* Allow strings in AD2CP config string data record

* Add ping_time dimension to ahrs variables

* Default AHRS to empty list instead of None

* Change AHRS dims when it does not exist

* Add update_platform method to EchoData (#434)

* Add update_platform method to EchoData

* Add test for update_platform

* Use drop_vars instead of drop

* Use .data on variables

* Generalize extra_platform_data time dimension name

* Add AZFP support

* Generalized update_platform handling of CF trajectory-encoded datasets

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Default water_level to zeros instead of ones

* Add warning for overwritten variables

* Index incoming data by time2

* Add long_name for time2

* Show dropped variable warning when nans do not exist

* Reword dropped variable warning

* Rename mapping_get_multiple to mapping_search_variable

Co-authored-by: Emilio Mayorga <emiliomayorga@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Add summary statistics (#444)

* initial add of echo metrics functions

* update echo metrics to use range as data variable

* small readability edit

* added test functions to echopype.tests

* Modified syntax for consistency

* lint corrections on metrics

* added docstrings

* changed summary_statistics.py docstrings to numpy format

* Clean up tabs in code

* Clean up tabs in test code

* add metrics to run-test.py

* renamed test_metrics_summary_statistics

* small debug edits

* pre commit black edits

* moved test_metrics_summary_statistics.py to /tests/metrics/

* docstring update, add typing, change inertia to dispersion

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: stuara <stuara@uw.edu>
Co-authored-by: Landung "Don" Setiawan <landung.setiawan@gmail.com>
Co-authored-by: stuara <83487347+stuara@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* Prepare workflows for master -> main (#445)

* Add PathNotFoundError in _load_convert for open_converted (#447)

* add PathNotFoundError in _load_convert for open_converted

Co-authored-by: imranmaj <49664304+imranmaj@users.noreply.github.com>
Co-authored-by: Emilio Mayorga <emiliomayorga@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: stuara <stuara@uw.edu>
Co-authored-by: Landung "Don" Setiawan <landung.setiawan@gmail.com>
Co-authored-by: stuara <83487347+stuara@users.noreply.github.com>
Co-authored-by: Landung "Don" Setiawan <landungs@uw.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants