Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge freq array function and new frequency dictionary builder #551

Merged
merged 24 commits into from
Jun 23, 2023

Conversation

mike-w-wilson
Copy link
Contributor

@mike-w-wilson mike-w-wilson commented Jun 8, 2023

This adds the ability to merge multiple frequency arrays that are on the same table. The merge supports addition or subtraction of arrays. For subtraction, all freq arrays are subtracted from the first freq array in the list. This also adds a new function to build frequency array dictionaries from the frequency metadata.

Fix incorrect formatting
Copy link
Contributor

@jkgoodrich jkgoodrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple suggestions

gnomad/utils/annotations.py Outdated Show resolved Hide resolved
gnomad/utils/annotations.py Outdated Show resolved Hide resolved
gnomad/utils/annotations.py Outdated Show resolved Hide resolved
gnomad/utils/annotations.py Show resolved Hide resolved
gnomad/utils/annotations.py Outdated Show resolved Hide resolved
gnomad/utils/annotations.py Outdated Show resolved Hide resolved
Comment on lines 1309 to 1318
AC=hl.case()
.when(set_negatives_to_zero, hl.max(x.AC, 0))
.or_error(negative_value_error_msg),
AN=hl.case()
.when(set_negatives_to_zero, hl.max(x.AN, 0))
.or_error(negative_value_error_msg),
homozygote_count=hl.case()
.when(set_negatives_to_zero, hl.max(x.homozygote_count, 0))
.or_error(negative_value_error_msg),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

            **{
                ann: hl.case()
                       .when(set_negatives_to_zero, hl.max(x[ann], 0))
                       .or_error(negative_value_error_msg)
                for ann in callstat_ann
            }

gnomad/utils/annotations.py Show resolved Hide resolved
gnomad/utils/release.py Show resolved Hide resolved
gnomad/utils/release.py Show resolved Hide resolved
mike-w-wilson and others added 2 commits June 21, 2023 09:01
Co-authored-by: jkgoodrich <33063077+jkgoodrich@users.noreply.github.com>
@@ -1106,7 +1108,7 @@ def region_flag_expr(
:return: `region_flag` struct row annotation
"""
prob_flags_expr = (
{"non_par": (t.locus.in_x_nonpar() | t.locus.in_y_nonpar())} if non_par else {}
{"non_par": t.locus.in_x_nonpar() | t.locus.in_y_nonpar()} if non_par else {}
Copy link
Contributor Author

@mike-w-wilson mike-w-wilson Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how to stop black from doing this...this format "correction" will not work with hail

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh really? I didn't realize hail doesn't like that. Maybe add # fmt: skip to the end of it?

@mike-w-wilson
Copy link
Contributor Author

Back to you @jkgoodrich !

@mike-w-wilson mike-w-wilson requested a review from jkgoodrich June 23, 2023 14:01
Copy link
Contributor

@jkgoodrich jkgoodrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more things

gnomad/utils/annotations.py Show resolved Hide resolved
@@ -1106,7 +1108,7 @@ def region_flag_expr(
:return: `region_flag` struct row annotation
"""
prob_flags_expr = (
{"non_par": (t.locus.in_x_nonpar() | t.locus.in_y_nonpar())} if non_par else {}
{"non_par": t.locus.in_x_nonpar() | t.locus.in_y_nonpar()} if non_par else {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh really? I didn't realize hail doesn't like that. Maybe add # fmt: skip to the end of it?

gnomad/utils/annotations.py Outdated Show resolved Hide resolved
:param fmeta: List of frequency metadata for arrays being merged.
:param operation: Merge operation to perform. Options are "sum" and "diff". If "diff" is passed, the first freq array in the list will have the other arrays subtracted from it.
:param set_negatives_to_zero: If True, set negative array values to 0 for AC, AN, AF, and homozygote_count. If False, raise a ValueError. Default is True.
:return: Tuple of merged frequency array and its freq index dictionary.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the freq meta list no instead of freq index dictionary right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😶‍🌫️

gnomad/utils/release.py Outdated Show resolved Hide resolved
gnomad/utils/release.py Outdated Show resolved Hide resolved
gnomad/utils/release.py Show resolved Hide resolved
gnomad/utils/release.py Outdated Show resolved Hide resolved
@@ -1106,7 +1108,8 @@ def region_flag_expr(
:return: `region_flag` struct row annotation
"""
prob_flags_expr = (
{"non_par": (t.locus.in_x_nonpar() | t.locus.in_y_nonpar())} if non_par else {}
{"non_par": (t.locus.in_x_nonpar() | t.locus.in_y_nonpar())
} if non_par else {} # fmt: skip
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha the extra note made the line long enough that black wants parentheses there now...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh boy. I'm surprised it was even removing the parentheses, mine doesn't do that

@mike-w-wilson mike-w-wilson requested a review from jkgoodrich June 23, 2023 18:16
Copy link
Contributor

@jkgoodrich jkgoodrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@@ -1106,7 +1108,8 @@ def region_flag_expr(
:return: `region_flag` struct row annotation
"""
prob_flags_expr = (
{"non_par": (t.locus.in_x_nonpar() | t.locus.in_y_nonpar())} if non_par else {}
{"non_par": (t.locus.in_x_nonpar() | t.locus.in_y_nonpar())
} if non_par else {} # fmt: skip
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh boy. I'm surprised it was even removing the parentheses, mine doesn't do that

@mike-w-wilson mike-w-wilson merged commit f9c2f0f into main Jun 23, 2023
@mike-w-wilson mike-w-wilson deleted the mw/merge_freq_arrays branch April 3, 2024 19:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants