Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify default_compute_info with the option to use the AS_ annotations in gvcf_info for allele specific aggregations #560

Merged
merged 10 commits into from
Jul 27, 2023

Conversation

jkgoodrich
Copy link
Contributor

No description provided.

Copy link
Contributor

@mike-w-wilson mike-w-wilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple questions for my knowledge and a few suggestions/questions

Comment on lines 30 to 33
AS_INFO_SUM_AGG_FIELDS = ["AS_QUALapprox", "AS_RAW_MQ"]
AS_INFO_INT32_SUM_AGG_FIELDS = ["AS_VarDP"]
AS_INFO_MEDIAN_AGG_FIELDS = ["AS_RAW_ReadPosRankSum", "AS_RAW_MQRankSum"]
AS_INFO_ARRAY_SUM_AGG_FIELDS = ["AS_SB_TABLE"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an idea, should we nest these in a dict called AS_INFO_FIELDS or something -- maybe a better name than that? I'm thinking of the utility of this module and using tab when writing code to see available variables and functions within this module. Theres just a lot of high level objects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I change the non AS ones to that also?

gnomad/utils/sparse_mt.py Outdated Show resolved Hide resolved
gnomad/utils/sparse_mt.py Show resolved Hide resolved
gnomad/utils/sparse_mt.py Outdated Show resolved Hide resolved
gnomad/utils/sparse_mt.py Outdated Show resolved Hide resolved
# decimal place followed by a comma and the corresponding count for
# that value, so we want to sum the rank sum value (first element).
# Rename annotation in the form 'AS_RAW_*_RankSum' to 'AS_*_RankSum'.
if k.startswith("AS_RAW_") and k.endswith("RankSum"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on making a drop_raw parameter that accepts a list of fields? I know this bit of code is really specific to these fields so no strong preference here and I dont even know if any other fields would ever apply given the above comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I want to keep it as is since there is a lot in this code that is very specific and that is the way Laura defines those annotations here: https://docs.google.com/document/d/1ZdRERavtO1k4v0ppVzJ7gu6PHzSydznhYDmvZUb8qPo/edit#heading=h.gyrw4ywmz8b3

gnomad/utils/sparse_mt.py Show resolved Hide resolved
gnomad/utils/sparse_mt.py Show resolved Hide resolved
Comment on lines +423 to +425
if "AS_SB_TABLE" in info or "AS_SB" in info:
# Rename AS_SB to AS_SB_TABLE if present and add SB Ax2 aggregation logic.
if "AS_SB" in agg_expr:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this will happen but do we always want to overwrite AS_SB_TABLE with AS_SB when both are present? Should we have logic that throws a warning if this happens? Or is that the intended use?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, good point, I don't think it will happen, but I added a warning

gnomad/utils/sparse_mt.py Outdated Show resolved Hide resolved
Copy link
Contributor

@mike-w-wilson mike-w-wilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jkgoodrich jkgoodrich merged commit 24c069e into main Jul 27, 2023
@jkgoodrich jkgoodrich deleted the jg/change_compute_info_to_use_AS_ann branch July 27, 2023 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants