-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modify default_compute_info
with the option to use the AS_
annotations in gvcf_info for allele specific aggregations
#560
Conversation
… into jg/change_compute_info_to_use_AS_ann # Conflicts: # gnomad/utils/annotations.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple questions for my knowledge and a few suggestions/questions
gnomad/utils/sparse_mt.py
Outdated
AS_INFO_SUM_AGG_FIELDS = ["AS_QUALapprox", "AS_RAW_MQ"] | ||
AS_INFO_INT32_SUM_AGG_FIELDS = ["AS_VarDP"] | ||
AS_INFO_MEDIAN_AGG_FIELDS = ["AS_RAW_ReadPosRankSum", "AS_RAW_MQRankSum"] | ||
AS_INFO_ARRAY_SUM_AGG_FIELDS = ["AS_SB_TABLE"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just an idea, should we nest these in a dict called AS_INFO_FIELDS or something -- maybe a better name than that? I'm thinking of the utility of this module and using tab when writing code to see available variables and functions within this module. Theres just a lot of high level objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I change the non AS ones to that also?
# decimal place followed by a comma and the corresponding count for | ||
# that value, so we want to sum the rank sum value (first element). | ||
# Rename annotation in the form 'AS_RAW_*_RankSum' to 'AS_*_RankSum'. | ||
if k.startswith("AS_RAW_") and k.endswith("RankSum"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on making a drop_raw
parameter that accepts a list of fields? I know this bit of code is really specific to these fields so no strong preference here and I dont even know if any other fields would ever apply given the above comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I want to keep it as is since there is a lot in this code that is very specific and that is the way Laura defines those annotations here: https://docs.google.com/document/d/1ZdRERavtO1k4v0ppVzJ7gu6PHzSydznhYDmvZUb8qPo/edit#heading=h.gyrw4ywmz8b3
if "AS_SB_TABLE" in info or "AS_SB" in info: | ||
# Rename AS_SB to AS_SB_TABLE if present and add SB Ax2 aggregation logic. | ||
if "AS_SB" in agg_expr: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this will happen but do we always want to overwrite AS_SB_TABLE with AS_SB when both are present? Should we have logic that throws a warning if this happens? Or is that the intended use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, good point, I don't think it will happen, but I added a warning
Co-authored-by: Mike Wilson <mwilson@broadinstitute.org>
….com/broadinstitute/gnomad_methods into jg/change_compute_info_to_use_AS_ann
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
No description provided.