Fix bug in process_consequences that was introduced when adding support for VEP without polyphen #710
I had just a couple of comments on the readability of this, since we are focusing on the module now. I also realized process_consequences produces a lot of data since we keep the entire struct for all the worst...by...s. Not for this PR, but I feel we could add an option to trim this down so that, say, worst_csq_by_gene only returns a dict of gene symbol and csq term. I'll make a ticket for the backlog if this is something you agree could be useful?
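The trimmed-down output suggested here could look roughly like the plain-Python sketch below. It is only an illustration, not Hail code and not the module's API: the function name worst_csq_term_by_gene, the shortened CSQ_ORDER list, and the input field names gene_symbol and most_severe_consequence are all assumptions made for the example.

```python
# Hypothetical, shortened severity order (most severe first); the real order is longer.
CSQ_ORDER = ["stop_gained", "missense_variant", "synonymous_variant"]


def worst_csq_term_by_gene(transcript_consequences, csq_order=CSQ_ORDER):
    """Return {gene symbol: most severe csq term} instead of full structs."""
    rank = {term: i for i, term in enumerate(csq_order)}
    worst = {}
    for tc in transcript_consequences:
        gene = tc["gene_symbol"]
        term = tc["most_severe_consequence"]
        # Keep the term with the lowest rank, i.e. the most severe one seen so far.
        if gene not in worst or rank[term] < rank[worst[gene]]:
            worst[gene] = term
    return worst


example = [
    {"gene_symbol": "GENE1", "most_severe_consequence": "missense_variant"},
    {"gene_symbol": "GENE1", "most_severe_consequence": "stop_gained"},
    {"gene_symbol": "GENE2", "most_severe_consequence": "synonymous_variant"},
]
trimmed = worst_csq_term_by_gene(example)
```

Returning only the term per gene keeps the output small, at the cost of dropping the other fields of the worst consequence struct.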
.when((tc.lof == "HC") & hl.or_else(tc.lof_flags == "", True), no_flag)
.when((tc.lof == "HC") & (tc.lof_flags != ""), flag)
I know this is unchanged from before, but since no_flag and flag are only used once and you can't pass a value for flag, I think it's more difficult to read with the variables, because you need to go back up to the assignment. I'd just do 500, 500 / (1 + penalize_flags), if you agree with the value adjustment above.
Just going to leave this as is for now, since my next PR will completely remove the use of the scores.
flag = 500
no_flag = flag * (1 + penalize_flags)
Also, I know this is from before, but since we're in the process of revamping this module, I think the language around penalize_flags, and how we handle the logic in relation to those flags, is counterintuitive, i.e. the penalize_flags logic is actually a no_flag booster. These values seem arbitrary, and I can't imagine anyone is actually using these score values themselves since LOF is deducted so far down. It would be clearer if the scores were no_flag = 500, flag = 500 / (1 + penalize_flags).
Yeah, I'm actually completely removing the scores in my next PR
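For reference, the two parameterizations discussed in this thread can be compared with a small plain-Python sketch. The function names current_scores and proposed_scores are made up for the comparison; only the arithmetic is taken from the diff and the review comment.

```python
def current_scores(penalize_flags: bool):
    """Values as in the diff: the unflagged score is boosted."""
    flag = 500
    no_flag = flag * (1 + penalize_flags)
    return no_flag, flag


def proposed_scores(penalize_flags: bool):
    """Values as suggested in review: the flagged score is reduced."""
    no_flag = 500
    flag = 500 / (1 + penalize_flags)
    return no_flag, flag


# With penalize_flags=True: current gives (1000, 500), proposed gives (500, 250.0).
# With penalize_flags=False: both give 500 for no_flag and for flag.
```

In both versions the unflagged value is (1 + penalize_flags) times the flagged value; what changes is which of the two stays fixed at 500.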
Co-authored-by: Mike Wilson <mwilson@broadinstitute.org>
Yeah, my next PR completely changes this function, but it will take more time to review, so just getting this in with minimal changes to fix the bug. These changes were my first round of changes to fix the function before I got completely confused by this and other functions in vep.py and completely rewrote it.
Thank you!
This adds the following fixes to the current process_consequences:
- modifier = _csq_score(tc) is redundant, since it's already handled elsewhere.
- Missing .lof and .lof_flags values are not handled correctly:
  - If .lof or .lof_flags is missing, flag_condition will be missing, and therefore csq_score will be missing.
  - If (tc.lof == "HC") is False, then flag_condition evaluates to False, so the no_flag_score will be subtracted from the modifier. This should only happen if (tc.lof == "HC") is True.
The updated function also has the following fixes to the original code (before the support for no polyphen was added):
- csq_order is now passed to add_most_severe_consequence_to_consequence (in the default case this wouldn't have caused an issue).
Here are some tests showing the comparison of the original code, the current code, and the code in this PR:
fixes_to_process_consequences_small.html.zip
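The intended handling of missing .lof and .lof_flags values, as expressed by the two .when(...) branches in the diff, can be sketched in plain Python with None standing in for a missing value. This is an analogy rather than Hail code: the function name lof_adjustment, the fixed PENALIZE_FLAGS setting, and the return value of 0 when neither branch matches are assumptions made for illustration.

```python
from typing import Optional

FLAG = 500             # from the diff: flag = 500
PENALIZE_FLAGS = True  # assumed setting for this illustration


def lof_adjustment(lof: Optional[str], lof_flags: Optional[str]) -> int:
    """Score adjustment from the two HC branches; None models a missing value."""
    no_flag = FLAG * (1 + PENALIZE_FLAGS)
    # Only high-confidence LoF is adjusted; a missing lof (None) is not "HC".
    if lof != "HC":
        return 0  # assumption: these two branches contribute nothing otherwise
    # Mirrors hl.or_else(tc.lof_flags == "", True): missing flags count as "no flags".
    if lof_flags is None or lof_flags == "":
        return no_flag
    return FLAG
```

With this shape, a missing lof_flags never turns the whole score into a missing value, and the no_flag adjustment is applied only when lof is "HC".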