-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Auto filter test accounts #3492
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a look at the general python code & front-end, did some adjustments and otherwise lgtm. It would be great if someone with strong CH expertise take a look at this (maybe @macobo ?)
…g into auto-filter-test-accounts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some notes, mostly focusing on the BE code. I'm confident this works but code needs a bit of :love:
@@ -21,6 +21,7 @@ def parse_prop_clauses( | |||
prepend: str = "global", | |||
table_name: str = "", | |||
allow_denormalized_props: bool = False, | |||
filter_test_accounts=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this introduces some cognitive debt. Basically it's really hard to figure out when this argument should be true vs false.
I think the right level to solve this at is the filter
object.
All insight filters should overload def properties
to do this same extend as needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm not sure I follow. Are you saying I should just change the name of the property to something else (should_filter_test_accounts
maybe?). I don't find it too confusing, if filter_test_accounts=True you'd expect it to filter out test accounts and vice versa.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant it's not obvious when this argument should be passed vs when not. It results in one additional thing to consider in every codepath.
Given we know whether to filter_test_accounts
or not in the filter object, a better solution (which hides this toggle from 90% of code) is to do it in the filter object in def properties()
as proposed above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm yeah I think you are right, I though there were a few paths were we explicitly didn't want to set filter_test_accounts even if the main filter does have it.
I don't think I understand your solution, do you mind elaborating? What do you mean with overload def properties()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
So a call to parse_prop_clauses
looks something like this:
query, params = parse_prop_clauses(filter.properties, ...)
The idea is to make filter.properties
dependent on the filter_test_accounts parameter. This can be done something like this in FilterTestAccountsMixin:
@cached_property
def properties(self) -> List[Property]:
properties = super()
if self.filter_test_accounts:
properties.extend(...)
return properties
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm that feels pretty messy, as you won't know what you are passing through. Should we just pass through the filter object separately and then check if filter_test_accounts is true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would work as well, though not sure whether all callsites have access to a filter object. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we come full circle that the original solution is the best solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understood your proposal to be to pass filter
object to parse_prop_clauses.
But we're going very long back-and-forth on this. If you're sold on the current approach, then let's go with that and refactor later as needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sold on it but I also don't think the other proposals are much better. I think fine to leave for now, there's good tests so refactoring should be easy.
frontend/src/scenes/insights/InsightTabs/FunnelTab/FunnelTab.tsx
Outdated
Show resolved
Hide resolved
@@ -14,7 +16,28 @@ | |||
|
|||
|
|||
class TeamManager(models.Manager): | |||
def set_test_account_filters(self, organization: Optional[Any]) -> List: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find any tests for this.
Suggestion: extract this to a module together with GenericEmail instead of putting it on the model + add tests to the module. It keeps utils.py containing utilities + keeps the model file focused, it's becoming a big ball of sphaghetti.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests are here. Not sure I agree extracting this makes sense, the only time you use this is when you create a Team so I'd expect to find it in the team file.
@@ -631,8 +631,10 @@ class GenericEmails: | |||
|
|||
def __init__(self): | |||
with open(get_absolute_path("helpers/generic_emails.txt"), "r") as f: | |||
dd = [x.rstrip() for x in f] | |||
self.emails = dd | |||
self.emails = {x.rstrip(): True for x in f} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just nitpicking since performance is basically the same, but considering the intentions set
(with set comprehension: {domain.rstrip() for domain in f}
would be a bit clearer.
Changes
Adds a little checkbox to each query (default to true but won't apply to any historic queries or dashboards) that allows you to easily filter out test accounts and team members.
The migration (and whenever you create a new team) will automatically add the following filters:
It'll detect non-generic email addresses and filter those out, plus a few localhost ones (I'm a bit more doubtful about that.)
Checklist