Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Send a weekly instance status report (resolves #1509) #1683

Merged
merged 11 commits into from
Sep 23, 2020
Merged

Conversation

Twixes
Copy link
Member

@Twixes Twixes commented Sep 18, 2020

Changes

This adds a weekly (sent to us each Monday) status report regarding each PostHog instance as a whole, including event ingestion metadata.

@timgl timgl temporarily deployed to posthog-ingestion-data-ipy2xb0 September 18, 2020 15:05 Inactive
@Twixes Twixes requested a review from paolodamico September 18, 2020 15:50
@Twixes Twixes changed the title Send a weekly instance status report (closes #1509) Send a weekly instance status report (resolves #1509) Sep 18, 2020
@Twixes Twixes linked an issue Sep 18, 2020 that may be closed by this pull request
@timgl timgl had a problem deploying to posthog-ingestion-data-gv54iwo September 18, 2020 20:36 Failure
@Twixes Twixes temporarily deployed to posthog-ingestion-data-gv54iwo September 18, 2020 20:42 Inactive
Copy link
Contributor

@paolodamico paolodamico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great start! Added a few comments, but feel free to adjust what you think makes the most sense. We might also be better off shipping an initial version and iterating from there.

posthog/utils.py Outdated Show resolved Hide resolved
posthog/tasks/status_report.py Outdated Show resolved Hide resolved
@@ -48,6 +47,9 @@ def setup_periodic_tasks(sender, **kwargs):
sender.add_periodic_task(
crontab(day_of_week="mon,fri"), update_event_partitions.s(), # check twice a week
)
sender.add_periodic_task(
10, status_report.s(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would send the report every 10 seconds, perhaps a cron schedule here? Also, what should we do with retries or if the server is down during the scheduled time (e.g. Heroku free dynos). Ideally we would also have a failsafe on PostHog side to prevent duplicate entries for the same week.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the above, we might have the edge case of a user having multiple instances or workers, so event duplication prevention might become more relevant (doing reports on a team-basis would also help here).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry, this is a leftover from testing, supposed to be crontab(day_of_week="mon"). :)

posthog/tasks/status_report.py Show resolved Hide resolved
@timgl timgl temporarily deployed to posthog-ingestion-data-zxakkor September 20, 2020 22:23 Inactive
@Twixes Twixes temporarily deployed to posthog-ingestion-data-zxakkor September 20, 2020 22:26 Inactive
@Twixes Twixes temporarily deployed to posthog-ingestion-data-zxakkor September 20, 2020 22:30 Inactive
@Twixes
Copy link
Member Author

Twixes commented Sep 20, 2020

A-ha, there is one odd issue: posthoganalytics.capture() doesn't want to send the data to PostHog, without raising an exception. Not sure why that's the case.

@timgl timgl temporarily deployed to posthog-ingestion-data-zxakkor September 21, 2020 05:14 Inactive
Copy link
Contributor

@paolodamico paolodamico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! As a final thing maybe we should add a test for the status_report() method?

posthog/tasks/status_report.py Show resolved Hide resolved
posthog/tasks/status_report.py Show resolved Hide resolved
posthog/tasks/status_report.py Outdated Show resolved Hide resolved
@Twixes Twixes temporarily deployed to posthog-ingestion-data-zxakkor September 21, 2020 13:45 Inactive
@Twixes Twixes temporarily deployed to posthog-ingestion-data-zxakkor September 22, 2020 15:26 Inactive
@Twixes Twixes temporarily deployed to posthog-ingestion-data-zxakkor September 23, 2020 01:31 Inactive
@Twixes
Copy link
Member Author

Twixes commented Sep 23, 2020

This now reports correctly to PostHog. Also added the count of persons active in the period.

Part of report

.

@Twixes Twixes requested a review from paolodamico September 23, 2020 01:33
@Twixes Twixes merged commit 8ae86c6 into master Sep 23, 2020
@Twixes Twixes deleted the ingestion-data branch September 23, 2020 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Capture metadata on ingested events @ self-hosted
3 participants