Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Simplify GBD metadata with Jinja macros #3917

Merged
merged 11 commits into from
Feb 14, 2025
Merged

✨ Simplify GBD metadata with Jinja macros #3917

merged 11 commits into from
Feb 14, 2025

Conversation

Marigold
Copy link
Collaborator

@Marigold Marigold commented Feb 2, 2025

Motivation

GBD metadata is currently non-manageable. Even simple fix such as adding exception to cause ballooned into this monster. We need a way how to DRY it.

Solution

Add support for shared.meta.yml where user can add definitions: or macros: and use them across other YAML files.

Use macros instead of raw jinja. Macros are very similar to python functions and can be parametrized, which allowed me to DRY a lot of duplicated code. I used o1 and o3 heavily when refactoring & DRYing those functions. There's certainly more space for a cleanup, but I didn't have enough time to polish it.

Alternatives

At first, I thought that registering custom python functions would be the best solution, but it was overly complex and didn't really solve the problem. One of the issues was that python functions would have to be defined in the grapher step, while YAML metadata would live in a garden step.

GBD metadata

Besides DRYing metadata, I fixed a couple of whitespace issues.

@owidbot
Copy link
Contributor

owidbot commented Feb 2, 2025

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs

Login: ssh owid@staging-site-jinja-macros

chart-diff: ✅ No charts for review.
data-diff: ✅ No differences found
Legend: +New  ~Modified  -Removed  =Identical  Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet

Automatically updated datasets matching weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included

Edited: 2025-02-02 08:18:37 UTC
Execution time: 16.13 seconds

@Marigold Marigold marked this pull request as ready for review February 2, 2025 15:57
@Marigold Marigold requested a review from spoonerf February 2, 2025 15:57
@lucasrodes
Copy link
Member

we should make sure that this doesn't lead to the behavior here.

@lucasrodes
Copy link
Member

lucasrodes commented Feb 6, 2025

I just realized in this data page that there is a mysterious whitespace:

The estimated number of current cases of HIV/AIDS⎵, per 100 people.

image

I am not sure what is the source for this; I tried looking into etl/steps/data/garden/ihme_gbd/2024-05-20/gbd_prevalence.meta.yml but couldn't pinpoint a particular line.

Unsure if related, but when using ifs, I generally try to avoid unexpected linebreaks with the structure:

<% if ... %>
...
<%- elif ... %>
...
<%- else %>
...
<%- endif %>

@spoonerf spoonerf mentioned this pull request Feb 13, 2025
Copy link
Contributor

@spoonerf spoonerf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks awesome, thanks Mojmir! There is a still a whitespace issue with HIV/AIDs that Lucas noticed, which I've had a go at getting rid of in a separate PR. but I think it requires a sacrifice to the Jinja gods 😅

Strangely it looks okay in the playground notebook, but will have more of a play tomorrow.

@Marigold Marigold merged commit be655c2 into master Feb 14, 2025
9 checks passed
@Marigold Marigold deleted the jinja-macros branch February 14, 2025 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants