An open-source workflow for harmonizing health facility data from multiple sources.
If you find this project valuable, star this repository to support the work and help others discover it!
At Unpatterned AI Labs, we are building an open-source global unified data platform to provide high-quality, accessible, and harmonized datasets for researchers, policymakers, and organizations.
Health systems worldwide struggle with fragmented, inconsistent, and hard-to-use datasets, limiting their ability to plan, allocate resources, and improve healthcare delivery. While initiatives like WHO's Geolocated Health Facilities Data (GHFD) project aim to close this gap, many countries still lack clean, standardized, and easily integrable datasets. We aim to change that.
We are curating and harmonizing health facility data from multiple sources, including:
- Healthsites.io
- Overture Maps
- OpenStreetMap (OSM)
- Humanitarian Data Exchange (HDX)
Rather than collecting new data, our focus is on aggregating, standardizing, and simplifying access to existing datasets using Python and R. Our libraries will enable developers, researchers, and organizations to seamlessly integrate health data into their workflows.
Despite the abundance of open health data, key challenges remain:
- Fragmented & Scattered Data: Health facility data exists across multiple platforms with no unified access point.
- Lack of Structured & Usable Data: Many datasets are difficult to integrate due to inconsistent formats (Excel, CSV, JSON).
- Data Accessibility Barriers: There are no standardized, developer-friendly tools for working with these datasets efficiently.
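To make the format problem concrete, here is a minimal sketch of mapping records from a CSV source and a JSON source onto one shared schema. Everything specific in it is an assumption for illustration: the `COMMON_FIELDS` schema, the `FIELD_MAPS` column names, the `source_a`/`source_b` labels, and the sample rows are made up and do not reflect the project's actual schema or any real dataset.

```python
import csv
import io
import json

# Hypothetical target schema; the project's real schema is not specified here.
COMMON_FIELDS = ("name", "latitude", "longitude", "source")

# Hypothetical per-source column names mapped onto the common schema.
FIELD_MAPS = {
    "source_a": {"facility_name": "name", "lat": "latitude", "lon": "longitude"},
    "source_b": {"name": "name", "y": "latitude", "x": "longitude"},
}


def harmonize(record, source):
    """Rename source-specific keys to the common schema and tag the source."""
    out = {field: None for field in COMMON_FIELDS}
    for src_key, common_key in FIELD_MAPS[source].items():
        if src_key in record:
            out[common_key] = record[src_key]
    # CSV values arrive as strings, so coordinates are converted to floats.
    for key in ("latitude", "longitude"):
        out[key] = float(out[key]) if out[key] not in (None, "") else None
    out["source"] = source
    return out


# Made-up sample inputs in two different formats.
csv_text = "facility_name,lat,lon\nCentral Clinic,6.5244,3.3792\n"
json_text = '[{"name": "North Hospital", "y": 9.0765, "x": 7.3986}]'

rows = [harmonize(r, "source_a") for r in csv.DictReader(io.StringIO(csv_text))]
rows += [harmonize(r, "source_b") for r in json.loads(json_text)]
```

After this runs, every entry in `rows` has the same keys regardless of where it came from. Excel input would need a third-party reader and is left out of the sketch.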
We are building open-source Python & R libraries that:
- Provide harmonized health facility data from multiple sources.
- Enable seamless access for research, analytics, and AI applications.
- Ensure data presence verification (not validation).
- Support privacy-aware usage based on data-sharing policies.
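One possible reading of "presence verification (not validation)" is checking that expected fields exist and are non-empty, without judging whether the values are correct. The sketch below follows that reading, which is an assumption; `REQUIRED_FIELDS`, `missing_fields`, and `presence_report` are hypothetical names, not part of any published library.

```python
# Hypothetical list of fields that every facility record is expected to carry.
REQUIRED_FIELDS = ("name", "latitude", "longitude")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in one record."""
    missing = []
    for field in required:
        value = record.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


def presence_report(records, required=REQUIRED_FIELDS):
    """Count, per field, how many records are missing that field."""
    counts = {field: 0 for field in required}
    for record in records:
        for field in missing_fields(record, required):
            counts[field] += 1
    return counts


sample = [
    {"name": "Central Clinic", "latitude": 999.0, "longitude": 3.3792},
    {"name": " ", "latitude": 9.0765},
]
report = presence_report(sample)
```

Note that the first sample record passes even though a latitude of 999.0 is not a real coordinate: the check only reports whether a value is present, not whether it is plausible.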
By combining open data, AI-driven insights, and ethical governance, we're making health data more accessible and actionable.
- Empowers researchers & policymakers with structured health data.
- Reduces fragmentation & improves accessibility of global health facility data.
- Encourages collaboration between health organizations and data providers.
- Sets the foundation for AI-driven insights & decision-making.
We'd love for you to be a part of this! Here's how you can help:
Know of any useful health facility datasets? Drop a link in the issues section or create a PR! We are especially interested in global sources beyond the US, UK, and Nigeria.
- Fork this repository.
- Clone the repo and create a new branch.
- Make improvements, fix bugs, or add new datasets.
- Submit a pull request for review!
Have suggestions or feedback? Open an issue or start a discussion. Let's build this together!
- Storing data in S3 or other cloud storage.
- Expanding data coverage to more countries.
- Developing Python / R libraries and API access.
- Exploring AI-based data enrichment.
- Building semantic data layers.
- Integrating with Placekey.