The Tuva Provider project combines and transforms messy public provider datasets into usable data. This project contains the transformations we use to create the clean datasets for users of the Tuva Project. We have made this project public to share our methodology and code.
You can easily load the cleaned provider data into your data warehouse by using the terminology seeds from The Tuva Project package.
Source data dependencies:
| Data Set | Updated by Source | Source |
|---|---|---|
| NPPES Data Dissemination | Monthly | https://download.cms.gov/nppes/NPI_Files.html |
| NUCC Health Care Provider Taxonomy | Semi-annually (January and July) | https://nucc.org/index.php/code-sets-mainmenu-41/provider-taxonomy-mainmenu-40/csv-mainmenu-57 |
| CMS Medicare Provider and Supplier Taxonomy Crosswalk | Annually | https://data.cms.gov/provider-characteristics/medicare-provider-supplier-enrollment/medicare-provider-and-supplier-taxonomy-crosswalk |
- This project is designed to run on Snowflake.
- You have dbt installed and configured (i.e., connected to your data warehouse). If you have not installed dbt, see the dbt documentation for installation instructions.
- You have created a database in your data warehouse for the output of this project.
- You have downloaded the source data and loaded it into staging tables in your data warehouse:
- NPPES NPI Data (Note: the source download comes zipped with many files; only the "npidata_pfile....csv" file is required.)
- NUCC Health Care Provider Taxonomy
- CMS Medicare Provider and Supplier Taxonomy Crosswalk
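As a sketch, staging the NPPES file in Snowflake might look like the following. The stage name (`@nppes_stage`), file name, and column list are illustrative placeholders, not part of this project; the real npidata file has several hundred columns.

```sql
-- Illustrative sketch only: stage name, file name, and columns are placeholders.
create schema if not exists nppes.raw_data;

create or replace table nppes.raw_data.nppes (
    npi varchar,
    entity_type_code varchar
    -- ... remaining NPPES columns ...
);

copy into nppes.raw_data.nppes
  from @nppes_stage/npidata_pfile.csv
  file_format = (type = csv skip_header = 1 field_optionally_enclosed_by = '"');
```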
Complete the following steps to configure the project to run in your environment.
- Clone this repo to your local machine or environment.
- Update the `dbt_project.yml` file:
  - Add the dbt profile connected to your data warehouse.
  - Update the variable `provider_database` to use the new database you created for this project (default is "nppes").
- Update the `models/_sources.yml` file:
  - Update the database where your source data has been loaded (default is "nppes").
  - Update the schema where your source data has been loaded (default is "raw_data").
  - If the source tables are named differently, add the table `identifier` property.
- Run `dbt build`.
- For Tuva Terminology seeds, we export this data as CSV and then load it to the Tuva Public Resources bucket in Amazon S3.
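As a sketch, the relevant `dbt_project.yml` and `models/_sources.yml` entries might look like the following. The profile name, source name, and table identifier are placeholders; the database and schema values shown are the defaults described above.

```yaml
# dbt_project.yml (excerpt) -- profile name is a placeholder
profile: your_profile_name
vars:
  provider_database: nppes   # database you created for this project's output
```

```yaml
# models/_sources.yml (excerpt) -- source/table names are placeholders
sources:
  - name: nppes
    database: nppes      # database where your source data was loaded
    schema: raw_data     # schema where your source data was loaded
    tables:
      - name: nppes
        identifier: my_npidata_table   # only needed if your table is named differently
```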
Here are some SQL examples for exporting the data from Snowflake:
- Standard Provider seed export:

```sql
copy into --YOUR_S3_URL.../provider.csv
  from NPPES.CLAIMS_DATA_MODEL.PROVIDER
  file_format = (type = csv field_optionally_enclosed_by = '"')
  storage_integration = --YOUR_INTEGRATION
  overwrite = true;
```

- Compressed Provider seed export:

```sql
copy into --YOUR_S3_URL.../provider_compressed.csv.gz
  from NPPES.CLAIMS_DATA_MODEL.PROVIDER
  file_format = (type = csv field_optionally_enclosed_by = '"' compression = gzip)
  header = true
  max_file_size = 4900000000
  overwrite = true
  single = true
  storage_integration = --YOUR_INTEGRATION;
```

- Standard Other Provider Taxonomy seed export:

```sql
copy into --YOUR_S3_URL.../other_provider_taxonomy.csv
  from NPPES.CLAIMS_DATA_MODEL.OTHER_PROVIDER_TAXONOMY
  file_format = (type = csv field_optionally_enclosed_by = '"')
  storage_integration = --YOUR_INTEGRATION
  overwrite = true;
```
The Tuva Project team maintains only the latest version of this project. We highly recommend you stay current with the latest version.
Have an opinion on the mappings? Notice any bugs when installing and running the project? If so, we highly encourage and welcome feedback! While we work on a formal process in GitHub, you can easily reach us on our Slack community.
Join our growing community of healthcare data practitioners on Slack!