Skip to content

This project focuses on analyzing COVID-19 mortality data in Chile from 2020 to 2024. With at least 60,000 records, the dataset provides critical insights into the impact of the pandemic on the population.

Notifications You must be signed in to change notification settings

CamilaJaviera91/Chile-COVID-Mortality-2020-2024

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

🦠 Chile COVID-19 Mortality Analysis (2020-2024)

πŸ“ƒ Project Description

  • Objective:

    • Analyze COVID-19 mortality data in Chile from 2020 to 2024, providing critical insights into the pandemic's impact on the population.
  • Project Division:

    • First Part: Work with Google Sheets for initial data exploration and analysis.
    • Second Part: Use PostgreSQL for advanced analysis and data storage.
  • Methodology:

    • Google Sheets:

      • Use Python for data processing and analysis.
      • Collaboratively explore data using Google Sheets.
    • PostgreSQL:

      • Migrate data to a PostgreSQL database for more detailed analysis and advanced queries.
  • Key Objectives:

    • Data Cleaning and Organization: Prepare the mortality dataset for meaningful analysis.
    • Exploration of Trends and Patterns: Investigate mortality rates by region, age, and time.
    • Data Visualization: Create effective visualizations to communicate key findings.

πŸ“‹ Google Sheetsgit chec

The project leverages Python for data processing and analysis and Google Sheets for collaborative data exploration.

🎯 Key objectives

  • Cleaning and organizing the mortality dataset for meaningful analysis.
  • Exploring trends and patterns, such as mortality rates by region, age, or time.
  • Visualizing data to communicate insights effectively.
  • Building reproducible workflows for data analysis and reporting.

πŸ—„οΈ Data Dictionary

NΒΊ Variable Name Description Data Type Values
1 AΓ‘O Year of the death date Number N/A
2 FECHA_DEF Date of the death date Date N/A
3 SEXO Gloss identifying biological sex Number 1: Male, 2: Female, 9: Other
4 EDAD_TIPO Unit of age measurement Number 1: Years, 2: Months, 3: Days
5 EDAD_CANT Numerical record of the patient's age at admission Number N/A
6 COD_COMUNA Code of the commune of residence of the deceased, according to the political-administrative division updated in 2019 Number N/A
7 COMUNA Gloss of the commune of residence of the deceased, according to the political-administrative division updated in 2019 Text N/A
8 NOMBRE_REGION Gloss of the region of residence Text N/A
9 DIAG1 Basic cause of death Text N/A
10 CAPITULO_DIAG1 ICD-10 chapter according to the cause of death Text N/A
11 GLOSA_CAPITULO_DIAG1 Gloss of the ICD-10 chapter according to the cause of death Text N/A
12 CODIGO_GRUPO_DIAG1 Group code of the cause of death according to ICD-10 Text N/A
13 GLOSA_GRUPO_DIAG1 Gloss of the group code of the cause of death according to ICD-10 Text N/A
14 CODIGO_CATEGORIA_DIAG1 Category code of the cause of death according to ICD-10 Text N/A
15 GLOSA_CATEGORIA_DIAG1 Gloss code of the cause of death according to ICD-10 Text N/A
16 CODIGO_SUBCATEGORIA_DIAG1 Category code of the cause of death according to ICD-10 Text N/A
17 GLOSA_SUBCATEGORIA_DIAG1 Gloss of the category code of the cause of death according to ICD-10 Text N/A
18 DIAG2 External cause of death Text N/A
19 CAPITULO_DIAG2 ICD-10 chapter according to the cause of death Text N/A
20 GLOSA_CAPITULO_DIAG2 Gloss of the ICD-10 chapter according to the cause of death Text N/A
21 CODIGO_GRUPO_DIAG2 Group code of the cause of death according to ICD-10 Text N/A
22 GLOSA_GRUPO_DIAG2 Gloss of the group code of the cause of death according to ICD-10 Text N/A
23 CODIGO_CATEGORIA_DIAG2 Category code of the cause of death according to ICD-10 Text N/A
24 GLOSA_CATEGORIA_DIAG2 Gloss of the category code of the cause of death according to ICD-10 Text N/A
25 CODIGO_SUBCATEGORIA_DIAG2 Category code of the cause of death according to ICD-10 Text N/A
26 GLOSA_SUBCATEGORIA_DIAG2 Gloss of the category code of the cause of death according to ICD-10 Text N/A
27 LUGAR_DEFUNCION Describes the place where the death occurs Text N/A

βœ¨πŸ—„οΈβœ¨ Clean Data Dictionary

NΒΊ Variable Name Description Data Type Values
1 AΓ‘O Year of the death date Number N/A
2 FECHA_DEF Date of the death date Date N/A
3 SEXO_NOMBRE Gloss identifying biological sex Number N/A
4 EDAD_CANT Numerical record of the patient's age at admission Number N/A
5 COMUNA Gloss of the commune of residence of the deceased, according to the political-administrative division updated in 2019 Text N/A
6 NOMBRE_REGION Gloss of the region of residence Text N/A
7 LUGAR_DEFUNCION Describes the place where the death occurs Text N/A
8 PERIODO Period of the death date Date N/A
9 RANGO_ETARIO Age range of the deceased Text N/A

πŸ“Š Charts

Month Deaths By Gender

charts1

Yearly Deaths By Region

charts2

Yearly Deaths By Age Range

charts3

πŸ“ Download PDF

Month Deaths By Gender

download1

Yearly Deaths By Region

download2

Yearly Deaths By Age Range

download3

πŸ”— Important Links


πŸ—„οΈ Postgres

The project leverages Python for data processing and analysis and Postgres for collaborative data exploration.

πŸš€ Installation & Setup

Prerequisites

Ensure you have the following installed:

  • PostgreSQL (Download)
  • Python 3.x (Download)
  • DBeaver (optional, for database visualization: Download)
  • Required Python Libraries:
  pip install pandas scikit-learn psycopg2 matplotlib seaborn numpy reportlab fpdf googleapiclient google

πŸ—„οΈ Database Schema

Connection

  1. In this case, we are going to use DBeaver, so the first thing we need to do is create a new connection.
  • As shown in the picture, we need to configure the following settings:
    • Host: localhost
    • Database: postgres
    • Port: 5432
    • Username: postgres
    • Password: YOUR PASSWORD

connection1

  1. After setting the parameters, we should test the connection to ensure it works before applying the configuration.

connection2

πŸ“Œ Schema Diagram

connection2

covid_data

- AΓ‘O (INT) - Year of the death date | Number
- FECHA_DEF (DATE) - Date of the death date 
- SEXO_NOMBRE (VARCHAR(10)) - Gloss identifying biological sex
- EDAD_CANT (INT) - Numerical record of the patient's age at admission
- COMUNA (VARCHAR(100)) - Gloss of the commune of residence of the deceased, according to the political-administrative division updated in 2019
- NOMBRE_REGION (VARCHAR(100)) - Gloss of the region of residence
- LUGAR_DEFUNCION (VARCHAR(100)) - Describes the place where the death occurs
- PERIODO (VARCHAR(7)) - Period of the death date
- RANGO_ETARIO (VARCHAR(20)) - Age range of the deceased

✍🏻 Insert Data from Terminal

  • Since the data we have is quite large, we need to insert it through the terminal to avoid issues when using the editor.
psql -U postgres -d mi_base_de_datos -h localhost -p 5432 -f /home/usuario/Chile-COVID-Mortality-2020-2024/postgres/tables/add_into_tables.sql

πŸ› οΈ Example of a Query

select 
cd."AΓ‘O" as "year",
cd."NOMBRE_REGION" as "region",
count(cd."EDAD_CANT") as "quantity",
sum(case when cd."SEXO_NOMBRE" = 'Mujer' then 1 else 0 end) as "Women",
sum(case when cd."SEXO_NOMBRE" = 'Hombre' then 1 else 0 end) as "Men"
from covid_chile.covid_data cd
group by "year", "region"
order by "year";

πŸ“Š Data Analysis

Yearly Deaths by Gender and Region (PDF table)

analysis1

Yearly Deaths by Gender (Graphic)

analysis2

Yearly Deaths by Region (Graphic)

analysis3


πŸŽ–οΈ Acknowledgments

The data used in this project, including COVID-19 mortality records in Chile from 2020 to 2024, is provided by the Chilean government’s open data platform:

acknowledgment1

Special thanks to Pamela Suarez (deis@minsal.cl) for creating and curating the dataset. I acknowledge and appreciate her work in making this information publicly accessible.


Thanks to the Ministerio de Salud de Chile and related governmental agencies for making this data publicly available.

About

This project focuses on analyzing COVID-19 mortality data in Chile from 2020 to 2024. With at least 60,000 records, the dataset provides critical insights into the impact of the pandemic on the population.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages