Version 1 of Technical Best Practices for Azure Databricks, based on real-world customer and technical SME input
Updated Oct 6, 2023
Collection of sample Databricks Spark notebooks (mostly for Azure Databricks)
Developed an end-to-end machine learning solution for predicting employee churn using Azure Databricks, leveraging Spark for data processing, MLflow for managing the ML workflow, and deploying the model using Databricks model serving.
Build a movie recommendation data pipeline using Azure services for efficient data ingestion, transformation, and orchestration. Utilize Azure Blob Storage, Azure Databricks, and Azure Data Factory to implement collaborative filtering and PySpark ML for accurate movie recommendations.
A data pipeline project built on Databricks and Azure to demonstrate the lifecycle of a cloud data project.
Project Y is a straightforward automated deployment tool for Landing Zones dedicated to data processing.
Sentiment Analysis for Tweets
Making an ODBC connection from Azure Databricks to Azure SQL Database with an Azure AD user access token.
Collection of data on Formula One Racing
Springboard Open Ended Capstone
This project builds an End-to-End Azure Data Engineering Pipeline, performing ETL and Analytics Reporting on the AdventureWorks2022LT Database.
This is an end-to-end Azure data engineering project that copies data from a REST API to the Azure cloud.
Notebook sample of Exploratory Data Analysis (EDA) for Prudential Life Insurance Sample Data
Ingested Tokyo Olympics data into Azure Data Lake using Azure Data Factory. Enhanced data quality with Apache Spark on Azure Databricks. Optimized SQL queries on Synapse Analytics, reducing execution time. Developed Power BI dashboards with KPIs built in DAX to boost user engagement.
Using a SAS token to authenticate and access ADLS Gen2 from Azure Databricks
Practice with Azure Synapse Analytics/Databricks Pipeline
Aims to establish a robust data engineering pipeline that uses rigorous data quality checks to deliver timely, accurate, and reliable insights for healthcare revenue cycle management (RCM). The pipeline produces fact and dimension tables that let reporting teams generate critical KPIs.
Covid ETL Project using Azure Data Engineering Stack
Tokyo Olympics analysis using the Azure platform