SurajSomani14/Read-And-Filter-Datalake-Files-Data

Read And Filter Datalake Files Data

This Azure Function reads multiple files from a given Data Lake folder, deserializes their data, and merges the records from all files. It can then apply filters to the merged data and respond with the filtered result in the requested format.

Required external NuGet packages:

  1. Microsoft.NET.Sdk.Functions
  2. Azure.Storage.Files.DataLake
  3. CsvHelper
  4. Parquet.Net
  5. System.Linq.Dynamic.Core

This HTTP-triggered Azure Function receives a request body in the format below:

[image: request body format]

This function reads all files in the given folder path. The currently supported formats are CSV, JSON, and Parquet. The folder may contain files in any mix of these three formats, provided they all share the same schema. The function deserializes data using the StudentData class, so every file must match the schema defined by that class.
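The per-file step above can be sketched as a dispatch on file extension. This is a minimal illustration, not the repository's actual code: the `StudentData` properties (`Id`, `Name`, `Grade`) and the `StudentFileReader` helper are hypothetical, since the real class is not shown in this README, and Parquet handling is left out of the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text.Json;
using CsvHelper;

// Hypothetical schema: the real StudentData properties are not shown in this README.
public class StudentData
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public int Grade { get; set; }
}

// Hypothetical helper, named for illustration only.
public static class StudentFileReader
{
    // Deserializes one file into StudentData records, choosing the parser by extension.
    public static List<StudentData> Read(string fileName, Stream content)
    {
        string extension = Path.GetExtension(fileName).ToLowerInvariant();
        switch (extension)
        {
            case ".csv":
            {
                // CsvHelper maps header names to property names.
                using var reader = new StreamReader(content);
                using var csv = new CsvReader(reader, CultureInfo.InvariantCulture);
                return csv.GetRecords<StudentData>().ToList();
            }
            case ".json":
            {
                using var reader = new StreamReader(content);
                var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
                return JsonSerializer.Deserialize<List<StudentData>>(reader.ReadToEnd(), options)
                       ?? new List<StudentData>();
            }
            default:
                // Parquet (via Parquet.Net) is omitted from this sketch.
                throw new NotSupportedException($"Unsupported file format: {extension}");
        }
    }

    // Merges the records from every file into one common list.
    public static List<StudentData> ReadAll(IEnumerable<(string Name, Stream Content)> files) =>
        files.SelectMany(f => Read(f.Name, f.Content)).ToList();
}
```

Because every parser returns the same `List<StudentData>`, files of different formats can be merged without any format-specific handling downstream.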

The data from each file is deserialized into a common object type. Once the data from all files has been collected into a common list, we apply the filters given in the request, if any.
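One way the filtering step could look, using the string-predicate `Where` extension from System.Linq.Dynamic.Core. This is a sketch under assumptions: the `StudentData` properties, the `StudentFilter` helper, and the idea that the request carries the filter as a single expression string are all hypothetical, because the request format is not reproduced in this README.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Linq.Dynamic.Core;

// Hypothetical schema: the real StudentData properties are not shown in this README.
public class StudentData
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public int Grade { get; set; }
}

// Hypothetical helper, named for illustration only.
public static class StudentFilter
{
    // Applies a Dynamic LINQ predicate such as "Grade >= 8" to the merged list.
    // A null or empty filter returns all records unchanged.
    public static List<StudentData> Apply(IEnumerable<StudentData> merged, string filter)
    {
        if (string.IsNullOrWhiteSpace(filter))
        {
            return merged.ToList();
        }

        // The predicate string is parsed at runtime into an expression tree.
        return merged.AsQueryable().Where(filter).ToList();
    }
}
```

For example, `StudentFilter.Apply(students, "Grade >= 8")` keeps only the records whose hypothetical `Grade` property is at least 8. Parsing the predicate at runtime is what lets the caller choose the filter in the request instead of it being compiled into the function.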

Finally, we serialize the data according to the outputFileFormat value in the request body and respond with the result as a file.
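The output step can be sketched as a switch on `outputFileFormat`. As before, this is an illustration under assumptions: the `StudentData` properties and the `StudentSerializer` helper are hypothetical, the accepted values `"json"` and `"csv"` are guesses at what the request allows, and Parquet output is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text.Json;
using CsvHelper;

// Hypothetical schema: the real StudentData properties are not shown in this README.
public class StudentData
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public int Grade { get; set; }
}

// Hypothetical helper, named for illustration only.
public static class StudentSerializer
{
    // Serializes the filtered records as text in the requested format.
    // The values "json" and "csv" are assumed, not taken from the repository.
    public static string Serialize(IEnumerable<StudentData> records, string outputFileFormat)
    {
        switch (outputFileFormat.ToLowerInvariant())
        {
            case "json":
                return JsonSerializer.Serialize(records);
            case "csv":
            {
                using var writer = new StringWriter();
                using var csv = new CsvWriter(writer, CultureInfo.InvariantCulture);
                csv.WriteRecords(records); // writes a header row, then one row per record
                csv.Flush();
                return writer.ToString();
            }
            default:
                // Parquet output (via Parquet.Net) is omitted from this sketch.
                throw new NotSupportedException($"Unsupported output format: {outputFileFormat}");
        }
    }
}
```

In an HTTP-triggered function, the returned text could then be wrapped in a file response with a content type that matches the chosen format.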
