Fins SSA GenAI Offering

The goal of this repo is to develop and deliver GenAI solutions that enable and accelerate customer GenAI PoC projects with Databricks.

Table of Contents

Requirements
PoC Accelerator Short Assets
- Data Ingestion and Preprocessing Architecture Patterns
- End to End GenAI Application Architecture Patterns
When to Use
Getting Started
Resources
Limitations

Requirements

Databricks workspace with Serverless and Unity Catalog enabled
Python 3.9+

PoC Accelerator Short Assets

Data Ingestion and Preprocessing Architecture Patterns

| Input Data Types | Input Data Store | Chunking Performed | OSS Technology | Component Asset |
| --- | --- | --- | --- | --- |
| JSON Text Transcripts | Unity Catalog Volume | None | None | JSON Data Ingestion with DLT (Python), DLT (SQL) |
| PDF Doc (with tables) | Unity Catalog Volume | Unstructured chunking strategy | Unstructured | PDF Doc Ingestion |
| Image Doc (text extraction) | Unity Catalog Volume | Unstructured chunking strategy | Unstructured | Image Doc Ingestion |
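
As a rough illustration of the first row (JSON text transcripts stored in a Unity Catalog Volume, with no chunking), the sketch below reads every `.json` file in a directory and flattens each one into a record. It uses only the Python standard library rather than DLT, so it is not the repo's ingestion asset. The function name `load_transcripts` and the field names `call_id` and `transcript` are assumptions made for illustration; the actual transcript schema may differ.

```python
import json
from pathlib import Path


def load_transcripts(volume_dir):
    """Read every *.json file in volume_dir into a list of flat records.

    Assumption: each file holds one JSON object with hypothetical
    `call_id` and `transcript` fields.
    """
    records = []
    # Sort the paths so the output order is deterministic.
    for path in sorted(Path(volume_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            doc = json.load(f)
        records.append(
            {
                "source_file": path.name,
                "call_id": doc.get("call_id"),
                # Fall back to an empty string when the field is missing.
                "transcript": doc.get("transcript", ""),
            }
        )
    return records
```

On Databricks, files in a Unity Catalog Volume are addressed with paths of the form `/Volumes/<catalog>/<schema>/<volume>/...`, so a path of that shape could be passed as `volume_dir`.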

End to End GenAI Application Architecture Patterns

| Input Data | Model | Tasks | GenAI Use Case | Orchestration | Customer Persona | PoC Template |
| --- | --- | --- | --- | --- | --- | --- |
| JSON Text Transcripts | Foundation LLM (e.g. Llama 3.1) | Summarization, Sentiment, Classification | AI Function, DBSQL Agent | DLT, LangChain | Data Analyst, Data Scientist | Call Center Transcript Analytics with AI |
| JSON Text Transcripts | Foundation LLM (e.g. Llama 3.1) | Summarization, Sentiment | RAG | DLT, LangChain | Data Scientist, MLE, Data Engineer | Call Center Transcript RAG Apps |
| wav Audio | Foundation LLM (e.g. Llama 3.1) | Speech Transcription, Summarization, Sentiment | RAG | DLT, LangChain | Data Scientist, MLE, Data Engineer | Call Center Audio to Text RAG Apps |
| PDF Documents | Foundation LLM (e.g. Llama 3.1) | Unstructured Data Processing, Named Entity Recognition | NER | DLT, Function calling | Data Scientist, MLE, Data Engineer | PDF_Doc Ingestion |

When to Use

You have a business use case that could benefit from generative AI and that falls into one of the PoC accelerator templates. You have access to a Unity Catalog enabled Databricks workspace.

You may have existing data in the workspace to use as input. If you don't have any data, the PoC accelerator templates contain synthetic sample datasets that demonstrate the GenAI application's functionality.
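
To make the "falls into one of the templates" check concrete, here is a minimal sketch that encodes the End to End GenAI Application Architecture Patterns table as data and looks up matching templates by input data type and, optionally, use case. The names `POC_TEMPLATES` and `find_templates` are illustrative and are not part of the repo.

```python
# Rows transcribed from the "End to End GenAI Application Architecture
# Patterns" table. The dictionary keys are illustrative names only.
POC_TEMPLATES = [
    {
        "input_data": "JSON Text Transcripts",
        "use_case": "AI Function, DBSQL Agent",
        "template": "Call Center Transcript Analytics with AI",
    },
    {
        "input_data": "JSON Text Transcripts",
        "use_case": "RAG",
        "template": "Call Center Transcript RAG Apps",
    },
    {
        "input_data": "wav Audio",
        "use_case": "RAG",
        "template": "Call Center Audio to Text RAG Apps",
    },
    {
        "input_data": "PDF Documents",
        "use_case": "NER",
        "template": "PDF_Doc Ingestion",
    },
]


def find_templates(input_data, use_case=None):
    """Return template names matching the input data and optional use case.

    Matching is case-insensitive; use_case is matched as a substring so
    that "RAG" or "DBSQL Agent" can be passed on its own.
    """
    matches = []
    for row in POC_TEMPLATES:
        if row["input_data"].lower() != input_data.lower():
            continue
        if use_case is not None and use_case.lower() not in row["use_case"].lower():
            continue
        matches.append(row["template"])
    return matches
```

For example, `find_templates("JSON Text Transcripts", use_case="RAG")` returns `["Call Center Transcript RAG Apps"]`, and an input type not covered by the table returns an empty list.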

Getting Started

Clone this repo and add it to your Databricks workspace. Refer to Databricks Repo Setup for instructions on how to create a Databricks repo in your workspace.

Go into the folder of the selected PoC accelerator template.
Review the architecture diagram in the README.
Start with the instruction notebook and follow the instructions in it.
Most notebooks can be run by clicking Run / Run All, but some require additional steps in the Databricks UI, so be sure to read the instructions.

Resources

Limitations

The PoC accelerator templates are designed for Unity Catalog managed workspaces only.
The synthetic datasets provided by Databricks are generated algorithmically based on assumptions; they are not real-life data.
Delta Live Tables technology from Databricks is used in some of the PoC accelerator templates. Currently, live tables (a.k.a. materialized views) from Delta Live Tables can only be accessed by shared clusters; therefore, copies of the materialized views are used in some of the notebooks. This limitation will be addressed in future product releases.
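
The workaround in the last limitation (using a copy instead of the materialized view itself) can be sketched as a small helper that builds a `CREATE OR REPLACE TABLE ... AS SELECT` statement. This is only an assumption about how such a copy could be made; the repo's notebooks may do it differently. The helper name `build_copy_statement` and the catalog, schema, and object names are hypothetical.

```python
def build_copy_statement(source_view, target_table):
    """Build SQL that snapshots a materialized view into a regular table.

    Both arguments are expected to be fully qualified Unity Catalog names
    (catalog.schema.object); each part is backtick-quoted.
    """
    def quote(name):
        parts = name.split(".")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected catalog.schema.object, got {name!r}")
        return ".".join(f"`{p}`" for p in parts)

    return (
        f"CREATE OR REPLACE TABLE {quote(target_table)} "
        f"AS SELECT * FROM {quote(source_view)}"
    )


# Hypothetical names, for illustration only.
statement = build_copy_statement(
    "main.call_center.transcripts_mv",
    "main.call_center.transcripts_copy",
)
# In a Databricks notebook, where `spark` is predefined, the statement
# could be executed with: spark.sql(statement)
```

The statement has to run on compute that is able to read the materialized view. The resulting copy is a point-in-time snapshot, so it would need to be refreshed after each pipeline update.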


qian-yu-db/Fins-SSA-GenAi-Offerings
