Visualization and Classification of Pneumonia using 3D CT Images

Introduction

This repository contains a simple machine learning workflow consisting of data ingestion and preparation, model training, serving and monitoring. It demonstrates how computer-aided diagnosis can be used to predict pneumonia from a volume (a collection) of CT images.

It is based on work by Hasib Zunair.

Technologies Used

OpenShift
OpenDataHub
Jupyter Notebooks with ipywidgets
NumPy
TensorFlow
Python requests library
Seldon Core
Prometheus
Grafana

Relevant Files and Directories

├── 01-inference-3d-image-classification-cli.py        Python script for inferencing
├── 01-inference-3d-image-classification.ipynb         Visualization and inferencing
├── 02-training-3d-image-classification.ipynb          Notebook for training
├── 02-training-3d-image-classification.py             Python script for training
├── 3d_image_classification.h5                         Trained model artifact
├── Dockerfile                                         For s2i builds
├── MyModel.py                                         Seldon Model Server Code
├── ct-data.zip                                        Validation data for inferencing
├── requirements-notebook.txt
├── requirements.txt
└── resources                                          Kubernetes Objects
    ├── 06-seldon-mymodel-servicemonitor.yaml
    ├── 07-mymodel-seldon-deploy-from-quay.yaml
    └── grafana-dashboards
        ├── NVIDIA-DCGM-dashboard.json                 GPU Metrics
        └── seldon-dashboard.json                      Model Server Metrics

Tested Environment

OpenDataHub (ODH) v1.3 for JupyterHub support

OpenShift Container Platform (OCP) v4.10.26

Model Serving Workflow

Model Development and Training Workflow

Train a 17-layer convolutional neural network to predict the presence of COVID-19 related pneumonia from 3D CT imagery.

Build the training Python stack.

pip install pip tensorflow nibabel matplotlib -Uq

Model Card

200 COVID-19 related 3D CT image studies
Each study contains 36-54 slices of 512x512 pixels each.
Total size is ~2GB (compressed)
~20 minutes to preprocess and train on an NVIDIA Tesla T4 GPU
ML framework: Keras/TensorFlow

Data Source: Chest CT Scans with COVID-19 Related Findings.
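
The preprocessing step mentioned in the model card can be sketched roughly as follows. This is a minimal illustration, not the repository's training code: the Hounsfield-unit window of -1000 to 400 is an assumption borrowed from common CT preprocessing practice, and the function name normalize_volume and the synthetic study are hypothetical.

```python
import numpy as np

# Assumed intensity window in Hounsfield units (HU); not taken from this repository.
HU_MIN, HU_MAX = -1000.0, 400.0


def normalize_volume(volume: np.ndarray) -> np.ndarray:
    """Clip a CT volume to the assumed HU window and rescale it to [0, 1]."""
    clipped = np.clip(volume.astype(np.float32), HU_MIN, HU_MAX)
    return (clipped - HU_MIN) / (HU_MAX - HU_MIN)


# Synthetic stand-in for one study: 512x512 pixels, 40 slices (within the 36-54 range).
rng = np.random.default_rng(seed=0)
study = rng.integers(-2000, 2000, size=(512, 512, 40)).astype(np.int16)

normalized = normalize_volume(study)
print(normalized.shape, normalized.dtype)
```

Clipping before rescaling keeps extreme values (air outside the body, metal artifacts) from dominating the input range seen by the network.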

Setup and Configuration

OpenShift

Model Server side

Change to the resources directory.

cd 3d-image-classification/resources

Create a project called ml-mon

oc new-project ml-mon

Using the OpenShift console UI, install an instance of each of the following community operators from OperatorHub into the ml-mon namespace.

OpenDataHub
- JupyterHub, S3, ODH Dashboard
Prometheus
Grafana

Seldon

Install the Seldon Core operator into all namespaces in the cluster (default).

Create an instance of Prometheus and Grafana in the ml-mon namespace.

Expected Output

oc get pods -n ml-mon

NAME                                                   READY   STATUS    RESTARTS   AGE
grafana-deployment-8fbf7c944-7895m                     1/1     Running   0          5h35m
grafana-operator-controller-manager-6ff698d9fc-xvk28   2/2     Running   0          5h35m
prometheus-example-0                                   2/2     Running   0          5h35m
prometheus-operator-7b9ccd45c6-7v8td                   1/1     Running   0          5h35m

Create routes for Prometheus and Grafana.

oc expose svc prometheus-operated
oc expose svc grafana-service

Obtain the Grafana admin credentials to log in to the Grafana console.

oc get secrets grafana-admin-credentials -o=jsonpath='{@.data.GF_SECURITY_ADMIN_USER}' | base64 --decode

admin

oc get secrets grafana-admin-credentials -o=jsonpath='{@.data.GF_SECURITY_ADMIN_PASSWORD}' | base64 --decode

ABcdRqpfdsEfpg==
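
Secret values are stored base64-encoded, which is why the commands above pipe their output through base64 --decode. The same decoding can be sketched in Python. The encoded string below is simply the base64 form of the admin username shown above, and decode_secret_value is a hypothetical helper name.

```python
import base64


def decode_secret_value(encoded: str) -> str:
    """Decode one base64-encoded value from a Secret's data field."""
    return base64.b64decode(encoded).decode("utf-8")


# "YWRtaW4=" is the base64 encoding of "admin".
print(decode_secret_value("YWRtaW4="))  # prints: admin
```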

Create a Prometheus Service Monitor

oc create -f 06-seldon-mymodel-servicemonitor.yaml

servicemonitor.monitoring.coreos.com/mymodel-mygroup created

Log in to the Grafana console. The username and password can be obtained from the grafana-admin-credentials secret.
Within Grafana, configure a Prometheus data source called prometheus with a URL of prometheus-operated.ml-mon:9090
Import the Seldon dashboard from the resources/grafana-dashboards/seldon-dashboard.json file.
Deploy the Seldon model server and wait for the classifier pod to become ready. Two services should be created by the Seldon deployer.

oc create -f 07-mymodel-seldon-deploy-from-quay.yaml

seldondeployment.machinelearning.seldon.io/mymodel created

oc get pods

NAME                                            READY   STATUS    RESTARTS   AGE
mymodel-mygroup-0-classifier-57647887d9-98qqb   2/2     Running   0          118s

oc get services

NAME                         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
mymodel-mygroup              ClusterIP   10.217.5.143   <none>        8000/TCP,5001/TCP   20s
mymodel-mygroup-classifier   ClusterIP   10.217.4.127   <none>        9000/TCP            2m4s

Create a route for the Seldon model server.

oc expose svc mymodel-mygroup

Curl the prometheus endpoint and confirm it is able to scrape metrics from the classifier pod.

curl -X GET $(oc get route mymodel-mygroup -o jsonpath='{.spec.host}')/prometheus

...
promhttp_metric_handler_requests_total{code="200"} 5

OpenDataHub and Jupyter Client Configuration

Jupyter Notebook dependencies

pip install tensorflow jupyterlab ipywidgets scipy

Log in to OpenDataHub.
Start the JupyterHub server and choose the Standard Data Science notebook image.
Clone this GitHub repository.
Run the 01-inference-3d-image-classification notebook.
Find the notebook cell with the predict function and modify the url variable to point to the route that was created.
- echo $(oc get route mymodel-mygroup -o jsonpath='{.spec.host}')/api/v1.0/predictions
Run the notebook and select a study to make a few predictions to trigger Seldon activity.
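
The request that the notebook's predict function sends can be sketched as follows. This is a sketch under stated assumptions, not the notebook's actual code: it uses the standard library's urllib as a stand-in for the requests library, the route host is a hypothetical placeholder, and the tiny nested list only illustrates the payload layout. A real request must carry a volume in the shape the model expects. The /api/v1.0/predictions path and the data/ndarray payload follow the Seldon Core v1 REST protocol.

```python
import json
import urllib.request


def prediction_url(route_host: str) -> str:
    """Build the Seldon Core v1 REST predictions URL for an exposed route."""
    return f"http://{route_host}/api/v1.0/predictions"


def build_payload(volume: list) -> bytes:
    """Wrap a nested list in the Seldon 'ndarray' JSON payload."""
    return json.dumps({"data": {"ndarray": volume}}).encode("utf-8")


def predict(route_host: str, volume: list) -> dict:
    """POST the payload and return the decoded JSON response (needs a live route)."""
    request = urllib.request.Request(
        prediction_url(route_host),
        data=build_payload(volume),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Hypothetical host and placeholder tensor; predict() is not called here
# because it requires a running model server.
host = "mymodel-mygroup-ml-mon.apps.example.com"
print(prediction_url(host))
print(build_payload([[0.0, 0.5], [1.0, 0.25]]).decode("utf-8"))
```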

Within 30 seconds or so there should be activity on the Seldon Grafana Dashboard.

Optionally, configure Grafana to use OpenShift's built-in Prometheus as a data source so a GPU dashboard can be created. This data source scrapes metrics from the NVIDIA DCGM exporter.

Grant the Grafana service account the cluster-monitoring-view role so it can query OpenShift's Prometheus in the openshift-monitoring namespace.

oc adm policy add-cluster-role-to-user cluster-monitoring-view -z grafana-service-account -n ml-mon

Get the Prometheus token.

oc serviceaccounts get-token prometheus-k8s -n openshift-monitoring

Add this token to the example Grafana data source yaml.

httpHeaderValue1: 'Bearer ${BEARER_TOKEN}'

Create the data source object.

oc apply -f 03-prometheus-grafanadatasource.yaml

Import the Seldon and GPU dashboards from the included JSON files.

Open the Prometheus and Grafana dashboards to visualize the API activity.

Troubleshooting

How to confirm that Prometheus is scraping metrics from Seldon:

curl -X GET $(oc get route mymodel-mygroup -o jsonpath='{.spec.host}')/prometheus

seldon_api_executor_server_requests_seconds_sum{code="200",deployment_name="mymodel",method="post",predictor_name="mygroup",predictor_version="",service="predictions"} 4.714845908
seldon_api_executor_server_requests_seconds_count{code="200",deployment_name="mymodel",method="post",predictor_name="mygroup",predictor_version="",service="predictions"} 5
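
The two counters above are enough to derive the mean request latency: divide the _sum sample by the _count sample. A minimal sketch of that calculation follows. The parser is a simplified, hypothetical helper that drops labels and assumes no trailing timestamps, so it is only suitable for short excerpts like this one.

```python
def parse_samples(text: str) -> dict:
    """Map each metric name (labels dropped) to its float sample value.

    Series that share a name overwrite each other, which is acceptable
    for the single-series excerpt used here.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        name = series.split("{", 1)[0]
        samples[name] = float(value)
    return samples


# Excerpt copied from the /prometheus output shown above (labels shortened).
metrics_text = """
seldon_api_executor_server_requests_seconds_sum{code="200",service="predictions"} 4.714845908
seldon_api_executor_server_requests_seconds_count{code="200",service="predictions"} 5
"""

samples = parse_samples(metrics_text)
mean_latency = (
    samples["seldon_api_executor_server_requests_seconds_sum"]
    / samples["seldon_api_executor_server_requests_seconds_count"]
)
print(f"mean latency: {mean_latency:.3f} s")
```

For the values shown, this works out to a little under one second per prediction request.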

$ oc create -f resources/07-mymodel-seldon-deploy-from-quay.yaml
Error from server (InternalError): error when creating "resources/07-mymodel-seldon-deploy-from-quay.yaml": Internal error occurred: failed calling webhook "v1.vseldondeployment.kb.io": Post "https://seldon-webhook-service.odh.svc:443/validate-machinelearning-seldon-io-v1-seldondeployment?timeout=30s": service "seldon-webhook-service" not found

The error above can happen after ODH has been reinstalled into a different project. To fix it, list the webhook configurations and delete the stale one.

oc get MutatingWebhookConfiguration,ValidatingWebhookConfiguration -A

oc delete validatingwebhookconfiguration.admissionregistration.k8s.io/seldon-validating-webhook-configuration-odh

Developer Notes (Optional)

Building the Seldon deployer container image using OpenShift's s2i workflow.

Create and start a new build.

cd 3d-image-classification

oc new-build --strategy docker --docker-image registry.redhat.io/ubi8/python-36 --name mymodel -l app=mymodel --binary

oc start-build mymodel --from-dir=. --follow

oc get is

NAME      IMAGE REPOSITORY                                                     TAGS     UPDATED
mymodel   image-registry.openshift-image-registry.svc:5000/bk-models/mymodel   latest   7 seconds ago

Edit mymodel-seldon-deploy.yaml to confirm that the image location matches what the image stream reports. Then deploy the model server and wait for the pod to become ready.

oc apply -f resources/mymodel-seldon-deploy.yaml

oc get pods

NAME                                            READY   STATUS              RESTARTS   AGE
mymodel-mygroup-0-classifier-7c6b44569c-qmzk6   2/2     Running             0          61s

Expose the service

oc expose svc <svc-name>

To trigger a redeploy after a new build, patch the deployment with a fresh label value. This does not always work, so the pod may have to be deleted instead.

oc patch deployment <deployment-name> -p "{\"spec\": {\"template\": {\"metadata\": { \"labels\": {  \"redeploy\": \"$(date +%s)\"}}}}}"
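
The escaped JSON in the patch command is hard to read, so here is a sketch of the same patch body built in Python. Changing a label on the pod template changes the template itself, which is what prompts the Deployment to roll out new pods. The function name redeploy_patch is hypothetical.

```python
import json
import time


def redeploy_patch(timestamp: int) -> str:
    """Return the JSON patch that sets a 'redeploy' label on the pod template."""
    patch = {
        "spec": {
            "template": {
                "metadata": {"labels": {"redeploy": str(timestamp)}}
            }
        }
    }
    return json.dumps(patch)


# Equivalent to the $(date +%s) substitution in the shell command.
print(redeploy_patch(int(time.time())))
```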
