Development of Resource Orchestration Mechanisms for Heterogeneous Computing Infrastructures

Thesis (text mostly in Greek, with an abstract in English): link

Installation

For Local Development in Linux

  1. Go
  2. Docker
  3. Docker Compose
  4. kind - Kubernetes in Docker
  5. Install the project's requirements, preferably in a Python virtual env:
pip install -r requirements.txt

For a Production Environment

  1. Create a swarm cluster and clone the repo to the manager node
  2. Create a Kubernetes cluster and clone the repo to the master node
  3. Install the project's requirements on each cluster's master node, preferably in a Python virtual env:
pip install -r requirements_production.txt

Configs to switch from prod to local and vice versa

Folder/File                        Config to change
/serrano_app/docker-compose.yaml   environment/KAFKA_ADVERTISED_LISTENERS
/serrano_app/flask-start.sh        flask run
/serrano_app/config/               kafka_cfg/kubeconfig_yaml
/kubernetes_driver_app/config/     kafka_cfg/bootstrap.servers
/kubernetes_driver_app/config/     kafka_cfg/kubeconfig_yaml
/kubernetes_driver_app/config/     kafka_cfg/broker
/swarm_driver_app/config/          kafka_cfg/bootstrap.servers
/swarm_driver_app/config/          kafka_cfg/broker
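
These values are read when the apps start. As a minimal sketch of how a driver could load them, assuming the config files are YAML and that the file name and layout match the kafka_cfg keys above (both the name and the layout are assumptions, not the project's actual loader):

import yaml

# Hypothetical loader: the file name and YAML layout are assumptions
# based on the kafka_cfg entries listed in the table above.
def load_kafka_config(path="config/config.yaml"):
    with open(path) as f:
        cfg = yaml.safe_load(f)
    kafka_cfg = cfg["kafka_cfg"]
    # e.g. "localhost:9092" locally, the cluster's broker address in prod
    return kafka_cfg["bootstrap.servers"], kafka_cfg["broker"]

bootstrap_servers, broker = load_kafka_config()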

Quickstart

  1. Run Docker Compose in a terminal:

    cd serrano_app/
    docker-compose up -d
  2. Check that everything is up and running:

    docker-compose ps
  3. Establish the MongoDB Sink connection (see the dedicated section below)

  4. To start the serrano_app, from the /serrano_app folder:

    • Start the Faust agents for the Dispatcher, Resource Optimization Toolkit and Orchestrator components:
      <python3_path> ./src/__main__.py worker --loglevel=INFO
    • Start the Flask app to receive requests for an app deployment or termination:
      ./flask-start.sh
  5. Create an application request:

    • Deployment
      <python3_path> ./deploy.py -yamlSpec <absolute path to a YAML file> -env <local or prod>
    • Termination
      <python3_path> ./remove.py -requestUUID <requestUUID> -env <local or prod>
  6. Activate the Faust agent of the swarm or the Kubernetes driver to process the request:

    • For the swarm driver, connect to the manager node and from the swarm_driver_app folder:
      <python3_path> ./src/__main__.py worker --loglevel=INFO
    • For the Kubernetes driver, connect to the manager node and from the kubernetes_driver_app folder:
      <python3_path> ./src/__main__.py worker --loglevel=INFO
  7. To check that data is published to the respective topics (a Python alternative to kafkacat is sketched right after this list):

    • kafkacat -b localhost:9092 -t dispatcher
    • kafkacat -b localhost:9092 -t resource_optimization_toolkit
    • kafkacat -b localhost:9092 -t orchestrator
    • kafkacat -b localhost:9092 -t swarm
    • kafkacat -b localhost:9092 -t kubernetes
  8. Check the created namespaces, services and deployments in Kubernetes:

    kubectl get namespaces
    kubectl get services
    kubectl get deployments
  9. Check stack creation in the swarm:

    docker stack ls
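
As a Python alternative to the kafkacat checks in step 7, the topics can also be read with a small consumer. A minimal sketch, assuming the confluent-kafka package is installed (the group id is arbitrary):

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "debug-reader",      # arbitrary consumer group for debugging
    "auto.offset.reset": "earliest", # read the topic from the beginning
})
consumer.subscribe(["dispatcher"])   # or any of the other topics from step 7

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            print(msg.error())
            continue
        print(msg.topic(), msg.value().decode("utf-8"))
finally:
    consumer.close()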

Files' & Folders' Highlights

Folder/File               Description
/serrano_app              implementation of the Dispatcher, Resource Optimization Toolkit and Orchestrator components
/swarm_driver_app         implementation of the swarm driver component
/kubernetes_driver_app    implementation of the Kubernetes driver component
flask-start.sh            script to run the Flask app (make it executable with chmod +x)
src/__main__.py           Faust application entrypoint
src/faust_app.py          creates the Faust application instance for stream processing
src/flask_app.py          creates the Flask application instance
models.py                 Faust models that describe the data in the streams
helpers.py                helper functions
agents.py                 Faust async stream processors of the Kafka topics
swarm_driver.py           SwarmDriver class to connect to and interact with a swarm
kubernetes_driver.py      K8sDriver class to connect to and interact with a Kubernetes cluster
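
For orientation, here is a minimal Faust agent in the spirit of agents.py; the app name is an assumption, while the broker, the dispatcher topic and the record fields follow the quickstart and the kafkacat example below:

import faust

# App name is an assumption for illustration; the broker matches the
# local docker-compose setup.
app = faust.App("serrano", broker="kafka://localhost:9092")

# Record mirroring the message shape shown in the kafkacat section.
class Request(faust.Record, serializer="json"):
    requestUUID: str
    action: str
    yamlSpec: str

dispatcher_topic = app.topic("dispatcher", value_type=Request)

@app.agent(dispatcher_topic)
async def process_requests(requests):
    # Consume each request from the dispatcher topic as it arrives.
    async for request in requests:
        print(f"processing {request.action} for {request.requestUUID}")

if __name__ == "__main__":
    # Enables the `worker --loglevel=INFO` invocation used in the quickstart.
    app.main()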

Establish the MongoDB Sink Connection (for upsert)

In this example, the MongoDB instance is hosted on MongoDB Atlas.

curl -X PUT http://localhost:8083/connectors/sink-mongodb-users/config -H "Content-Type: application/json" -d ' {
      "connector.class":"com.mongodb.kafka.connect.MongoSinkConnector",
      "tasks.max":"1",
      "topics":"db_consumer",
      "connection.uri":"mongodb://athina_kyriakou:123@ntua-thesis-cluster-shard-00-00.xcgej.mongodb.net:27017,ntua-thesis-cluster-shard-00-01.xcgej.mongodb.net:27017,ntua-thesis-cluster-shard-00-02.xcgej.mongodb.net:27017/myFirstDatabase?ssl=true&replicaSet=atlas-xirr44-shard-0&authSource=admin&retryWrites=true&w=majority",
      "database":"thesisdb",
      "collection":"db_consumer",
      "key.converter":"org.apache.kafka.connect.json.JsonConverter",
      "key.converter.schemas.enable":false,
      "value.converter":"org.apache.kafka.connect.json.JsonConverter",
      "value.converter.schemas.enable":false,
      "document.id.strategy":"com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
      "document.id.strategy.partial.value.projection.list":"requestUUID",
      "document.id.strategy.partial.value.projection.type":"AllowList",
      "writemodel.strategy":"com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneBusinessKeyStrategy"
}'

Detailed info here
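
To confirm the connector and its task reached the RUNNING state, query the Kafka Connect REST API's status endpoint (listening on localhost:8083, as in the PUT above). A minimal sketch using only the Python standard library:

import json
import urllib.request

url = "http://localhost:8083/connectors/sink-mongodb-users/status"
with urllib.request.urlopen(url) as response:
    status = json.load(response)

print(status["connector"]["state"])  # expected: RUNNING
for task in status["tasks"]:
    print(task["id"], task["state"])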

Useful Commands

Kafka

Create a topic:

docker-compose exec broker kafka-topics \
  --create \
  --bootstrap-server localhost:9092 \
  --replication-factor 1 \
  --partitions 1 \
  --topic kdriver
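
The same topic can be created programmatically. A sketch assuming the confluent-kafka Python package:

from confluent_kafka.admin import AdminClient, NewTopic

# Matches the CLI call above: one partition, replication factor 1.
admin = AdminClient({"bootstrap.servers": "localhost:9092"})
futures = admin.create_topics(
    [NewTopic("kdriver", num_partitions=1, replication_factor=1)]
)
for topic, future in futures.items():
    future.result()  # raises if creation failed
    print(f"created topic {topic}")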

Kafkacat

CLI used for debugging & testing. Documentation

Check a topic's content:

kafkacat -b localhost:9092 -t <topic_name>

Write to the orchestrator topic:

kafkacat -b localhost:9092 -t orchestrator -P
{"requestUUID": XX, "action": XXX, "yamlSpec":XXXX}

Swarm

CLI Documentation

Get created stacks:

docker stack ls

Delete a stack:

docker stack rm <stack name>

Check nodes' information (engine version, hostname, status, availability, manager status):

docker node ls

Check the labels of a node (and other info):

docker node inspect <node hostname> --pretty

Add a label to a node:

docker node update --label-add <label name>=<value> <node hostname>

Check the node that a service is running on:

docker service ps <service name>

Create a swarm stack based on a YAML file:

docker stack deploy --compose-file <yaml file path> <stack name>
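
For reference, nodes and their labels can also be inspected programmatically, roughly in the way the SwarmDriver class does. A sketch assuming the docker Python SDK (docker-py), not the project's actual code:

import docker

# Talks to the local Docker engine; must run on a swarm manager node.
client = docker.from_env()

for node in client.nodes.list():
    hostname = node.attrs["Description"]["Hostname"]
    labels = node.attrs["Spec"].get("Labels", {})
    print(hostname, labels)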

Kubernetes with kind

Create a cluster:

kind create cluster
kind get clusters

The default name for created clusters is "kind".

To interact with the created Kubernetes objects from the terminal, mainly for debugging, install kubectl.

Kubernetes with kubectl

Get deployments in a namespace:

kubectl get deployments --namespace=<name>

Check which nodes a namespace's pods are running on:

kubectl get pod -o wide --namespace=<name>

Show the nodes' labels:

kubectl get nodes --show-labels

Add a label to a node:

kubectl label nodes <node name> <label name>=<value>

Create Kubernetes resources based on a YAML file:

kubectl apply -f <yaml file path>
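
Similarly, the kind of interaction the K8sDriver class performs can go through the official Kubernetes Python client. A minimal sketch assuming the kubernetes package and a valid kubeconfig, not the project's actual code:

from kubernetes import client, config

# Load credentials from the kubeconfig referenced in the configs table.
config.load_kube_config()

v1 = client.CoreV1Api()
# List each pod in a namespace and the node it runs on, mirroring
# `kubectl get pod -o wide --namespace=<name>`.
for pod in v1.list_namespaced_pod(namespace="default").items:
    print(pod.metadata.name, pod.spec.node_name)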

Useful Resources

  • Kafka
  • Kubernetes
  • Kubernetes & Python
  • Faust
  • Kafka Source - MongoDB Sink
  • Avro