Adding prometheus, otel monitoring setup and docs #168

Merged
merged 1 commit into from
Dec 18, 2023
41 changes: 41 additions & 0 deletions docker/monitoring/docker-compose.yaml
@@ -0,0 +1,41 @@
version: "3"
services:
  otel-collector:
    container_name: "otel-collector"
    image: otel/opentelemetry-collector-contrib
    restart: always
    command:
      - --config=/etc/otelcol-contrib/otel-config.yaml
    volumes:
      - ./otel-collector-config.yaml:/etc/otelcol-contrib/otel-config.yaml
    ports:
      - "1888:1888" # pprof extension
      - "8888:8888" # Prometheus metrics exposed by the Collector
      - "8889:8889" # Prometheus exporter metrics
      - "13133:13133" # health_check extension
      - "4317:4317" # OTLP gRPC receiver
      - "4318:4318" # OTLP http receiver
      - "55679:55679" # zpages extension

  prometheus:
    image: prom/prometheus
    container_name: prometheus
    restart: always
    command:
      - --config.file=/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus

  grafana:
    container_name: grafana
    image: grafana/grafana
    volumes:
      - ./grafana-datasources.yml:/etc/grafana/provisioning/datasources/datasources.yml
    ports:
      - "3000:3000"

volumes:
  prometheus-data:
15 changes: 15 additions & 0 deletions docker/monitoring/grafana-datasources.yml
@@ -0,0 +1,15 @@
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    uid: prometheus
    access: proxy
    orgId: 1
    url: http://prometheus:9090
    basicAuth: false
    isDefault: false
    version: 1
    editable: false
    jsonData:
      httpMethod: GET
32 changes: 32 additions & 0 deletions docker/monitoring/otel-collector-config.yaml
@@ -0,0 +1,32 @@
receivers:
  otlp:
    protocols:
      http:

processors:
  # batch metrics before sending to reduce API usage
  batch:

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    const_labels:
      label: juno


# https://github.com/open-telemetry/opentelemetry-collector/blob/main/extension/README.md
extensions:
  # responsible for responding to health check calls on behalf of the collector.
  health_check:
  # fetches the collector’s performance data
  pprof:
  # serves as an http endpoint that provides live debugging data about instrumented components.
  zpages:

service:
  extensions: [health_check, pprof, zpages]
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
12 changes: 12 additions & 0 deletions docker/monitoring/prometheus.yml
@@ -0,0 +1,12 @@
global:
  scrape_interval: 10s
  evaluation_interval: 10s

scrape_configs:
  - job_name: 'otel-collector'
    static_configs:
      - targets: ['otel-collector:8889'] # Otlp
  # uncomment to enable prometheus metrics
  #- job_name: 'prometheus'
  #  static_configs:
  #    - targets: ['localhost:9090'] # Prometheus itself
Binary file added docs/otel_mon.png
59 changes: 59 additions & 0 deletions docs/otel_monitoring.md
@@ -0,0 +1,59 @@
## Monitor Juno Metrics using Prometheus

#### A simple setup that exposes Juno metrics to Prometheus through otel-collector is shown below. Grafana can then be used to build visualizations from the available metrics.

<img
src="otel_mon.png"
style="display: block; margin: 0 auto;">

#### Setup

##### Configure proxy and storage to push metrics to the otel endpoint

- The Juno proxy and storage services push their metrics to the OpenTelemetry Collector endpoint http://localhost:4318/v1/metrics. Add or update the `[OTEL]` section in the respective config.toml files:

```toml
[OTEL]
Enabled = true
Environment = "qa"
Host = "0.0.0.0"
Poolname = "junoserv-ai"
Port = 4318
Resolution = 10
UrlPath = "/v1/metrics"
UseTls = false

```

- With this section in place, the proxy and storage services push their metrics to the otel endpoint.
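
How the `[OTEL]` fields relate to the endpoint URL can be sketched as follows. This is only an illustration, not Juno's actual code: the `OtelSettings` class and the `endpoint_url` helper are hypothetical names, and the composition rule (`UseTls` picks the scheme, while `Host`, `Port` and `UrlPath` are joined as-is) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OtelSettings:
    """Hypothetical mirror of the [OTEL] section; defaults copy the values above."""
    host: str = "0.0.0.0"
    port: int = 4318
    url_path: str = "/v1/metrics"
    use_tls: bool = False


def endpoint_url(cfg: OtelSettings) -> str:
    """Build the OTLP/HTTP metrics URL (assumed composition rule)."""
    scheme = "https" if cfg.use_tls else "http"
    # Tolerate a UrlPath written without its leading slash.
    path = cfg.url_path if cfg.url_path.startswith("/") else "/" + cfg.url_path
    return f"{scheme}://{cfg.host}:{cfg.port}{path}"


if __name__ == "__main__":
    print(endpoint_url(OtelSettings(host="localhost")))
```

With `host="localhost"` the helper yields the endpoint named in the first bullet, which is a quick way to sanity-check a config before starting the services.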

##### Set up otel-collector, prometheus and grafana
- The OpenTelemetry Collector, Prometheus and Grafana run as docker containers.
- The otel-collector, prometheus and grafana configuration files must be mounted as volumes in the containers.
- docker-compose.yaml and the configuration files for each service are available in junodb/docker/monitoring.


```bash
cd junodb/docker/monitoring

docker compose up -d
```

- Check the running containers, for example with `docker ps`. The prometheus, otel-collector and grafana containers should all be up.

```bash
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bcb1e7ece6b7 prom/prometheus "/bin/prometheus --c…" 3 hours ago Up 3 hours 0.0.0.0:9090->9090/tcp prometheus
c3816c006f85 otel/opentelemetry-collector-contrib "/otelcol-contrib --…" 3 hours ago Up 3 hours 0.0.0.0:1888->1888/tcp, 0.0.0.0:4317-4318->4317-4318/tcp, 0.0.0.0:8888-8889->8888-8889/tcp, 0.0.0.0:13133->13133/tcp, 0.0.0.0:55679->55679/tcp, 55678/tcp otel-collector
e41e33696606 grafana/grafana "/run.sh" 3 hours ago Up 3 hours 0.0.0.0:3000->3000/tcp grafana

```
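
Beyond the container list, each service can be probed over HTTP. The sketch below assumes the stack runs on `localhost` with the ports published in docker-compose.yaml, and that the services keep their usual health paths (`/` for the collector's health_check extension, `/-/ready` for Prometheus, `/api/health` for Grafana). The `http_status` and `check_services` helpers are illustrative names, not part of this PR.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict

# Ports come from docker-compose.yaml; the host name is an assumption.
HEALTH_URLS: Dict[str, str] = {
    "otel-collector": "http://localhost:13133/",    # health_check extension
    "prometheus": "http://localhost:9090/-/ready",  # Prometheus readiness endpoint
    "grafana": "http://localhost:3000/api/health",  # Grafana health endpoint
}


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code for url, or 0 if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def check_services(
    urls: Dict[str, str], fetch: Callable[[str], int] = http_status
) -> Dict[str, bool]:
    """Map each service name to True when its health URL answers 200."""
    return {name: fetch(url) == 200 for name, url in urls.items()}


if __name__ == "__main__":
    for name, ok in check_services(HEALTH_URLS).items():
        print(f"{name}: {'up' if ok else 'DOWN'}")
```

The `fetch` parameter is injectable so the check can be exercised without a running stack.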

- Open the Prometheus server at <host_ip>:9090, as shown below, and search for Juno metrics.

<img
src="prometheus.png"
style="display: block; margin: 0 auto;">
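
The same search can be scripted against the Prometheus HTTP API. The sketch below assumes Prometheus is reachable at `http://localhost:9090` and relies on the `label: juno` entry that otel-collector-config.yaml attaches to every exported series. The helper names `query_url` and `metric_names` are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import List


def query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def metric_names(response_body: str) -> List[str]:
    """Extract the sorted, de-duplicated metric names from a query response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    names = {series["metric"].get("__name__", "") for series in payload["data"]["result"]}
    return sorted(name for name in names if name)


if __name__ == "__main__":
    # label="juno" is the const_labels entry from otel-collector-config.yaml.
    url = query_url("http://localhost:9090", '{label="juno"}')
    with urllib.request.urlopen(url, timeout=5) as resp:
        for name in metric_names(resp.read().decode("utf-8")):
            print(name)
```

An empty result usually means either that the proxy and storage services have not pushed anything yet or that Prometheus has not completed a scrape of `otel-collector:8889`.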



Binary file added docs/prometheus.png