feat: support replicated architecture w/ sentinel #50

Merged on Nov 22, 2024 (77 commits)
Changes from 18 commits
2326622
it deploys
JoeHCQ1 Nov 14, 2024
212305a
the test deploys
JoeHCQ1 Nov 15, 2024
d6a9b52
tests pass if you don't try twice :D
JoeHCQ1 Nov 15, 2024
de001bb
test passes
JoeHCQ1 Nov 16, 2024
2c31e45
upgraded tests
JoeHCQ1 Nov 16, 2024
aa5b5ad
handled some todos
JoeHCQ1 Nov 16, 2024
f2f4285
changes after looking at the diff
JoeHCQ1 Nov 16, 2024
77c1b5f
Improved comment
JoeHCQ1 Nov 16, 2024
c997d38
improved comment
JoeHCQ1 Nov 16, 2024
3a73dbf
linting fixes
JoeHCQ1 Nov 16, 2024
96edbcd
added newline
JoeHCQ1 Nov 16, 2024
28f0afa
ignore shellcheck error b/c this is what I wanted
JoeHCQ1 Nov 16, 2024
e749328
added copyright notice
JoeHCQ1 Nov 16, 2024
5d98455
make it so chainguard option can go forward
JoeHCQ1 Nov 16, 2024
d3287bb
Bumped timeout to 60 minutes
JoeHCQ1 Nov 18, 2024
81bceb0
removed doug user setup b/c not relevant to valkey
JoeHCQ1 Nov 18, 2024
d1b3779
Moved it to the big one
JoeHCQ1 Nov 18, 2024
1e74455
try to get around a syntax error
JoeHCQ1 Nov 18, 2024
41e03d3
removed sh undefined array
JoeHCQ1 Nov 18, 2024
0ce80a9
this may have fixed it
JoeHCQ1 Nov 18, 2024
999c96b
Update .github/workflows/test.yaml
JoeHCQ1 Nov 18, 2024
8631b8a
Added license to namespace
JoeHCQ1 Nov 18, 2024
9775e0e
Allow egress too
JoeHCQ1 Nov 18, 2024
f5adf6f
pr fixes
JoeHCQ1 Nov 18, 2024
46b55cb
Update tests/zarf.yaml
JoeHCQ1 Nov 18, 2024
dadc722
Reverted to bash script
JoeHCQ1 Nov 18, 2024
41bd768
try to remove the bad substitution error
JoeHCQ1 Nov 18, 2024
45c4a11
Removed test bundle target
JoeHCQ1 Nov 18, 2024
d6785e1
fixed the bad substitution
JoeHCQ1 Nov 18, 2024
04798c2
Added line to debug failed jobs
JoeHCQ1 Nov 18, 2024
4ea6360
get logs of failed job too
JoeHCQ1 Nov 18, 2024
44dc6ae
fixed debug msg to get clarity on why pod fails in CI only
JoeHCQ1 Nov 19, 2024
2fed996
try a bigger machine
JoeHCQ1 Nov 19, 2024
0107f2b
Merge branch 'main' into add-replicated-support
Racer159 Nov 19, 2024
d405abc
made script more robust across shell versions
JoeHCQ1 Nov 19, 2024
90dbbd8
made bash more transferrable
JoeHCQ1 Nov 20, 2024
aad4c50
removed test namespace pre-creation b/c I'm fairly sure we've tested …
JoeHCQ1 Nov 20, 2024
9c71ed7
added back in architecture input to make it work on a mac
JoeHCQ1 Nov 20, 2024
56ca4b6
Change the way the network stuff is being enabled
JoeHCQ1 Nov 20, 2024
81c105d
Update bundle/uds-bundle.yaml
JoeHCQ1 Nov 20, 2024
fff21fc
Update bundle/uds-bundle.yaml
JoeHCQ1 Nov 20, 2024
2476a52
removed the rest of the valkey-standalone namespace
JoeHCQ1 Nov 20, 2024
cc44bd8
Added debug job to figure out what is going on"
JoeHCQ1 Nov 20, 2024
2541a5b
Revert "Added debug job to figure out what is going on""
JoeHCQ1 Nov 20, 2024
daee640
inserted tmate upstream
JoeHCQ1 Nov 20, 2024
8dd6cf2
wip
JoeHCQ1 Nov 21, 2024
07a412f
this may work
JoeHCQ1 Nov 21, 2024
3bad800
up the backoff limit
JoeHCQ1 Nov 21, 2024
0a4195d
Remove notes from readme
JoeHCQ1 Nov 21, 2024
5126f14
reverted callable test to non-tmate commit
JoeHCQ1 Nov 21, 2024
6b3531b
test to see if this works
JoeHCQ1 Nov 21, 2024
136d2e6
compare performance on the 4-core - 10 minute install on 16 core
JoeHCQ1 Nov 21, 2024
2892fae
try 8 core
JoeHCQ1 Nov 21, 2024
5656a02
Update .github/workflows/test.yaml
JoeHCQ1 Nov 22, 2024
aa6162f
Update .github/workflows/test.yaml
JoeHCQ1 Nov 22, 2024
1054432
Update .github/workflows/test.yaml
JoeHCQ1 Nov 22, 2024
4244094
Update bundle/uds-bundle.yaml
JoeHCQ1 Nov 22, 2024
8cd5a59
Update bundle/uds-bundle.yaml
JoeHCQ1 Nov 22, 2024
3cdd243
Update bundle/uds-bundle.yaml
JoeHCQ1 Nov 22, 2024
62864b6
Update bundle/uds-bundle.yaml
JoeHCQ1 Nov 22, 2024
28dc594
Update chart/values.yaml
JoeHCQ1 Nov 22, 2024
f7cd5e7
Update tasks.yaml
JoeHCQ1 Nov 22, 2024
5533d29
Update tasks.yaml
JoeHCQ1 Nov 22, 2024
ded237a
Removed 60 second wait
JoeHCQ1 Nov 22, 2024
3ee4479
Swapped calls to kubectl with calls to uds
JoeHCQ1 Nov 22, 2024
4ff4458
Update tasks/test.yaml
JoeHCQ1 Nov 22, 2024
d3a8126
Update tasks/test.yaml
JoeHCQ1 Nov 22, 2024
5ffb550
Update tasks/test.yaml
JoeHCQ1 Nov 22, 2024
602cf56
Update tasks/test.yaml
JoeHCQ1 Nov 22, 2024
4f3ccb1
using a for loop instead of code duplication
JoeHCQ1 Nov 22, 2024
f7a1ab9
Added docs
JoeHCQ1 Nov 22, 2024
5508428
formatting improvements
JoeHCQ1 Nov 22, 2024
1ca5703
Added disclaimer next to less than ideal security behavior
JoeHCQ1 Nov 22, 2024
f68ba61
removed an awkward use of 'still'
JoeHCQ1 Nov 22, 2024
368dbdb
Update docs/configuration.md
JoeHCQ1 Nov 22, 2024
f399195
Update configuration.md
JoeHCQ1 Nov 22, 2024
be45455
Update tests/zarf.yaml
JoeHCQ1 Nov 22, 2024
2 changes: 2 additions & 0 deletions .github/workflows/test.yaml
@@ -61,4 +61,6 @@ jobs:
upgrade-flavors: ${{ needs.check-flavor.outputs.upgrade-flavors }}
flavor: ${{ matrix.flavor }}
type: ${{ matrix.type }}
timeout: 60
runsOn: uds-swf-ubuntu-big-boy-8-core
secrets: inherit # Inherits all secrets from the parent workflow.
63 changes: 58 additions & 5 deletions bundle/uds-bundle.yaml
@@ -18,24 +18,77 @@ packages:
overrides:
valkey:
uds-valkey-config:
namespace: valkey-standalone
values:
- path: custom
value:
- direction: Ingress
selector:
app.kubernetes.io/name: valkey
remoteNamespace: valkey-cli
remoteNamespace: valkey-test
remoteSelector:
app: valkey-cli
app: valkey-test
port: 6379
description: "Ingress from Valkey CLI (for tests)"
- path: copyPassword
value:
enabled: true
namespace: valkey-cli
secretName: valkey
secretKey: valkey-password
namespace: valkey-test
secretName: valkey-standalone
secretKey: REDISCLI_AUTH # This allows us to mount it in as an env var and the valkey-cli picks it right up
valkey:
namespace: valkey-standalone
variables:
- name: VALKEY_RESOURCES
path: "master.resources"
default:
limits:
cpu: 100m
memory: 300Mi
requests:
cpu: 100m
memory: 300Mi
- name: valkey
path: ../
# x-release-please-start-version
ref: 8.0.1-uds.0
# x-release-please-end
overrides:
valkey:
uds-valkey-config:
namespace: valkey-replicated-w-sentinel
values:
- path: custom
value:
- direction: Ingress
selector:
app.kubernetes.io/name: valkey
remoteNamespace: valkey-test
remoteSelector:
app: valkey-test
port: 6379
description: "Ingress from Valkey CLI (for tests)"
- direction: Ingress
selector:
app.kubernetes.io/name: valkey
remoteNamespace: valkey-test
remoteSelector:
app: valkey-test
port: 26379
description: "Ingress from Valkey CLI (for tests) sentinel"
- path: copyPassword
value:
enabled: true
namespace: valkey-test
secretName: valkey-replicated-w-sentinel
secretKey: REDISCLI_AUTH
valkey:
namespace: valkey-replicated-w-sentinel
values:
- path: architecture
value: replication
- path: sentinel.enabled # https://github.com/bitnami/charts/blob/main/bitnami/valkey/values.yaml#L1143
value: true
variables:
- name: VALKEY_RESOURCES
path: "master.resources"
37 changes: 37 additions & 0 deletions chart/templates/networking.yaml
@@ -0,0 +1,37 @@
# Copyright 2024 Defense Unicorns
# SPDX-License-Identifier: AGPL-3.0-or-later OR LicenseRef-Defense-Unicorns-Commercial

# This removes the NR filter_not_found error that you get when you try to access
# the primary node.
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: valkey-headless-entry
  namespace: {{ .Release.Namespace }}
spec:
  hosts:
    - "*.valkey-headless.{{ .Release.Namespace }}.svc.cluster.local" # Matches pod-specific DNS names
  ports:
    - number: 6379
      name: redis
      protocol: TCP
  location: MESH_INTERNAL
  resolution: NONE
---
# This enables comms that are IP based: https://github.com/istio/istio/issues/37423
# The read replicas use each other's IPs to talk once the sentinels tell them what
# the IPs are.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: valkey-clustering-exception
  namespace: {{ .Release.Namespace }}
spec:
  mtls:
    mode: STRICT
  portLevelMtls:
    6379:
      mode: PERMISSIVE
  selector:
    matchLabels:
      app.kubernetes.io/name: valkey
3 changes: 0 additions & 3 deletions chart/templates/uds-package.yaml
@@ -13,13 +13,10 @@ spec:
targetPort: 9121
portName: http-metrics
description: Metrics

network:
allow:
- direction: Ingress
remoteGenerated: IntraNamespace
- direction: Egress
remoteGenerated: IntraNamespace

# Custom rules to allow clients to connect
{{- range .Values.custom }}
19 changes: 0 additions & 19 deletions common/zarf.yaml
@@ -21,22 +21,3 @@ components:
url: oci://registry-1.docker.io/bitnamicharts/valkey
valuesFiles:
- ../values/values.yaml
actions:
onDeploy:
after:
- description: Validate Valkey Package
maxTotalSeconds: 300
wait:
cluster:
kind: packages.uds.dev
name: valkey
namespace: valkey
condition: "'{.status.phase}'=Ready"
- description: Valkey to be Healthy
maxTotalSeconds: 90
wait:
cluster:
kind: pod
name: app.kubernetes.io/name=valkey
namespace: valkey
condition: Ready
8 changes: 6 additions & 2 deletions tasks.yaml
@@ -32,9 +32,14 @@ tasks:
- name: create-deploy-test-bundle
description: Test and validate cluster is deployed with Valkey
actions:
- task: create-dev-package
- task: create:test-bundle
- task: deploy:test-bundle
- task: setup:create-doug-user
- task: test:all

- name: test-bundle
description: Test the deployed test bundle via the test package
actions:
- task: test:all

- name: dev
@@ -60,7 +65,6 @@ tasks:
- task: upgrade:create-latest-tag-bundle
- task: setup:k3d-test-cluster
- task: deploy:test-bundle
- task: setup:create-doug-user
- task: compliance:validate
- task: create-dev-package
- task: create-deploy-test-bundle
27 changes: 12 additions & 15 deletions tasks/test.yaml
@@ -4,22 +4,19 @@
tasks:
- name: all
actions:
- task: health-check
- task: setup-data-stores
- task: create-test-package
- task: test-valkey

- name: setup-data-stores
- name: create-test-package
description: Create the test package to confirm Valkey is working
inputs:
architecture:
description: The architecture of the package to create
default: ${UDS_ARCH}
actions:
- description: Create the data store test package for the Valkey instance
cmd: uds zarf package create tests --confirm --no-progress --architecture="${UDS_ARCH}" --skip-sbom --no-progress
- description: Deploy the test package into the cluster
cmd: uds zarf package deploy "zarf-package-valkey-test-${UDS_ARCH}-0.1.0.tar.zst" --confirm --no-progress
- cmd: uds zarf package create tests --confirm --architecture="${{ .inputs.architecture }}" --skip-sbom

- name: health-check
- name: test-valkey
actions:
- description: Valkey Status
wait:
cluster:
kind: pod
name: app.kubernetes.io/name=valkey
namespace: valkey
condition: Ready
- description: Deploy the test package into the cluster
cmd: uds zarf package deploy "zarf-package-valkey-test-${UDS_ARCH}-0.1.0.tar.zst" --confirm
117 changes: 117 additions & 0 deletions tests/valkey/test-job.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,117 @@
# Copyright 2024 Defense Unicorns
# SPDX-License-Identifier: AGPL-3.0-or-later OR LicenseRef-Defense-Unicorns-Commercial

apiVersion: batch/v1
kind: Job
metadata:
  name: valkey-test-standalone
  labels:
    app: valkey-test
spec:
  template:
    metadata:
      labels:
        app: valkey-test
    spec:
      containers:
        - name: valkey-test
          image: docker.io/bitnami/valkey:8.0.1-debian-12-r2
          envFrom:
            - secretRef:
                name: valkey-standalone
                optional: false
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          command: ["/bin/sh", "-c"]
          args:
            - |
              # Set the standalone Valkey host and port
              HOST="valkey-master.valkey-standalone.svc.cluster.local"
              PORT="6379"

              # Check if the Valkey server responds to PING
              PING_OUTPUT=$(echo "ping" | valkey-cli -h ${HOST} -p ${PORT})
              echo "${PING_OUTPUT}" | grep PONG && echo "Valkey server ponged back" || { echo "Failed to contact Valkey. Output was: ${PING_OUTPUT}"; exit 1; }

              # Try to set a key-value pair
              SET_OUTPUT=$(echo "set TEST_name ${POD_NAME}" | valkey-cli -h ${HOST} -p ${PORT})
              echo "${SET_OUTPUT}" | grep OK && echo "Set value via Valkey" || { echo "Failed to set value via Valkey. Output was: ${SET_OUTPUT}"; exit 1; }

              # Try to get the value for the key
              GET_OUTPUT=$(echo "get TEST_name" | valkey-cli -h ${HOST} -p ${PORT})
              echo "${GET_OUTPUT}" | grep "${POD_NAME}" && echo "Retrieved value via Valkey" || { echo "Failed to retrieve value via Valkey. Output was: ${GET_OUTPUT}"; exit 1; }

              echo "All checks passed."
          resources: {}
      restartPolicy: Never
  backoffLimit: 0
status: {}
---
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  name: valkey-test-replicated-w-sentinel
  labels:
    app: valkey-test
spec:
  completions: 10 # I need to know I didn't accidentally hit the write pod.
  parallelism: 1
  template:
    metadata:
      labels:
        app: valkey-test
    spec:
      containers:
        - name: valkey-test
          image: docker.io/bitnami/valkey:8.0.1-debian-12-r2
          envFrom:
            - secretRef:
                name: valkey-replicated-w-sentinel
                optional: false
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          command: ["/bin/sh", "-c"]
          args:
            - |
              # Ask the Sentinel which node is the primary node
              PRIMARY_ADDR="$(echo 'SENTINEL GET-PRIMARY-ADDR-BY-NAME mymaster' | valkey-cli -h valkey.valkey-replicated-w-sentinel.svc.cluster.local -p 26379)"
              echo "Primary ADDR is: ${PRIMARY_ADDR}"

              # Extract HOST and PORT using sed
              HOST=$(echo "${PRIMARY_ADDR}" | sed -n '1p')
              PORT=$(echo "${PRIMARY_ADDR}" | sed -n '2p')
              echo "Primary is ${HOST}:${PORT}"

              # Check if primary responds to PING
              PING_OUTPUT=$(echo "ping" | valkey-cli -h ${HOST} -p ${PORT})
              echo "${PING_OUTPUT}" | grep PONG && echo "Primary ponged back" || { echo "Failed to contact primary. Output was: ${PING_OUTPUT}"; exit 1; }

              # Try to set a value on the primary
              SET_OUTPUT=$(echo "set TEST_name ${POD_NAME}" | valkey-cli -h ${HOST} -p ${PORT})
              echo "${SET_OUTPUT}" | grep OK && echo "Set value via primary" || { echo "Failed to set value via primary. Output was: ${SET_OUTPUT}"; exit 1; }

              # Ensure replicas return the value
              GET_OUTPUT=$(echo "get TEST_name" | valkey-cli -h valkey.valkey-replicated-w-sentinel.svc.cluster.local)
              echo "${GET_OUTPUT}" | grep "${POD_NAME}" && echo "Got value via replica" || { echo "Failed to get value via replica. Output was: ${GET_OUTPUT}"; exit 1; }

              GET_OUTPUT=$(echo "get TEST_name" | valkey-cli -h valkey.valkey-replicated-w-sentinel.svc.cluster.local)
              echo "${GET_OUTPUT}" | grep "${POD_NAME}" && echo "Got value via replica" || { echo "Failed to get value via replica. Output was: ${GET_OUTPUT}"; exit 1; }

              GET_OUTPUT=$(echo "get TEST_name" | valkey-cli -h valkey.valkey-replicated-w-sentinel.svc.cluster.local)
              echo "${GET_OUTPUT}" | grep "${POD_NAME}" && echo "Got value via replica" || { echo "Failed to get value via replica. Output was: ${GET_OUTPUT}"; exit 1; }

              GET_OUTPUT=$(echo "get TEST_name" | valkey-cli -h valkey.valkey-replicated-w-sentinel.svc.cluster.local)
              echo "${GET_OUTPUT}" | grep "${POD_NAME}" && echo "Got value via replica" || { echo "Failed to get value via replica. Output was: ${GET_OUTPUT}"; exit 1; }

              echo "All checks passed."
          resources: {}
      restartPolicy: Never
  backoffLimit: 0
status: {}
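The replicated job above parses the sentinel reply with `sed` and then repeats the same replica read four times. A minimal sketch of how those two steps might be factored into shell helpers follows. The names `primary_field`, `check_replica_reads`, and the `CLI` override are hypothetical and do not come from this PR, and the sketch assumes (as the job script does) that the sentinel reply arrives as two lines, host then port. It is not the code from the later "using a for loop instead of code duplication" commit.

```shell
#!/bin/sh
# Sketch only: helper names and the CLI override are hypothetical, not part of this PR.
CLI="${CLI:-valkey-cli}"  # assumption: overridable so the helpers can be dry-run with a stub

# Print line $2 of the reply in $1 (1 = host, 2 = port), as the job does with sed.
primary_field() {
  printf '%s\n' "$1" | sed -n "${2}p"
}

# Read key $2 from host $1 up to $4 times (default 4); return 1 on the first
# reply that does not contain $3.
check_replica_reads() {
  host="$1"; key="$2"; expected="$3"; attempts="${4:-4}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    output=$(echo "get ${key}" | "$CLI" -h "$host")
    if printf '%s\n' "$output" | grep -qF -- "$expected"; then
      echo "Got value via replica (attempt ${i})"
    else
      echo "Failed to get value via replica. Output was: ${output}"
      return 1
    fi
    i=$((i + 1))
  done
}
```

Inside the job script this could be called as `check_replica_reads valkey.valkey-replicated-w-sentinel.svc.cluster.local TEST_name "${POD_NAME}" || exit 1`. Using `if`/`else` rather than `A && B || C` keeps the failure branch from running when only the success message fails to print.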
25 changes: 20 additions & 5 deletions tests/valkey/uds-package.yaml
@@ -4,16 +4,31 @@
apiVersion: uds.dev/v1alpha1
kind: Package
metadata:
name: valkey-cli
namespace: valkey-cli
name: valkey-test
spec:
network:
allow:
- direction: Egress
selector:
app: valkey-cli
remoteNamespace: valkey
app: valkey-test
remoteNamespace: valkey-standalone
remoteSelector:
app.kubernetes.io/name: valkey
port: 6379
description: "Egress from Valkey CLI (for tests)"
description: "Egress from Valkey CLI (for tests of standalone instance)"
- direction: Egress
selector:
app: valkey-test
remoteNamespace: valkey-replicated-w-sentinel
remoteSelector:
app.kubernetes.io/name: valkey
port: 6379
description: "Egress from Valkey CLI (for tests of replicated w/ sentinel) to read-port"
- direction: Egress
selector:
app: valkey-test
remoteNamespace: valkey-replicated-w-sentinel
remoteSelector:
app.kubernetes.io/name: valkey
port: 26379
description: "Egress from Valkey CLI (for tests of replicated w/ sentinel) to sentinel-port"