From a30d5d127d2c18f400c9f5aea5764a7cc5fcd9a9 Mon Sep 17 00:00:00 2001
From: Yuan <45984206+Yuan325@users.noreply.github.com>
Date: Wed, 1 Nov 2023 10:14:22 -0700
Subject: [PATCH] feat: add instructions for alloydb (#27)

---
 README.md                 |   3 +-
 cloudrun_instructions.md  |   5 +-
 docs/datastore/alloydb.md | 226 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 228 insertions(+), 6 deletions(-)
 create mode 100644 docs/datastore/alloydb.md

diff --git a/README.md b/README.md
index 99e78524..61b3f9ae 100644
--- a/README.md
+++ b/README.md
@@ -85,8 +85,7 @@ Deploying this demo consists of 3 steps:
 The extension service uses an interchangeable 'datastore' interface. Choose one
 of any of the database's listed below to set up and initialize your database:
 
-// TODO: complete this link
-* [Set up and configure AlloyDB][]
+* [Set up and configure AlloyDB](docs/datastore/alloydb.md)
 
 ### Deploying the Extension Service
 
diff --git a/cloudrun_instructions.md b/cloudrun_instructions.md
index c0a9111c..8e317794 100644
--- a/cloudrun_instructions.md
+++ b/cloudrun_instructions.md
@@ -13,9 +13,6 @@
 * Service Networking
 * Cloud SQL PostgreSQL instance or AlloyDB cluster and primary instance
 
-## Datastore Setup
-
-
 ## Deployment
 
 1. For easier deployment, set environment variables:
@@ -116,4 +113,4 @@
         --service-account demo-identity
     ```
 
-    Note: Your organization may not allow unauthenticated requests. Deploy with `--no-allow-unauthenticated` and use the proxy to view the frontend: `gcloud run services proxy demo-service`.
\ No newline at end of file
+    Note: Your organization may not allow unauthenticated requests. Deploy with `--no-allow-unauthenticated` and use the proxy to view the frontend: `gcloud run services proxy demo-service`.
diff --git a/docs/datastore/alloydb.md b/docs/datastore/alloydb.md
new file mode 100644
index 00000000..ee3a53d3
--- /dev/null
+++ b/docs/datastore/alloydb.md
@@ -0,0 +1,226 @@
+# Set up and configure AlloyDB
+
+## Before you begin
+
+1. Make sure you have a Google Cloud project with billing enabled.
+
+1. Set your `PROJECT_ID` environment variable:
+
+    ```bash
+    export PROJECT_ID=
+    ```
+
+1. [Install](https://cloud.google.com/sdk/docs/install) the gcloud CLI.
+
+1. Set the gcloud project:
+
+    ```bash
+    gcloud config set project $PROJECT_ID
+    ```
+
+1. Enable APIs:
+
+    ```bash
+    gcloud services enable alloydb.googleapis.com \
+                           compute.googleapis.com \
+                           cloudresourcemanager.googleapis.com \
+                           servicenetworking.googleapis.com \
+                           vpcaccess.googleapis.com \
+                           aiplatform.googleapis.com
+    ```
+1. Download and install the [PostgreSQL client CLI (`psql`)][install-psql].
+
+[install-psql]: https://www.timescale.com/blog/how-to-install-psql-on-mac-ubuntu-debian-windows/
+
+1. Clone this repo to your local machine:
+
+    ```bash
+    git clone git@github.com:GoogleCloudPlatform/database-query-extension.git
+    ```
+
+
+## Enable private services access
+
+In this step, we will enable Private Services Access so that AlloyDB is able to
+connect to your VPC. You should only need to do this once per VPC (per project).
+
+1. Set environment variables:
+
+    ```bash
+    export RANGE_NAME=my-allocated-range-default
+    export DESCRIPTION="peering range for alloydb-service"
+    ```
+
+1. Create an allocated IP address range:
+
+    ```bash
+    gcloud compute addresses create $RANGE_NAME \
+        --global \
+        --purpose=VPC_PEERING \
+        --prefix-length=16 \
+        --description="$DESCRIPTION" \
+        --network=default
+    ```
+
+1. Create a private connection:
+
+    ```bash
+    gcloud services vpc-peerings connect \
+        --service=servicenetworking.googleapis.com \
+        --ranges="$RANGE_NAME" \
+        --network=default
+    ```
+
+
+## Create an AlloyDB cluster
+
+1. Set environment variables. For security reasons, replace the example value of
+   `DB_PASS` with your own password:
+
+    ```bash
+    export CLUSTER=my-alloydb-cluster
+    export DB_PASS=my-alloydb-pass
+    export INSTANCE=my-alloydb-instance
+    export REGION=us-central1
+    ```
+
+1. Create an AlloyDB cluster:
+
+    ```bash
+    gcloud alloydb clusters create $CLUSTER \
+        --password=$DB_PASS \
+        --network=default \
+        --region=$REGION \
+        --project=$PROJECT_ID
+    ```
+
+1. Create a primary instance:
+
+    ```bash
+    gcloud alloydb instances create $INSTANCE \
+        --instance-type=PRIMARY \
+        --cpu-count=8 \
+        --region=$REGION \
+        --cluster=$CLUSTER \
+        --project=$PROJECT_ID
+    ```
+
+1. Get the AlloyDB IP address (this command requires `jq`):
+
+    ```bash
+    export ALLOYDB_IP=$(gcloud alloydb instances describe $INSTANCE \
+        --cluster=$CLUSTER \
+        --region=$REGION \
+        --format=json | jq \
+        --raw-output ".ipAddress")
+    ```
+
+1. Note the AlloyDB IP address for later use:
+
+    ```bash
+    echo $ALLOYDB_IP
+    ```
+
+## Set up connection to AlloyDB
+
+For this section, we will create a Google Compute Engine VM in the same VPC as the
+AlloyDB cluster. We can use this VM to connect to our AlloyDB cluster over its
+private IP address.
+
+1. Set environment variables:
+
+    ```bash
+    export ZONE=us-central1-a
+    export PROJECT_NUM=$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")
+    export VM_INSTANCE=alloydb-vm-instance
+    ```
+
+1. Create a Compute Engine VM:
+
+    ```bash
+    gcloud compute instances create $VM_INSTANCE \
+        --project=$PROJECT_ID \
+        --zone=$ZONE \
+        --machine-type=e2-medium \
+        --network-interface=network-tier=PREMIUM,stack-type=IPV4_ONLY,subnet=default \
+        --maintenance-policy=MIGRATE \
+        --provisioning-model=STANDARD \
+        --service-account=$PROJECT_NUM-compute@developer.gserviceaccount.com \
+        --scopes=https://www.googleapis.com/auth/cloud-platform \
+        --create-disk=auto-delete=yes,boot=yes,device-name=$VM_INSTANCE,image=projects/ubuntu-os-cloud/global/images/ubuntu-2004-focal-v20231025,mode=rw,size=10,type=projects/$PROJECT_ID/zones/$ZONE/diskTypes/pd-balanced \
+        --no-shielded-secure-boot \
+        --shielded-vtpm \
+        --shielded-integrity-monitoring \
+        --labels=goog-ec-src=vm_add-gcloud \
+        --reservation-affinity=any
+    ```
+
+1. Create an SSH tunnel through your GCE VM using port forwarding. This will
+   listen on `127.0.0.1:5432` and forward through the GCE VM to your AlloyDB
+   instance:
+
+    ```bash
+    gcloud compute ssh --project=$PROJECT_ID --zone=$ZONE $VM_INSTANCE \
+        -- -NL 5432:$ALLOYDB_IP:5432
+    ```
+
+    Keep this command running for as long as you are connected to
+    AlloyDB. You may wish to open a new terminal for the connection.
+
+1. Verify you can connect to your instance with the `psql` tool. Enter
+   the AlloyDB password when prompted:
+
+    ```bash
+    psql -h 127.0.0.1 -U postgres
+    ```
+
+## Initialize data in AlloyDB
+
+1. While connected using `psql`, create a database and switch to it:
+
+    ```sql
+    CREATE DATABASE assistantdemo;
+    \c assistantdemo
+    ```
+
+1. Install the [`pgvector`][pgvector] extension in the database:
+
+    ```sql
+    CREATE EXTENSION vector;
+    ```
+
+[pgvector]: https://github.com/pgvector/pgvector
+
+1. From the directory where you cloned the repo, change into the service directory:
+
+    ```bash
+    cd database-query-extension/extension_service
+    ```
+1. Make a copy of `example-config.yml` and name it `config.yml`:
+
+    ```bash
+    cp example-config.yml config.yml
+    ```
+
+1. Update `config.yml` with your database information:
+
+    ```yaml
+    host: 0.0.0.0
+    datastore:
+        # Example for postgres.py provider
+        kind: "postgres"
+        host: 127.0.0.1
+        port: 5432
+        # Update this with the database name
+        database: "assistantdemo"
+        # Update with database user, the default is `postgres`
+        user: "postgres"
+        # Update with database user password
+        password: "my-alloydb-pass"
+    ```
+
+1. Populate the database with data:
+
+    ```bash
+    python run_database_init.py
+    ```