Introduction

CalypsoAI is designed to provide a comprehensive solution that addresses the security risks inherent in the widespread use of Large Language Models (LLMs). This guide helps you install and configure the CalypsoAI solution in your environment.

Deployment Options

There are two (2) common methods of deploying CalypsoAI:

CalypsoAI instance installed on a single compute instance

CalypsoAI supplies access to the Docker image as defined in a provided Docker Compose file
Customer to supply hardware and networking infrastructure

CalypsoAI installed into a highly available Kubernetes Cluster

CalypsoAI supplies access to the Docker image as part of a provided Helm chart
Customer to supply the Kubernetes cluster infrastructure, including all networking and compute nodes

Architecture

CalypsoAI is built and deployed as an on-premises solution. CalypsoAI is packaged as a single container application, which also includes Keycloak (https://www.keycloak.org/) for user authentication. A second container runs a PostgreSQL (https://www.postgresql.org/) database that stores CalypsoAI data. The custom scanner functionality is installed separately. Containers can be deployed in a number of ways, and CalypsoAI leaves it to the customer to choose the method that best suits the organization. Because this is an on-premises solution, no data is ever sent back to CalypsoAI.

As an example at CalypsoAI, we deploy into a highly available Kubernetes cluster in a cai-moderator namespace via Helm. Helm is a tool that automates the creation, packaging, configuration, and deployment of Kubernetes applications by combining the configuration files into a single reusable package. If this is your chosen deployment option, CalypsoAI can supply you with the Helm charts.

CalypsoAI Components

The diagram below shows the different components of the CalypsoAI solution:

System Requirements & Prerequisites

Hardware

The following lists present the hardware requirements for a small Proof of Concept of CalypsoAI Moderator, along with the recommended AWS-specific instance types.

Moderator can be run on either a CPU-only or a GPU instance. If you are using custom scanners, the scanner application must be run on a GPU instance.

CPU-only instance requirements:

vCPUs - 16
Memory - 32.0 GiB
Clock Speed - 3.5 GHz
CPU Architecture - x86_64
Persistent Storage - 100.0 GiB

GPU instance requirements:

vCPUs - 4
Memory - 16.0 GiB
CPU Architecture - x86_64
Persistent Storage - 100.0 GiB
GPU - Nvidia A10G (CUDA Supported)

AWS EC2 instance type recommendation if no GPUs are available: c6i.4xlarge

GPU AWS EC2 instance type recommendation: g5.xlarge
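As a quick sanity check before installing, a candidate host can be compared against the minimums listed above. The following is a minimal Python sketch: the PROFILES table and the shortfalls function are hypothetical names introduced for illustration only, and the numbers are copied from the lists above.

```python
# Minimum proof-of-concept sizes, copied from the hardware lists above.
# PROFILES and shortfalls() are illustrative names, not part of CalypsoAI.
PROFILES = {
    "cpu": {"vcpus": 16, "memory_gib": 32.0, "storage_gib": 100.0},
    "gpu": {"vcpus": 4, "memory_gib": 16.0, "storage_gib": 100.0},
}


def shortfalls(profile: str, host: dict) -> list[str]:
    """Return the names of the resources where the host is below the minimum."""
    required = PROFILES[profile]
    return [name for name, minimum in required.items() if host.get(name, 0) < minimum]


# A c6i.4xlarge offers 16 vCPUs and 32 GiB of memory; storage is chosen at launch.
candidate = {"vcpus": 16, "memory_gib": 32.0, "storage_gib": 100.0}
print(shortfalls("cpu", candidate))
```

An empty list means the host meets every listed minimum for that profile. The sketch does not check clock speed, CPU architecture, or GPU model, which still need to be confirmed by hand.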

If deploying into a Kubernetes cluster, CalypsoAI recommends a single node for the initial implementation and testing of Moderator and another single GPU node for the cai-scanner.

IMPORTANT NOTE FOR AZURE AKS CLUSTER DEPLOYMENTS:

Due to how Azure provisions GPU-enabled nodes in AKS, AKS cluster deployments require very specific node pool settings to deploy cai-scanner.

The node pool needs to use a VM size from the NCADS_A100_v4 family, which may require a quota limit increase request.

By default, GPU-enabled nodes come with preinstalled drivers. To avoid compatibility issues, the default GPU driver installation needs to be skipped. This can be done from the command line with the aks-preview extension, which is installed with the commands below:

# Install the aks-preview extension
az extension add --name aks-preview
# Update the aks-preview extension
az extension update --name aks-preview

Once this extension is installed, the node pool can be created via the below command:

az aks nodepool add \
  --resource-group <resource_group> \
  --cluster-name <cluster_name> \
  --name <nodepool_name> \
  --node-count 1 \
  --node-vm-size Standard_NC24ads_A100_v4 \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 1 \
  --labels node_group=cai-scanner \
  --skip-gpu-driver-install

With this node pool up and running, the scanner application can be deployed as outlined later in the guide.

Networking Requirements

The compute instance must have a security group (or equivalent firewall rules) that allows network access according to the following rules:

Inbound

TCP 8080 (Keycloak Authentication)
TCP 5500 (CalypsoAI Application)
TCP 22 (SSH into the instance to perform installation tasks; optional)

Outbound (used to retrieve software and updates, and to communicate with the LLM provider)

HTTP 80
HTTPS 443

If you are using custom scanners, you will need to allow inbound access to the instance running the scanners via port 8000.
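Once the rules are in place, reachability of the inbound ports can be confirmed from a machine that is expected to have access. The following is a minimal Python sketch using only the standard library; port_open and check_inbound are hypothetical helper names, and the check only confirms that something accepts TCP connections on the port, not that the right service is answering.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_inbound(host: str, ports=(8080, 5500)) -> dict:
    """Map each required inbound port to whether it accepted a connection."""
    return {port: port_open(host, port) for port in ports}
```

For example, check_inbound("10.1.1.1") would test the Keycloak and application ports on a host at that (placeholder) address, and check_inbound("10.1.1.1", ports=(8000,)) would test the custom scanner port.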

SSL Certificates

Keycloak is the mechanism that CalypsoAI uses to authenticate users of the solution. Keycloak requires the use of HTTPS. In order for HTTPS to function properly, SSL certificates are required.

Using a certificate from a trusted Certificate Authority is highly recommended for any production website or application. However, for testing purposes you can use self-signed certificates.

These certificates will need to be added to the load balancer of your choice. If following the AWS example below, you can import the certificates using AWS Certificate Manager (ACM). Please see the following as reference:

https://docs.aws.amazon.com/acm/latest/userguide/import-certificate-api-cli.html

Once imported, you will have an opportunity to select these certificates during the creation of the Application Load Balancer.

Self-signed certificates are not issued by a trusted Certificate Authority (CA), so browsers like Google Chrome cannot verify the authenticity of the website. When a user tries to access a website with a self-signed certificate, Chrome displays a security warning indicating that the connection might not be secure. To avoid this warning, make sure that every computer and browser used to access the CalypsoAI solution has the self-signed certificates installed and trusted.

Many resources explain both the concepts behind self-signed certificates and how to create them. One of these articles is referenced here.

Application Load Balancer

CalypsoAI requires a load balancer to handle the HTTP-to-HTTPS redirect and to route requests to both the CalypsoAI software and Keycloak. CalypsoAI uses two ports: one for Keycloak (to authenticate users) and one for the application itself. The load balancer routes the /auth URI to port 8080 and / to port 5500 on the instance.

The following example walks through the instructions for setting up an Application Load Balancer in AWS. The same concepts can be applied to other cloud providers or local servers such as NGINX.

Creating an Application Load Balancer (ALB) in AWS

For reference: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/create-application-load-balancer.html

Step 1: Configure Target Groups

CalypsoAI will require the creation of two (2) target groups. Each target group will have a target type of “Instances.”

Target Group #1:

Name: moderator-5500
Protocol/Port: HTTP/5500
VPC: The VPC in which your compute instance is running.
Protocol Version: HTTP1
Healthcheck Protocol: HTTP
Healthcheck Path: /
Healthcheck Port: Traffic Port

Target Group #2:

Name: moderator-8080
Protocol/Port: HTTP/8080
VPC: The VPC in which your compute instance is running.
Protocol Version: HTTP1
Healthcheck Protocol: HTTP
Healthcheck Path: /
Healthcheck Port: 5500 (*advanced settings, Override)

Step 2: Create the Application Load Balancer

For this example, the ALB will be an internal-facing ALB. This assumes that private subnets have been created within your selected VPC.

Security Group

Inbound

HTTP 80
HTTPS 443

Outbound

CalypsoAI instance security group

Listeners and Routing

You will need to create 2 listeners along with associated rules.

Listener #1

Protocol:Port - HTTP:80
Default Action - Redirect to HTTPS:443

Listener #2

Protocol:Port - HTTPS:443
Default Action - Forward to target group “moderator-5500”
Rule 1 Condition - Path Pattern is /auth/*
Rule 1 Action - Forward to target group “moderator-8080”
Rule 2 Condition - If no other rule applies
Rule 2 Action - Forward to target group “moderator-5500”
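The effect of the HTTPS listener rules above can be illustrated with a few lines of Python. This is only a sketch of the routing decision, not of how the ALB is implemented: target_group is a hypothetical function name, and fnmatchcase is used as an approximation of the load balancer's wildcard matching.

```python
from fnmatch import fnmatchcase

# Rules of the HTTPS:443 listener above, in priority order.
RULES = [("/auth/*", "moderator-8080")]
DEFAULT_TARGET = "moderator-5500"


def target_group(path: str) -> str:
    """Return the target group that would receive a request for this path."""
    for pattern, target in RULES:
        if fnmatchcase(path, pattern):
            return target
    # No rule matched, so the default action applies.
    return DEFAULT_TARGET


for path in ("/auth/realms/calypsoai", "/", "/docs"):
    print(path, "->", target_group(path))
```

Requests under /auth/ go to Keycloak on port 8080, and everything else falls through to the application on port 5500. The same decision table applies when the routing is done by NGINX or another proxy.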

DNS Requirements

Unless deploying on localhost, CalypsoAI requires a fully qualified domain name (FQDN). This domain name should have an A record that points either to the Application Load Balancer set up in the previous step or to another load balancer that has been configured similarly.

Optionally, domain names can be resolved by adding entries to the local hosts file on the compute instances. Entries in the local hosts file have the added advantage that the system can run the application server even when disconnected from the network. If you are using a hosts file to resolve names, the file must be configured correctly on each computer that will connect to the Moderator solution.

Most importantly, the name chosen and configured with either method must exactly match the value of the CAI_MODERATOR_BASE_URL variable placed inside the .env file created in later steps.
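A mismatch between the DNS name and the base URL is easy to miss, so it can help to compare the two mechanically. The following is a minimal Python sketch; host_matches is a hypothetical helper, and the example host name is the one used as an example elsewhere in this guide.

```python
from urllib.parse import urlsplit


def host_matches(base_url: str, dns_name: str) -> bool:
    """True if the host part of base_url equals dns_name (case-insensitive)."""
    host = urlsplit(base_url).hostname  # None when the URL has no scheme
    return host is not None and host == dns_name.strip().rstrip(".").lower()


print(host_matches("https://lighthouse.moderator.dev", "lighthouse.moderator.dev"))
print(host_matches("https://lighthouse.moderator.dev", "moderator.dev"))
```

The function returns False when the base URL is missing its scheme, which also catches a value that was entered as a bare host name.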

Large Language Model Provider

CalypsoAI does not provide access to any LLM providers by default. The customer must have at least one provider available. The following information is required to configure CalypsoAI:

Provider Name (OpenAI, Cohere, AI21, Azure)
API Key
Model Name

For Azure OpenAI, the following is required:

API Key
Resource Name (Model)
Deployment ID

As stated in the Networking Requirements for the instance, network access over HTTPS:443 is required, and access to the provider API endpoint resource is needed.

User Access Management

CalypsoAI supports application requests through a full RESTful API. Please see the API section later in this document.

CalypsoAI supports the use of SAML authentication and authorization for Single Sign-On (SSO). Single Sign-On is a feature that allows your users to authenticate their identity once within CalypsoAI using existing credentials. Keycloak, an open-source Identity and Access Management tool developed by Red Hat, handles user authentication. Keycloak supports popular social networks, SAML 2.0 Identity Providers (IdPs), existing OpenID Connect providers, and Active Directory / LDAP servers.

Installation

Before starting, please ensure you have been provided the following:

Harbor access credentials provided via 1Password
A CalypsoAI Docker image version (tag)

Harbor is used as a private registry where you will be able to pull the CalypsoAI Moderator Docker image or Helm chart from the repositories depending on the preferred installation method.

Option A: Simple Container Management

The instructions below enable you to run CalypsoAI using Docker Compose. This simplifies the deployment by defining the group of services and their configuration in a single YAML file for the Moderator application and PostgreSQL, and in a second YAML file for the custom scanner functionality. Environment variables are passed in from a .env file, and a volume provides persistent storage, which saves data between runs.

If using custom scanners, be sure that these prerequisite steps are completed on both the instance being used for CalypsoAI as well as the instance being used for the custom scanners application.

Prerequisite: Install Docker Engine, Docker CLI, and Docker Compose

The steps below are for an Ubuntu Linux distribution. They are only required if you don't have Docker installed. See the Docker help documentation for setup on other systems. Docker Desktop is not required.

Step 1. Set up the Docker repository.

# Remove any conflicting packages
for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do sudo apt-get remove $pkg; done

# Update the package index
sudo apt-get update

# Install packages to allow apt to use a repository over HTTPS
sudo apt-get install ca-certificates curl gnupg

# Add Docker's GPG key
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

# Set up the repository
echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Update the apt package index 
sudo apt-get update

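Step 2. Install Docker Engine, CLI, containerd, and Docker Compose.

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin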

Step 3. Optional: Post-installation setup.

https://docs.docker.com/engine/install/linux-postinstall/

Manage Docker as a non-root user
Start Docker on Boot with systemd

Step 4. Verify Docker and Docker Compose installations.

sudo docker version 
 
sudo docker compose version

GPU

If you want to run CalypsoAI with Nvidia GPU, first install the Nvidia driver on the underlying instance. This is a requirement for the cai-scanner (custom scanner functionality):

sudo apt install nvidia-driver-535 --no-install-recommends

Then install the Nvidia Container Toolkit:

Reference: Installing the NVIDIA Container Toolkit (NVIDIA Container Toolkit 1.16.0 documentation)

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list \
  && \
    sudo apt-get update

sudo apt-get install -y nvidia-container-toolkit

sudo nvidia-ctk runtime configure --runtime=docker

Install CalypsoAI Moderator

Step 1. Log in to the Harbor registry using the provided access credentials. You may omit the --password option to be prompted for entering a hidden value.

sudo docker login --username {CAI supplied username} --password {CAI supplied password} https://harbor.calypsoai.app

Step 2. Create a folder called moderator and enter it.

mkdir moderator

cd moderator

Step 3. Copy the code shown below into a docker-compose.yaml file.

version: '3.8'
services:
  moderator:
    container_name: cai-moderator
    restart: unless-stopped
    networks:
      - 'moderator'
    image: ${IMAGE}:${TAG}
    depends_on:
      postgres:
        condition: 'service_healthy'
    ports:
      - 5500:5500
      - 8080:8080
    environment:
      CAI_MODERATOR_BASE_URL: ${CAI_MODERATOR_BASE_URL}
      CAI_MODERATOR_SCANNER_SERVER_URL: ${CAI_MODERATOR_SCANNER_SERVER_URL}
      CAI_MODERATOR_DB_ADMIN_PASSWORD: ${CAI_MODERATOR_DB_ADMIN_PASSWORD}
      CAI_MODERATOR_DB_MODERATOR_PASSWORD: ${CAI_MODERATOR_DB_MODERATOR_PASSWORD}
      CAI_MODERATOR_DB_AUTH_PASSWORD: ${CAI_MODERATOR_DB_AUTH_PASSWORD}
      CAI_MODERATOR_EMAIL_USER: ${CAI_MODERATOR_EMAIL_USER}
      CAI_MODERATOR_EMAIL_PASSWORD: ${CAI_MODERATOR_EMAIL_PASSWORD}
      CAI_MODERATOR_EMAIL_HOST: ${CAI_MODERATOR_EMAIL_HOST}
      CAI_MODERATOR_AUTH: ${CAI_MODERATOR_AUTH}
  postgres:
    image: "timescale/timescaledb:2.10.3-pg15-bitnami"
    restart: unless-stopped
    networks:
      - 'moderator'
    container_name: cai-moderator-postgres
    ports:
      - 5432:5432
    environment:
      POSTGRES_USER: postgres
      POSTGRES_DB: postgres
      POSTGRES_PASSWORD: ${CAI_MODERATOR_DB_ADMIN_PASSWORD}
      PGDATA: /data/postgres
    volumes:
      - ./postgres/:/data/postgres
    healthcheck:
        test: [ "CMD-SHELL", "pg_isready -U postgres -d moderator" ]
        interval: 10s
        timeout: 5s
        retries: 5
networks:
  moderator:

If running on a GPU instance you are required to add the following under the moderator service:

    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            count: 1
            capabilities: [gpu]

With this being the case, the docker-compose.yaml will look like this:

version: '3.8'
services:
  moderator:
    container_name: cai-moderator
    restart: unless-stopped
    networks:
      - 'moderator'
    image: ${IMAGE}:${TAG}
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            count: 1
            capabilities: [gpu]
    depends_on:
      postgres:
        condition: 'service_healthy'
    ports:
      - 5500:5500
      - 8080:8080
    environment:
      CAI_MODERATOR_BASE_URL: ${CAI_MODERATOR_BASE_URL}
      CAI_MODERATOR_SCANNER_SERVER_URL: ${CAI_MODERATOR_SCANNER_SERVER_URL}
      CAI_MODERATOR_DB_ADMIN_PASSWORD: ${CAI_MODERATOR_DB_ADMIN_PASSWORD}
      CAI_MODERATOR_DB_MODERATOR_PASSWORD: ${CAI_MODERATOR_DB_MODERATOR_PASSWORD}
      CAI_MODERATOR_DB_AUTH_PASSWORD: ${CAI_MODERATOR_DB_AUTH_PASSWORD}
      CAI_MODERATOR_EMAIL_USER: ${CAI_MODERATOR_EMAIL_USER}
      CAI_MODERATOR_EMAIL_PASSWORD: ${CAI_MODERATOR_EMAIL_PASSWORD}
      CAI_MODERATOR_EMAIL_HOST: ${CAI_MODERATOR_EMAIL_HOST}
      CAI_MODERATOR_AUTH: ${CAI_MODERATOR_AUTH}
  postgres:
    image: "timescale/timescaledb:2.10.3-pg15-bitnami"
    restart: unless-stopped
    networks:
      - 'moderator'
    container_name: cai-moderator-postgres
    ports:
      - 5432:5432
    environment:
      POSTGRES_USER: postgres
      POSTGRES_DB: postgres
      POSTGRES_PASSWORD: ${CAI_MODERATOR_DB_ADMIN_PASSWORD}
      PGDATA: /data/postgres
    volumes:
      - ./postgres/:/data/postgres
    healthcheck:
        test: [ "CMD-SHELL", "pg_isready -U postgres -d moderator" ]
        interval: 10s
        timeout: 5s
        retries: 5
networks:
  moderator:

Step 4. Copy the code shown below into a .env file.

Note: Update values in angle brackets. The BASE URL is the domain name URL that you created ahead of time when defining DNS entries. Please see DNS section above.

Note: Moderator comes with Keycloak for authentication.

If you do not require authentication, you can set the `CAI_MODERATOR_AUTH` variable to false on installation. This will disable the authentication requirement.

If you change the default Keycloak password, add and set `CAI_MODERATOR_AUTH_ADMIN_PASSWORD` in the .env and docker-compose.yaml.

export IMAGE=harbor.calypsoai.app/calypsoai/cai_moderator
export TAG=<Moderator Docker image version, example: v3.51.0-full>
export CAI_MODERATOR_BASE_URL=<internal customer defined URL, example: "https://lighthouse.moderator.dev">
export CAI_MODERATOR_SCANNER_SERVER_URL=<endpoint your scanner will be available at via port 8000 via the path /v1, example: "http://10.1.1.1:8000/v1">
export CAI_MODERATOR_DB_ADMIN_PASSWORD=postgres
export CAI_MODERATOR_DB_MODERATOR_PASSWORD=moderator
export CAI_MODERATOR_DB_AUTH_PASSWORD=keycloak
export CAI_MODERATOR_EMAIL_USER=<email user>
export CAI_MODERATOR_EMAIL_PASSWORD=<email password>
export CAI_MODERATOR_EMAIL_HOST=<email host>
export CAI_MODERATOR_AUTH=true
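Because the template above contains angle-bracket placeholders, it is worth confirming that none remain before starting the containers. The following is a minimal Python sketch; unfilled_variables is a hypothetical helper, and the parsing covers only the simple export NAME=value form used in this guide.

```python
import re

# Matches lines such as: export NAME=value   (the "export " prefix is optional)
LINE = re.compile(r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def unfilled_variables(env_text: str) -> list[str]:
    """Return variable names whose value is empty or still contains <...>."""
    names = []
    for line in env_text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # blank lines and comments
        name, value = match.group(1), match.group(2).strip()
        if not value or re.search(r"<[^>]*>", value):
            names.append(name)
    return names


sample = "export TAG=<Moderator Docker image version>\nexport CAI_MODERATOR_AUTH=true\n"
print(unfilled_variables(sample))
```

To check a real file, pass in its contents, for example unfilled_variables(open(".env").read()). An empty list means every variable has a value and no placeholder is left.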

Step 5. Pull the Moderator and PostgreSQL Docker images and run the containers. A full Moderator image can take 10 minutes to pull. A `postgres` subdirectory is created in the `moderator` parent directory, which contains the database to persist data between runs.

  sudo docker compose up -d

Step 6. Browse to the URL set in CAI_MODERATOR_BASE_URL and verify a login prompt is displayed.

Install CalypsoAI Scanner

Step 1. Log in to the Harbor registry using the provided access credentials. You may omit the --password option to be prompted for entering a hidden value.

sudo docker login --username {CAI supplied username} --password {CAI supplied password} https://harbor.calypsoai.app

Step 2. Create a folder called "scanner" and enter it.

mkdir scanner

cd scanner

Step 3. Copy the code shown below into a docker-compose.yaml file.

services:
  cai-scanner:
    container_name: cai-scanner
    restart: unless-stopped
    networks:
      - 'scanner'
    image: ${SCANNER_IMAGE}:${SCANNER_TAG}
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            count: 1
            capabilities: [gpu]
    ports:
      - 8000:8000
    environment:
      MODEL: 'CalypsoAI/Phi-3-medium-128k-instruct-GPTQ-4-bit'
      QUANTIZATION: gptq
      DTYPE: auto
      MAX_MODEL_LEN: 31000
      GPU_MEMORY_UTILIZATION: 0.9
networks:
  scanner:

Step 4. Copy the code shown below into a .env file.

export SCANNER_IMAGE=harbor.calypsoai.app/calypsoai/cai_scanner
export SCANNER_TAG=<Moderator Scanner Docker image version, example: v0.0.1>

Step 5. Run the container via docker compose with the below command.

 sudo docker compose up -d

Troubleshooting

# Verify running containers
sudo docker compose ls
sudo docker compose ps

# View logs (add -f to follow; optionally name one or more services)
sudo docker compose logs [-f] [moderator] [postgres]

# View container real time events
sudo docker compose events

# Check running processes
sudo docker compose top

Option B: Kubernetes Cluster

The following instructions assume:

Knowledge of Kubernetes and how to configure it as well as access rights to create resources as needed
Use of Kubernetes tools, such as kubectl, to run interactive commands
An existing, operational Kubernetes Cluster, with an ingress controller deployed
(Optional) Kubernetes node groups labeled with the key node_group and the values "moderator" and, for the custom scanner deployment, "cai-scanner" (detailed below)

Create a Kubernetes Namespace

Create the cai-moderator namespace.

kubectl create ns cai-moderator

If you are also installing custom scanners, create a second namespace called cai-scanner.

kubectl create ns cai-scanner

Set Up an Ingress Controller

An ingress controller is a load balancer that routes external traffic into your Kubernetes cluster and is responsible for L4-L7 network services. For this deployment, the ingress controller operates at Layer 7, and Ingress objects define how HTTP and HTTPS traffic is routed.

Cloud-based ingress controllers include:

AKS Application Gateway Ingress Controller
AWS ALB Ingress controller
GCP GLBC/GCE Ingress Controller

Open-source ingress controllers include:

Voyager
F5
HAProxy
Istio
Kong
NGINX
Skipper

Moderator uses two ports, one for Keycloak user authentication and the other for the Moderator application itself. The ingress controller (load balancer) is used to route the /auth URI to port 8080 on the Kubernetes service and then / (root) to port 5500 on the Kubernetes service. The following diagram shows how this works with the Istio ingress controller.

The configuration steps below show an example of how to do this with NGINX. CalypsoAI can supply you with examples for both the NGINX and Istio ingress controllers. Whichever ingress controller (load balancer) is chosen, the same principles apply.

If using the custom scanner functionality, the installation should look like below.

Kubernetes Node Affinity

Pod scheduling is one of the most important aspects of Kubernetes cluster management. How pods are distributed across nodes directly impacts performance and resource utilization. Kubernetes node affinity is an advanced scheduling feature that helps administrators optimize the distribution of pods across a cluster. Node affinity enables administrators to match pods according to the labels on nodes.

The deployment will optionally use node affinity to look for a node_group label with a value of moderator or cai-scanner on the Kubernetes node. Moderator requires either a large CPU with plenty of memory or a GPU, and cai-scanner requires a GPU; therefore, it can be helpful to keep these nodes separate from the nodes used by other applications in a shared cluster. By default, this affinity is disabled. It can be turned on with the following value in the values.yaml file.

affinity: true

By way of example, the underlying AWS instance type that CalypsoAI uses is c6i.4xlarge. The node group and affinity are used in the supplied Helm chart so that the Moderator pod can be given a different instance type from the rest of the cluster.

...
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: "node_group"
          operator: In
          values:
          - "moderator"
...

This affinity is optional (affinity: false) if you want to use the default node pool in the Kubernetes cluster, as long as that node pool is using the recommended hardware requirements.
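To make the selection rule concrete, the following Python sketch evaluates the nodeSelectorTerms from the chart excerpt against a node's labels. It is an illustration of the matching semantics only (terms are ORed, expressions within a term are ANDed) and handles only the In operator; node_matches is a hypothetical function, not part of Kubernetes or the Helm chart.

```python
def node_matches(node_labels: dict, selector_terms: list) -> bool:
    """Evaluate nodeSelectorTerms against a node's labels.

    Terms are ORed together; the matchExpressions inside one term are ANDed.
    Only the "In" operator, the one used in the chart excerpt, is handled.
    """
    for term in selector_terms:
        expressions = term.get("matchExpressions", [])
        # An empty term matches no nodes.
        if expressions and all(
            expr["operator"] == "In" and node_labels.get(expr["key"]) in expr["values"]
            for expr in expressions
        ):
            return True
    return False


MODERATOR_TERMS = [
    {"matchExpressions": [{"key": "node_group", "operator": "In", "values": ["moderator"]}]}
]

print(node_matches({"node_group": "moderator"}, MODERATOR_TERMS))
print(node_matches({"node_group": "cai-scanner"}, MODERATOR_TERMS))
```

With affinity enabled, the Moderator pod is only scheduled onto nodes for which this check would be true, so a node labeled node_group=cai-scanner, or a node with no node_group label at all, is not eligible.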

The PostgreSQL installation uses the default CPU/memory resources along with 8Gi of persistent storage.

resources:
  requests:
    cpu: "250m"
    memory: "256Mi"

GPU Operator

If you choose to use GPUs with Kubernetes, you will need to install the Nvidia GPU Operator in order to provide Kubernetes with access to the GPU.

You can install this with a simple helm command:

 helm repo add nvidia https://nvidia.github.io/gpu-operator \
   && helm repo update

helm install --wait --generate-name \
     nvidia/gpu-operator

Reference: Install NVIDIA GPU Operator (gpu-operator 1.8 documentation)

Install Helm Charts

CalypsoAI Moderator uses two Helm charts, a PostgreSQL chart and the Moderator application chart. If using custom scanners, there is a third Helm chart for the custom scanners.

Install the PostgreSQL Helm Chart

Replace the PostgreSQL password change_me with your desired password. Please do NOT use any wildcard characters. You will need to reference this password when you create a values.yaml file for the cai-moderator Helm chart installation.

helm install --set global.postgresql.auth.postgresPassword='change_me' --set image.repository=timescale/timescaledb --set image.tag=2.10.2-pg15-bitnami --set targetRevision=12.2.8 cai-moderator-postgres oci://registry-1.docker.io/bitnamicharts/postgresql -n cai-moderator

Install the CAI-Moderator Helm Chart

Step 1. Create a docker-registry secret called regcred for Harbor access to pull the cai-moderator Docker image and Helm chart. Replace cust_lighthouse and cust_password with the provided access credentials.

kubectl create secret docker-registry regcred -n cai-moderator --docker-server=harbor.calypsoai.app --docker-username=cust_lighthouse --docker-password="cust_password"

Step 2. Log in to Harbor with your login credentials.

helm registry login harbor.calypsoai.app

Step 3. Create a values.yaml to use with the Helm chart.

The CAI_MODERATOR_BASE_URL environment variable must match the load balancer URL. This is the URL you will browse to access Moderator and must be prefixed with https://.

Please reach out to a CalypsoAI team member for setting up CAI_MODERATOR_EMAIL_USER and CAI_MODERATOR_EMAIL_PASSWORD. They are not required to proceed with the installation.

Moderator comes with Keycloak for authentication.

If you do not require authentication you can set the CAI_MODERATOR_AUTH variable to false on installation. This will disable the authentication requirement.

If you change the default Keycloak password, add and set CAI_MODERATOR_AUTH_ADMIN_PASSWORD in the secrets.

affinity: <true or false>
imagePullSecrets:
  - name: regcred
env:
  CAI_MODERATOR_BASE_URL: <load balancer URL>
  CAI_MODERATOR_AUTH: true
secrets:
  CAI_MODERATOR_DB_ADMIN_PASSWORD: "change_me"
  CAI_MODERATOR_EMAIL_USER: ""
  CAI_MODERATOR_EMAIL_PASSWORD: ""

Step 4. In your local working directory, run the cai-moderator helm chart installation with the values file filled in.

helm install cai-moderator oci://harbor.calypsoai.app/calypsoai/cai-moderator --version 1.0.21 -n cai-moderator -f values.yaml

Install the CAI-Scanner Helm Chart

Reminder: The cai-scanner application needs its own dedicated GPU to run. For details on compatible instances, please review the requirements for a GPU instance in the Hardware section above.

Step 1. As done above for the cai-moderator namespace, create a docker-registry secret called regcred for Harbor access to pull the cai-scanner Docker image and Helm chart. Replace cust_lighthouse and cust_password with the provided access credentials.

kubectl create secret docker-registry regcred -n cai-scanner --docker-server=harbor.calypsoai.app --docker-username=cust_lighthouse --docker-password="cust_password"

Step 2. Log in to Harbor with your login credentials.

helm registry login harbor.calypsoai.app

Step 3. Create a values.yaml to use with the Helm chart.

imagePullSecrets:
  - name: regcred

Step 4. In your local working directory, run the cai-scanner helm chart installation with the values file filled in.

helm install cai-scanner oci://harbor.calypsoai.app/calypsoai/cai-scanner --version 1.0.6 -n cai-scanner -f values.yaml

Creating Users

Step 1. Browse to the application and log in with the default credentials below:

Username: admin
Password: pass

Step 2. Click Settings in the navigation panel.

Step 3. Click the Users tab. The Users table opens. Click Invite User.

Step 4. A pop-up window opens. Enter the invitee’s name and email address, and select their role from the Role drop-down list. Click Email Invite to send the invitation via email, or click Copy invite link to share the link with the invitee. The invited user will use the link to complete their account setup.

The user will have an Invited status until they complete their account setup by following the invite link to set their password. Once that is complete, the Invited label will no longer be visible on the user’s account and the user will be able to access the platform.

Configuring Identity Provider for Single Sign-On

*Optional but recommended

Identity providers (IdPs) allow CalypsoAI Moderator to use existing external accounts for authentication. Moderator supports multiple IdPs. Please refer to the specific IdP documentation.

The below example uses a Google SSO setup.

The following instructions describe how to set up a project using Google as an IdP and how to retrieve the client ID and secret for CalypsoAI Moderator. After setup, users may log into CalypsoAI Moderator using their Google account.

Step 1. Create a project in Google Cloud Platform (GCP):

Open your browser and navigate to:

https://console.cloud.google.com/

and follow the prompts to create a new project.

Step 2. Configure a consent screen in Google Cloud Platform (GCP)

Open your browser and navigate to:

https://console.cloud.google.com/apis/credentials/consent

and follow the prompts to configure a consent screen.

Step 3. Establish your credentials.

Open your browser and navigate to:

https://console.cloud.google.com/apis/credentials

Click + CREATE CREDENTIALS and select OAuth client ID
Choose Web application as Application type and give it a name.
Enter the following URI under Authorized redirect URIs:
{your_domain}/auth/realms/calypsoai/broker/google/endpoint
Click CREATE
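The redirect URI must be entered exactly, and a doubled slash is a common slip when the domain already ends in one. The following small Python sketch builds the value from the pattern above; google_redirect_uri is a hypothetical helper, and it assumes that {your_domain} includes the https:// scheme, as CAI_MODERATOR_BASE_URL does.

```python
def google_redirect_uri(domain: str) -> str:
    """Build the Authorized redirect URI for the Google identity provider."""
    # Strip any trailing slash so the result never contains "//auth".
    return f"{domain.rstrip('/')}/auth/realms/calypsoai/broker/google/endpoint"


print(google_redirect_uri("https://lighthouse.moderator.dev/"))
```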

Enabling Custom Scanners

To enable the custom scanners feature, you will need to use the Moderator API. First create an API key, if you do not have one, as described in the “Generate Moderator API Token” section below.

In a terminal, run the command below after replacing the placeholder {token} with your API key and {moderator_url} with the URL that points to your Moderator installation.

curl -X PATCH -H "Authorization: Bearer {token}" -H 'Content-Type: application/json' '{moderator_url}/backend/v1/internal/features' --data-raw '{"orgId": "calypsoai", "values": {"custom_scanners": {"enabled": true}}}'

You should see a response that says {"message":"Features updated for org"}. Custom scanners are now enabled for your on-premises Moderator installation.
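The same request can be issued from Python if curl is not available. The following is a minimal sketch using only the standard library; the endpoint path, headers, and body are taken from the curl command above, while build_feature_request and enable_custom_scanners are hypothetical function names chosen for this example.

```python
import json
import urllib.request


def build_feature_request(moderator_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the PATCH request that enables custom scanners."""
    body = {"orgId": "calypsoai", "values": {"custom_scanners": {"enabled": True}}}
    return urllib.request.Request(
        url=f"{moderator_url.rstrip('/')}/backend/v1/internal/features",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


def enable_custom_scanners(moderator_url: str, token: str) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_feature_request(moderator_url, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling enable_custom_scanners with your Moderator URL and API key sends the request. Building the request separately makes it possible to inspect the URL and body before anything is sent.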

API Documentation

Once Moderator is installed, access the internal API documentation by navigating to the following address:

https://{url of moderator}/docs

API Example

The following instructions describe the steps required to interact with the CalypsoAI Moderator API and provide simple use-case examples.

Generate Moderator API Token

To access CalypsoAI Moderator via the API, generate and retrieve the API token, which requires admin permission.

Step 1. Log into the CalypsoAI Moderator environment via the UI.

Step 2. Navigate to User Profile by clicking the user profile icon at the bottom left of the screen.

Step 3. In the API Key Token section, click Generate API Token.

Step 4. Enter a name for the API Token and an expiration date in the text boxes in the popup window. Click Save.

Step 5. The API token appears on the screen.

Note: Store the token in a secure place. Once created, the token is not accessible from the Moderator system.

Step 6. Use the API Token to interact with CalypsoAI Moderator via the API.
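
Because the token cannot be retrieved again after creation, one option is to keep it out of source files and read it from the environment at run time. The sketch below assumes a hypothetical environment variable named MODERATOR_API_TOKEN; neither the variable name nor the helper functions are defined by Moderator.

```python
import os


def load_api_token(env_var: str = "MODERATOR_API_TOKEN") -> str:
    """Read the API token from an environment variable (hypothetical name)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your API token")
    return token


def auth_headers(token: str) -> dict:
    """Build the Authorization header sent with every API request."""
    return {"Authorization": f"Bearer {token}"}
```

In the scripts that follow, token = load_api_token() could then replace the hard-coded "<YOUR_API_TOKEN>" placeholder.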

CalypsoAI Moderator API Examples

Send a prompt to an LLM.
Retrieve all existing prompts.
Convert a prompt into a conversational chat.

Send a Prompt to an LLM

import requests

token = "<YOUR_API_TOKEN>"
url = "<YOUR_MODERATOR_BASE_URL>"  # e.g. https://company.moderator.com (the URL you use to log into Moderator via your browser)

# We pass the API token in the authorization header of every request
headers = {"Authorization": f"Bearer {token}"}

(endpoint,) = requests.get(f"{url}/backend/v1/endpoints", headers=headers).json()["endpoints"]

# Use the default provider for the endpoint
provider_id = endpoint["config"]["providers"][0]

prompt = "What's the weather usually like in Antarctica?"

response = requests.post(
    f"{url}/backend/v1/prompts",
    headers=headers,
    json={
        "event": "prompt.create",
        "endpointId": endpoint["id"],
        "providerIds": [provider_id],
        "data": prompt,
    },
)

# Verify we received a successful response
if response.status_code != 200:
    print(response)
    exit(1)

response_json = response.json()
# Extract the prompt id from the response, we can use this later to fetch the prompt data again
prompt_id, result = response_json["id"], response_json["result"]

if result["outcome"] == "blocked":
    print("Prompt blocked")
    print(result)
    exit(1)

print("Prompt ID: ", prompt_id)
print(result["providerResults"][provider_id]["data"])

# Get a particular prompt
print(requests.get(f"{url}/backend/v1/prompts/{prompt_id}", headers=headers).json())
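
When the script above is reused in a larger program, the response handling could be pulled into a small helper instead of calling exit(). The sketch below relies only on the fields the script reads (id, result, outcome, providerResults). The function name and the sample values, including the "cleared" outcome, are hypothetical; the script above only shows that a blocked prompt reports "blocked".

```python
from typing import Optional


def summarize_prompt_response(response_json: dict, provider_id: str) -> dict:
    """Reduce a prompt response to its id, blocked flag, and LLM text."""
    result = response_json["result"]
    blocked = result["outcome"] == "blocked"
    # A blocked prompt never reached the provider, so there is no text to return.
    text: Optional[str] = None
    if not blocked:
        text = result["providerResults"][provider_id]["data"]
    return {"prompt_id": response_json["id"], "blocked": blocked, "text": text}


# Hypothetical sample shaped like the fields used in the script above.
sample = {
    "id": "prompt-123",
    "result": {
        "outcome": "cleared",  # assumed value; anything other than "blocked"
        "providerResults": {"provider-1": {"data": "Cold and windy."}},
    },
}
print(summarize_prompt_response(sample, "provider-1"))
```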

Retrieve All Prompts

Retrieve all prompts and store them in a JSON Lines file.

import json
import requests

token = "<YOUR_API_TOKEN>"
url = "<YOUR_MODERATOR_BASE_URL>"  # e.g. https://company.moderator.com

headers = {"Authorization": f"Bearer {token}"}

with open("prompts.jsonl", "w") as f:
    # Page through the prompt history, 100 prompts at a time
    last_prompt_id = None
    while prompts := requests.get(
        f"{url}/backend/v1/prompts",
        headers=headers,
        params={
            "before": last_prompt_id,
            "limit": 100,
            "onlyUser": False,
        },
    ).json()["prompts"]:
        for prompt in prompts:
            f.write(json.dumps(prompt) + "\n")
        # The id of the last prompt on this page is the cursor for the next page
        last_prompt_id = prompts[-1]["id"]
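
The cursor logic in the script above can also be written as a generator that is independent of the HTTP call, which makes it testable without a running Moderator instance. This is a sketch under assumptions: iter_prompts and fake_fetch_page are hypothetical names, and the stand-in page function assumes that "before" returns only prompts listed after the given id, which the script above implies but this guide does not state.

```python
from typing import Callable, Iterator, List, Optional


def iter_prompts(fetch_page: Callable[[Optional[str]], List[dict]]) -> Iterator[dict]:
    """Yield every prompt, requesting pages until an empty page is returned."""
    last_prompt_id: Optional[str] = None
    while True:
        page = fetch_page(last_prompt_id)
        if not page:
            return
        yield from page
        # The last id on this page becomes the "before" cursor for the next page.
        last_prompt_id = page[-1]["id"]


def fake_fetch_page(before: Optional[str]) -> List[dict]:
    """Stand-in for the GET /backend/v1/prompts call, serving two prompts per page."""
    data = [{"id": "p3"}, {"id": "p2"}, {"id": "p1"}]
    ids = [p["id"] for p in data]
    start = 0 if before is None else ids.index(before) + 1
    return data[start:start + 2]


print([p["id"] for p in iter_prompts(fake_fetch_page)])  # -> ['p3', 'p2', 'p1']
```

In real use, fetch_page would wrap the requests.get call from the script above and return the "prompts" list from its JSON response.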

Convert Prompt to a Conversational Chat

Some models, such as GPT-4, provide conversational chat functionality. The script below shows how the CalypsoAI Moderator API can be used to build a chat CLI (command-line interface) for interacting with the model from a terminal. Multiple options can be selected.

# Simple chat CLI: each new prompt is sent with the ids of the earlier prompts as memory

import requests

token = "<YOUR_API_TOKEN>"
url = "<YOUR_MODERATOR_BASE_URL>"  # e.g. https://company.moderator.com (the URL you use to log into Moderator via your browser)

headers = {"Authorization": f"Bearer {token}"}

(endpoint,) = requests.get(f"{url}/backend/v1/endpoints", headers=headers).json()["endpoints"]

# Use the default provider for the endpoint
provider_id = endpoint["config"]["providers"][0]

# Ids of the earlier prompts in this conversation
memory = []

# An empty input line ends the chat
while prompt := input("Prompt> "):
    response = requests.post(
        f"{url}/backend/v1/prompts",
        headers=headers,
        json={
            "event": "prompt.create",
            "endpointId": endpoint["id"],
            "providerIds": [provider_id],
            "data": prompt,
            "memory": memory,
        },
    )

    response_json = response.json()

    if response.status_code != 200:
        print(response_json)
        exit(1)
    prompt_id, result = response_json["id"], response_json["result"]

    if result["outcome"] == "blocked":
        print("Prompt blocked")
        print(result)
        exit(1)

    # Remember this prompt so the next request continues the conversation
    memory.append(prompt_id)
    print(result["providerResults"][provider_id]["data"])

Document v.41
