Setting up Google Vertex AI

The short instructions (without images)

Step 1: Sign Up for Google Vertex AI

  1. Create a new project in Google Cloud.

Step 2: Enable LLaMA 3.1 API Services

  1. Navigate to the Model Garden within Google Vertex AI.
  2. Search for and enable the LLaMA 3.1 API services.

Step 3: Set Up Service Account Key

  1. Go to the IAM & Admin section in Google Cloud.
  2. Create a new service account and grant it the Vertex AI Administrator role.
  3. Select your service account.
  4. Navigate to the “Keys” tab.
  5. Create a new key and save it securely for future use.
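Most Google client libraries (and tools built on them) locate this key through the GOOGLE_APPLICATION_CREDENTIALS environment variable. A minimal sketch — the path below is an example, not a required location:

```shell
# Point Google client libraries at your downloaded service account key.
# The path is hypothetical; use wherever you saved the JSON file.
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/keys/vertex-sa.json"
echo "Using credentials at: $GOOGLE_APPLICATION_CREDENTIALS"
```

Add the export to your shell profile if you want it to persist across sessions.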

The long instructions (with images)

  1. Create a new Google Cloud Project.

  2. Click “NEW PROJECT”.

  3. Give your project a name, then click “CREATE”.

  4. Select your project.

  5. Click the “llama-3-405b” link.

  6. Click the search bar.

  7. Search for and click “Vertex AI”.

  8. Click “Model Garden”.

  9. Click “SHOW ALL”.

  10. Click “ENABLE”.

  11. Click “CLOSE”.

  12. Click “Llama 3.1 API Service”.

  13. Click “ENABLE”.

  14. Click the hamburger menu.

  15. Click “IAM & Admin”.

  16. Click “Service Accounts”.

  17. Click “CREATE SERVICE ACCOUNT”.

  18. Give it a name, such as OpenAI API.

  19. Click “CREATE AND CONTINUE”.

  20. Select a role.

  21. Search for Vertex.

  22. Click “Vertex AI Administrator”.

  23. Click “CONTINUE”.

  24. Click “DONE”.

  25. Click your service account link.

  26. Click “KEYS”.

  27. Click “ADD KEY”.

  28. Click “Create new key”.

  29. Click “CREATE”.

  30. Download your key.

  31. Click “CLOSE”.

LiteLLM setup

Installing LiteLLM

Since many applications are designed to support OpenAI-compatible APIs, you’ll need to set up LiteLLM Proxy. This proxy runs as a Docker container, translating requests between your application and the LLaMA 3.1 405B model on Google Cloud.

Install Docker and Docker Compose

  1. Ensure that Docker and Docker Compose are installed on your system. If you’re using Debian-based Linux, refer to my tutorial [here] for installation instructions.

Create a Docker Volume

Create a Docker volume to store configurations with the following command:

sudo docker volume create litellm_db

Set Up Docker Compose

  1. Create a new Docker Compose file and paste the following configuration:
    services:
      litellm:
        image: ghcr.io/berriai/litellm:main-latest
        restart: unless-stopped
        ports:
          - 4000:4000
        depends_on:
          - litellm-db
        env_file:
          - .env
        environment:
          DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@litellm-db:5432/${POSTGRES_DB}
          LITELLM_MASTER_KEY: ${LITELLM_MASTER_KEY}
          UI_USERNAME: ${UI_USERNAME}
          UI_PASSWORD: ${UI_PASSWORD}
          STORE_MODEL_IN_DB: "True"
      litellm-db:
        image: postgres:16-alpine
        healthcheck:
          test:
            - CMD-SHELL
            - pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}
          interval: 5s
          timeout: 5s
          retries: 5
        volumes:
          - litellm_db:/var/lib/postgresql/data:rw
        env_file:
          - .env
        environment:
          POSTGRES_DB: ${POSTGRES_DB}
          POSTGRES_USER: ${POSTGRES_USER}
          POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
        restart: unless-stopped
    
    volumes:
      litellm_db:
        external: true
    
  2. Create a .env file next to it (replace the example values with your own secrets):
    LITELLM_MASTER_KEY=sk-1234
    POSTGRES_DB=litellm
    POSTGRES_USER=litellm
    POSTGRES_PASSWORD=litellm
    UI_USERNAME=Fletcher
    UI_PASSWORD=Q6ryAJ!A7jC7UUaA*PadFatwxKUWXdR_
    
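The secrets above are only placeholders. One way to generate your own random values, assuming openssl is installed:

```shell
# Generate a random LiteLLM master key (sk- prefix + 32 hex chars)
MASTER_KEY="sk-$(openssl rand -hex 16)"
# Generate a random Postgres password (24 random bytes, base64-encoded)
DB_PASSWORD="$(openssl rand -base64 24)"
echo "LITELLM_MASTER_KEY=$MASTER_KEY"
echo "POSTGRES_PASSWORD=$DB_PASSWORD"
```

Paste the printed lines into your .env file in place of the examples.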

Start Docker Compose

Start the stack in the background with the following command:

sudo docker compose up -d

Verify the Docker Setup

Check that the litellm and litellm-db containers are running with:

sudo docker ps

Configuring LiteLLM

Configure LiteLLM to Interface with Google Vertex AI

  1. Navigate to http://your_docker_ip:4000/ui.
  2. Create an admin account.
  3. In the “Models” panel, add a new model.
  4. Use the service account key (the JSON file) you saved earlier to configure the model settings to match the image below. LiteLLM-config-img

Verify the Configuration

Once saved, go to the /health tab to ensure everything is working correctly.

Using Your Setup

You can now send OpenAI-compatible requests to your LiteLLM Proxy, which translates and forwards them to Google Vertex AI and returns a compatible response.
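As a sketch of what such a request looks like, here is a minimal client using only the Python standard library. The model name "llama-3.1-405b" and the master key "sk-1234" are the example values from this guide — substitute whatever you configured in the LiteLLM UI and your .env:

```python
import json
import urllib.request


def build_chat_request(prompt: str,
                       base_url: str = "http://localhost:4000",
                       api_key: str = "sk-1234",
                       model: str = "llama-3.1-405b") -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the LiteLLM proxy."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={
            # LiteLLM authenticates with the master key as a Bearer token
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def chat(prompt: str, **kwargs) -> str:
    """Send the request and return the assistant's reply."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

With the proxy running, `print(chat("Hello!"))` should return a completion from the model.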

For instructions on setting up a ChatGPT-like web UI to use your new, free, and private access to LLaMA 3.1 405B, follow my guide here.