Developer Documentation

VectorLay API

Deploy and manage GPU inference clusters programmatically. The VectorLay REST API gives you full control over cluster lifecycle, scaling, and GPU availability.

Base URL: https://api.vectorlay.com
Auth: Bearer vl_...
Content-Type: application/json

Get started

1. Get your API key

Create an account and generate an API key from your dashboard.

2. Create a cluster

Make a POST request to deploy a container onto GPU infrastructure.

3. Send requests

Once the cluster's status is healthy, its endpoint is ready to receive inference traffic.

API reference

API overview

All endpoints use JSON request and response bodies. Authenticate every request by sending your API key as a Bearer token in the Authorization header.
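
As a rough illustration of the authentication rule above, the following Python sketch builds (but does not send) a request that carries the API key in the Authorization header. The helper name build_request is hypothetical and not part of any VectorLay SDK; the base URL and header format are taken from this page, and vl_xxx is a placeholder key.

```python
import json
import urllib.request

BASE_URL = "https://api.vectorlay.com"  # base URL as listed on this page


def build_request(method, path, api_key, body=None):
    """Build (but do not send) an authenticated JSON request.

    Hypothetical helper for illustration only.
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        # Only declare a JSON body when one is actually attached.
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


req = build_request("GET", "/v1/clusters", "vl_xxx")
print(req.get_method(), req.full_url)
# prints: GET https://api.vectorlay.com/v1/clusters
```

Passing the resulting object to urllib.request.urlopen would send it; that step is left out here so the sketch runs without a real key or network access.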

POST /v1/clusters

Create a cluster

Deploy a containerized workload onto GPU infrastructure. Specify the GPU type, number of replicas, container image, and environment variables. The cluster will be provisioned and assigned a unique endpoint URL.

name: A human-readable name for the cluster. Must be unique within your organization.
gpu_type: GPU model to use. See /v1/gpus for available types.
replicas: Number of GPU instances to provision. Traffic is load-balanced across replicas.
container: Docker image to run. Supports public and private registries.
env: Environment variables to set for the container.
Request
curl -X POST https://api.vectorlay.com/v1/clusters \
  -H "Authorization: Bearer vl_xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-cluster",
    "gpu_type": "rtx-4090",
    "replicas": 1,
    "container": "vllm/vllm-openai:latest",
    "env": {
      "MODEL": "meta-llama/Llama-3.1-8B-Instruct"
    }
  }'
Response
{
  "id": "cl_abc123",
  "name": "my-cluster",
  "status": "provisioning",
  "gpu_type": "rtx-4090",
  "replicas": 1,
  "endpoint": "https://my-cluster-abc123.run.vectorlay.com",
  "created_at": "2026-01-15T10:30:00Z"
}
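
The same request can be assembled from Python. The sketch below mirrors the curl example using only the standard library; create_cluster_request is a hypothetical helper name, and omitting env when it is empty is an assumption, since this page does not say whether the field is required.

```python
import json
import urllib.request

API_URL = "https://api.vectorlay.com/v1/clusters"  # endpoint documented above


def create_cluster_request(api_key, name, gpu_type, replicas, container, env=None):
    """Build (but do not send) a POST /v1/clusters request.

    Hypothetical helper for illustration only.
    """
    payload = {
        "name": name,
        "gpu_type": gpu_type,
        "replicas": replicas,
        "container": container,
    }
    if env:
        # Assumption: env can be left out when there is nothing to set.
        payload["env"] = env
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = create_cluster_request(
    "vl_xxx",  # placeholder API key
    "my-cluster",
    "rtx-4090",
    1,
    "vllm/vllm-openai:latest",
    env={"MODEL": "meta-llama/Llama-3.1-8B-Instruct"},
)
print(req.data.decode("utf-8"))

# To actually send it (needs a real key and network access):
# with urllib.request.urlopen(req) as resp:
#     cluster = json.load(resp)
#     print(cluster["id"], cluster["status"], cluster["endpoint"])
```

Building the request separately from sending it keeps the payload easy to inspect or log before any GPU resources are provisioned.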
GET /v1/clusters

List clusters

Retrieve all clusters in your organization. Returns each cluster's current status, GPU type, replica count, and endpoint URL.

Request
curl https://api.vectorlay.com/v1/clusters \
  -H "Authorization: Bearer vl_xxx"
Response
{
  "data": [
    {
      "id": "cl_abc123",
      "name": "my-cluster",
      "status": "healthy",
      "gpu_type": "rtx-4090",
      "replicas": 1,
      "endpoint": "https://my-cluster-abc123.run.vectorlay.com"
    }
  ]
}
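
One possible use of this endpoint is to wait for a newly created cluster to become healthy before sending traffic. The Python sketch below looks a cluster up by name in a list response and polls until its status is "healthy". The helper names, the retry count, and the delay are all assumptions made for illustration; only the "provisioning" and "healthy" status values appear on this page, so any other status is simply treated as not ready.

```python
import time


def find_cluster(list_response, name):
    """Return the cluster dict with the given name, or None if absent."""
    for cluster in list_response.get("data", []):
        if cluster.get("name") == name:
            return cluster
    return None


def wait_until_healthy(fetch_list, name, attempts=30, delay_s=10.0, sleep=time.sleep):
    """Poll fetch_list() until the named cluster reports status "healthy".

    fetch_list is any callable returning a parsed GET /v1/clusters body.
    attempts and delay_s are arbitrary defaults, not documented values.
    """
    for attempt in range(attempts):
        cluster = find_cluster(fetch_list(), name)
        if cluster is not None and cluster.get("status") == "healthy":
            return cluster["endpoint"]
        if attempt < attempts - 1:
            sleep(delay_s)
    raise TimeoutError(f"cluster {name!r} not healthy after {attempts} checks")


# Demo using the sample response shown above instead of a live call.
sample = {
    "data": [
        {
            "id": "cl_abc123",
            "name": "my-cluster",
            "status": "healthy",
            "gpu_type": "rtx-4090",
            "replicas": 1,
            "endpoint": "https://my-cluster-abc123.run.vectorlay.com",
        }
    ]
}
print(wait_until_healthy(lambda: sample, "my-cluster"))
# prints: https://my-cluster-abc123.run.vectorlay.com
```

Taking fetch_list as a parameter keeps the polling logic independent of the HTTP client, so it can be exercised with canned responses like the one above.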