Virtual Machines

Deploy full virtual machines with dedicated GPU access, SSH connectivity, and flexible resource sizing.

Overview

VectorLay VMs give you a complete virtual machine with bare-metal GPU performance via VFIO passthrough. Unlike containers, VMs provide a full operating system where you have root access, can install any software, and run long-lived workloads. Each VM gets its own dedicated GPU(s), disk, and network endpoint.

Creating a VM

Create VMs from the VectorLay dashboard. Configure your VM with the following options:

  • name: A human-readable name for your VM. Must be unique within your organization.
  • gpu_type: GPU model to attach (e.g., RTX 4090, A100, H100). Leave empty for CPU-only VMs.
  • gpu_count: Number of GPUs to attach (1–8). Multi-GPU VMs support NVLink when available.
  • resource_size: Compute tier (small, medium, large, xl) that determines CPU and RAM allocation.
  • template: Base OS image. Ubuntu 24.04 with Docker and NVIDIA drivers pre-installed by default.
  • disk_size_gb: Disk size from 10 GB to 2,000 GB. Can be resized later (increase only).
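
The constraints listed above can be checked before submitting a configuration. The following is a minimal sketch, not VectorLay's own validation: the `VMOptions` class and `validate_vm_options` function are hypothetical names, and treating `gpu_count` as ignored for CPU-only VMs is an assumption.

```python
from dataclasses import dataclass

ALLOWED_SIZES = {"small", "medium", "large", "xl"}


@dataclass
class VMOptions:
    """Hypothetical container for the dashboard options described above."""
    name: str
    resource_size: str
    disk_size_gb: int
    gpu_type: str = ""   # empty means a CPU-only VM
    gpu_count: int = 0   # assumption: ignored when gpu_type is empty
    template: str = ""   # assumption: empty selects the default template


def validate_vm_options(opts: VMOptions) -> list[str]:
    """Return a list of problems; an empty list means the options look valid."""
    problems = []
    if not opts.name.strip():
        problems.append("name must not be empty")
    if opts.resource_size not in ALLOWED_SIZES:
        problems.append(f"resource_size must be one of {sorted(ALLOWED_SIZES)}")
    if not 10 <= opts.disk_size_gb <= 2000:
        problems.append("disk_size_gb must be between 10 and 2000")
    if opts.gpu_type and not 1 <= opts.gpu_count <= 8:
        problems.append("gpu_count must be between 1 and 8 for GPU VMs")
    return problems
```

Name uniqueness within the organization cannot be checked locally, so this sketch leaves it to the platform.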

OS Templates

VMs boot from pre-built QCOW2 images. The default template includes Ubuntu 24.04, Docker, and NVIDIA drivers. GPU VMs automatically use a GPU-enabled template with the appropriate driver version.

  • CPU VMs: Ubuntu 24.04 with Docker pre-installed
  • GPU VMs: Ubuntu 24.04 with Docker and NVIDIA drivers (570+)
  • Windows: Windows 10 with NVIDIA drivers and RDP access

Resource Sizes

The resource size controls how much CPU and RAM your VM gets. For CPU-only VMs, each size maps to a fixed allocation. For GPU VMs, CPU and RAM scale with the total VRAM of the attached GPUs multiplied by the size's GPU multiplier.

Size     CPU-only specs (CPU / RAM)   GPU multiplier   Best for
small    1 CPU / 1 GB                 0.25x            Development, testing
medium   1 CPU / 2 GB                 0.5x             Light inference
large    2 CPU / 4 GB                 1.0x             Standard production
xl       4 CPU / 8 GB                 2.0x             Heavy workloads

Example: a VM with 1x H100 (80 GB VRAM) gets 20 CPUs and 80 GB RAM at large (1.0x), and 40 CPUs and 160 GB RAM at xl (2.0x).
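
The sizing rule can be sketched as a small calculation. This is an illustration under stated assumptions, not the platform's actual formula: the ratio of 1 CPU per 4 GB of scaled VRAM is inferred from the H100 example (80 GB at 1.0x gives 20 CPUs), and the rounding and minimum of 1 are guesses. The function name `vm_resources` is hypothetical.

```python
# GPU multipliers and CPU-only allocations, copied from the table above.
SIZE_MULTIPLIER = {"small": 0.25, "medium": 0.5, "large": 1.0, "xl": 2.0}
CPU_ONLY_SPECS = {"small": (1, 1), "medium": (1, 2), "large": (2, 4), "xl": (4, 8)}

# Assumption inferred from the H100 example: 80 GB at 1.0x gives 20 CPUs.
VRAM_GB_PER_CPU = 4


def vm_resources(size, vram_gb_per_gpu=0, gpu_count=0):
    """Return (cpus, ram_gb) for a VM of the given size."""
    if gpu_count == 0:
        return CPU_ONLY_SPECS[size]
    scaled_gb = vram_gb_per_gpu * gpu_count * SIZE_MULTIPLIER[size]
    cpus = max(1, round(scaled_gb / VRAM_GB_PER_CPU))  # rounding is an assumption
    ram_gb = max(1, round(scaled_gb))
    return cpus, ram_gb


print(vm_resources("large", vram_gb_per_gpu=80, gpu_count=1))  # (20, 80)
print(vm_resources("xl", vram_gb_per_gpu=80, gpu_count=1))     # (40, 160)
print(vm_resources("small"))                                   # (1, 1)
```

The first two calls reproduce the H100 figures from the example. Results for other GPU models depend on the inferred CPU ratio and should be confirmed in the dashboard.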

SSH Access

Connect to your VMs via the VectorLay SSH proxy. Traffic is routed securely over our WireGuard network — no public IP required on the VM itself.

terminal
# Connect to your VM via the VectorLay SSH proxy
ssh <vm-id>@ssh.vectorlay.com

Setting up SSH keys

  1. Add your SSH public key to your organization in the dashboard under Settings > SSH Keys
  2. Attach the key to your VM from the VM detail page
  3. Connect using ssh <vm-id>@ssh.vectorlay.com

Supported key types: ssh-ed25519, ssh-rsa, ecdsa-sha2-*, and FIDO2 security keys (sk-ssh-*).
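
Before uploading a key, you can check that its type field matches one of the patterns above. This is a minimal sketch that applies those patterns literally to the first field of a public key line. It does not verify that the key material itself is valid, and the function name `is_supported_public_key` is hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the list of supported key types above.
SUPPORTED_KEY_PATTERNS = ("ssh-ed25519", "ssh-rsa", "ecdsa-sha2-*", "sk-ssh-*")


def is_supported_public_key(line: str) -> bool:
    """Check the key-type field of a one-line OpenSSH public key."""
    fields = line.split()
    if len(fields) < 2:  # need at least "<type> <base64-data>"
        return False
    key_type = fields[0]
    return any(fnmatchcase(key_type, pattern) for pattern in SUPPORTED_KEY_PATTERNS)
```

For example, a line beginning with `ssh-ed25519` passes, while one beginning with `ssh-dss` does not.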

Windows VMs & RDP

Windows VMs include Remote Desktop Protocol access. Connect using the in-browser RDP client available on the VM detail page in the dashboard, or use any standard RDP client with the connection details provided.

Running Containers Inside VMs

Optionally deploy a Docker container that starts automatically when the VM boots. This gives you the convenience of container-based deployment with the flexibility of a full VM underneath.

  • image_url — Docker image to pull and run (e.g., vllm/vllm-openai:latest)
  • container_port — Port the container listens on (default: 80)
  • env — Environment variables passed to the container
  • container_command — Override the container entrypoint
  • privileged — Run the container in privileged mode

Private registry images are supported — add your credentials in the dashboard and select them when creating the VM.
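
The container options can be pictured as a small configuration record. The sketch below only illustrates the fields and the documented default port of 80. The function name `container_config`, the dictionary shape, the `env` mapping type, and the `privileged=False` default are assumptions, not the platform's API.

```python
def container_config(image_url, container_port=80, env=None,
                     container_command=None, privileged=False):
    """Build a hypothetical container configuration from the options above."""
    if not image_url:
        raise ValueError("image_url is required")
    if not 1 <= container_port <= 65535:
        raise ValueError("container_port must be a valid TCP port")
    config = {
        "image_url": image_url,
        "container_port": container_port,  # documented default: 80
        "env": dict(env or {}),            # assumption: name -> value mapping
        "privileged": privileged,          # assumption: off unless requested
    }
    if container_command is not None:
        config["container_command"] = container_command
    return config


config = container_config(
    "vllm/vllm-openai:latest",
    container_port=8000,
    env={"HF_TOKEN": "<your-token>"},
)
```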

Custom Boot Commands

Run custom shell commands at VM boot using cloud-init. Commands execute as root after the base system is configured.

cloud-init
# Example: install custom packages and download a dataset at boot
cloud_init_runcmd:
  - "apt-get update && apt-get install -y ffmpeg"
  - "pip install transformers accelerate"
  - "mkdir -p /data && wget -O /data/model.bin https://example.com/model.bin"

You can specify up to 50 commands, each up to 1,000 characters. Commands run sequentially on every boot, so write them to be safe to re-run.
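
The limits above are easy to check before saving a configuration. A minimal sketch, assuming command length is counted in characters as Python counts them; the function name `validate_boot_commands` is hypothetical.

```python
MAX_COMMANDS = 50           # documented limit on the number of commands
MAX_COMMAND_LENGTH = 1000   # documented limit on characters per command


def validate_boot_commands(commands):
    """Return a list of limit violations; empty means the list is within limits."""
    errors = []
    if len(commands) > MAX_COMMANDS:
        errors.append(f"too many commands: {len(commands)} > {MAX_COMMANDS}")
    for index, command in enumerate(commands):
        if len(command) > MAX_COMMAND_LENGTH:
            errors.append(
                f"command {index} is {len(command)} characters, "
                f"limit is {MAX_COMMAND_LENGTH}"
            )
    return errors
```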

Health Checks

Enable health checks to monitor your VM and route traffic only to healthy instances. Three check types are available:

  • http: Polls an HTTP endpoint and expects a 200 response. Best for web services.
  • tcp: Checks if a port is accepting connections. Good for SSH or database workloads.
  • process: Runs a shell command and checks the exit code. Flexible for custom checks.

# HTTP health check (default)
health_check_enabled: true
health_check_type: "http"
health_check_path: "/health"
health_check_port: 8000
health_check_start_period: 900  # seconds

# TCP health check
health_check_enabled: true
health_check_type: "tcp"
health_check_port: 22

The default start period is 900 seconds (15 minutes) to allow time for GPU VMs with many devices to complete BIOS enumeration.
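
To test a service locally before enabling a health check, the three check types can be approximated with the Python standard library. The platform's probe implementation is not documented here, so treat this as a rough sketch: the function names and timeouts are assumptions, and `urllib` follows redirects, which the real HTTP check may not do.

```python
import http.client
import socket
import subprocess
import urllib.error
import urllib.request


def check_http(host, port, path="/health", timeout=5.0):
    """Healthy only if the endpoint answers with HTTP 200."""
    url = f"http://{host}:{port}{path}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False  # connection failure, timeout, or non-2xx status


def check_tcp(host, port, timeout=5.0):
    """Healthy if the port accepts a TCP connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_process(command, timeout=10.0):
    """Healthy if the shell command exits with status 0."""
    try:
        result = subprocess.run(command, shell=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

For instance, `check_tcp("127.0.0.1", 22)` mirrors the TCP configuration above when run inside the VM.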

NVLink & Multi-GPU

For workloads that benefit from high-bandwidth GPU interconnect, enable NVLink when creating a multi-GPU VM. NVLink is available on nodes with NVSwitch hardware (typically full-node H100 allocations) and provides significantly higher GPU-to-GPU bandwidth compared to PCIe.

VM Lifecycle

Manage your VM through these lifecycle operations:

  • Stop: Shuts down the VM and releases its node assignment. No charges while stopped. Disk is preserved.
  • Restart: Boots a stopped VM back up. May be placed on a different node.
  • Reboot: Gracefully reboots a running VM in place without changing its node assignment.
  • Resize Disk: Increase disk size on a stopped VM (10–2,000 GB). Cannot be decreased.
  • Terminate: Permanently destroys the VM and its disk. This cannot be undone.
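
The lifecycle rules above can be read as a small state machine. The sketch below models them for illustration only: the state names ("running", "stopped", "terminated") and the `VM` class are hypothetical and do not correspond to a documented API.

```python
from dataclasses import dataclass

MAX_DISK_GB = 2000


@dataclass
class VM:
    """Toy model of the lifecycle operations described above."""
    state: str = "running"   # "running", "stopped", or "terminated"
    disk_size_gb: int = 100

    def _require(self, expected):
        if self.state != expected:
            raise RuntimeError(f"operation requires a {expected} VM, got {self.state}")

    def stop(self):
        self._require("running")
        self.state = "stopped"      # disk is preserved

    def restart(self):
        self._require("stopped")
        self.state = "running"      # may land on a different node

    def reboot(self):
        self._require("running")    # stays running on the same node

    def resize_disk(self, new_size_gb):
        self._require("stopped")
        if not self.disk_size_gb < new_size_gb <= MAX_DISK_GB:
            raise ValueError("disk can only grow, up to 2,000 GB")
        self.disk_size_gb = new_size_gb

    def terminate(self):
        if self.state == "terminated":
            raise RuntimeError("VM is already terminated")
        self.state = "terminated"   # irreversible: the disk is destroyed
```

A typical disk resize is then `stop()`, `resize_disk(...)`, `restart()`, in that order.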

Endpoints & Networking

Each VM with a container gets a unique HTTPS endpoint at https://<name>-<id>.run.vectorlay.com. Traffic is routed through the VectorLay edge with TLS termination and forwarded to your container port over our WireGuard network.
