Volra
v1.2.0 — Smart Sidecar

Deploy AI Agents
On Your Infrastructure

Open-source platform to deploy, monitor, and connect AI agents on your infrastructure. Agent mesh, cross-server routing, Docker & Kubernetes. Production-ready in 60 seconds.

$ curl -fsSL https://raw.githubusercontent.com/romerox3/volra/main/install.sh | sh
Homebrew: brew install romerox3/volra/volra
Source:   git clone && make build
$ volra deploy

[1/5] Validating Agentfile...        OK
[2/5] Building Docker image...       OK
[3/5] Generating Compose stack...    OK
[4/5] Starting containers...         OK
[5/5] Waiting for health check...    OK

Agent deployed successfully!
───────────────────────────
Agent:        http://localhost:8000
Prometheus:   http://localhost:9090
Grafana:      http://localhost:3001
Alertmanager: http://localhost:9093

Everything You Need to Ship Agents

Production-grade tooling for the entire agent lifecycle. From init to monitoring.

Self-Hosted & Private

Your agents run on your infrastructure. No data leaves your servers. Full control over security, compliance, and costs.

Framework Agnostic

Works with LangChain, CrewAI, AutoGen, OpenAI Agents, and any Python/Node.js agent. No vendor lock-in.

One Command Deploy

From zero to production in under 60 seconds. Auto-generates Docker Compose, Prometheus, and Grafana from a simple Agentfile.

Control Plane

Centralized REST API with SQLite persistence. Manage all agents, view metrics, trigger deploys — all from a single control plane.

Web Console

Built-in dark-themed dashboard with real-time agent grid, search, status badges, and deploy/stop actions. No build step required.

Docker & Kubernetes

Deploy to Docker Compose or Kubernetes with a single flag. Auto-generated manifests with health probes and ServiceMonitors.
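
With the Kubernetes target, the Agentfile's port and health_path are wired into the generated probes. A minimal sketch of what such a Deployment fragment might contain (field values illustrative, not verbatim Volra output):

```yaml
# Illustrative sketch of a generated Deployment — not verbatim Volra output.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-agent
spec:
  replicas: 1
  selector:
    matchLabels: {app: my-agent}
  template:
    metadata:
      labels: {app: my-agent}
    spec:
      containers:
        - name: my-agent
          image: my-agent:latest
          ports:
            - containerPort: 8000          # Agentfile port
          livenessProbe:
            httpGet: {path: /health, port: 8000}   # Agentfile health_path
          readinessProbe:
            httpGet: {path: /health, port: 8000}
```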

RBAC & API Keys

Role-based access control with admin, operator, and viewer roles. Bcrypt-hashed API keys with Bearer token authentication.

Built-in Observability

Monitor latency, token usage, error rates, and custom metrics. Pre-configured Grafana dashboards and Alertmanager integration.
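
The Agentfile drives this: an observability block sets the metrics level and port, and an alerts block wires Alertmanager notifications. A minimal sketch, mirroring the fields shown elsewhere on this page:

```yaml
observability:
  level: 2            # instrumentation depth
  metrics_port: 9090  # scraped by the generated Prometheus
alerts:
  slack:
    webhook_url_env: SLACK_WEBHOOK  # env var holding the webhook URL
```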

Federation

Connect multiple Volra instances for cross-server agent visibility. Aggregated views with cached parallel queries.

Agent Mesh

A2A v0.3 agent cards, federated capability discovery, and cross-server tool routing. Your agents find and call each other across servers.
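
An A2A agent card is a small JSON document advertising an agent's endpoint and skills so peers can discover and call it. A hedged sketch of roughly what a v0.3-style card contains (field values illustrative):

```json
{
  "name": "research-agent",
  "description": "CrewAI research crew",
  "url": "https://agents.example.com/research-agent",
  "version": "1.0.0",
  "capabilities": { "streaming": false },
  "skills": [
    {
      "id": "web-research",
      "name": "Web research",
      "description": "Searches and summarizes web sources",
      "tags": ["research"]
    }
  ]
}
```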

Smart Sidecar

Go reverse proxy replaces nginx for A2A task execution. Three modes: zero-config, declarative skill mapping, or full passthrough. Agents receive calls without code changes.
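
In declarative mode, the Agentfile's a2a section maps advertised skills onto the agent's existing HTTP routes, so the sidecar can translate incoming A2A tasks without touching agent code. A hypothetical sketch — the exact keys here are assumptions, not verbatim Volra syntax:

```yaml
# Hypothetical sketch — key names are assumptions, not verbatim Volra syntax.
a2a:
  mode: declarative        # default | declarative | passthrough
  skills:
    - id: summarize        # advertised in the agent card
      path: /summarize     # agent HTTP route the proxy maps the task onto
      method: POST
```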

Three Steps to Production

No YAML engineering. No infra knowledge required. Just your agent code.

01

Describe Your Agent

Create an Agentfile or let Volra auto-detect your framework.

version: "1"
name: my-agent
framework: langchain
port: 8000
health_path: /health
alerts:
  slack:
    webhook_url_env: SLACK_WEBHOOK
02

Deploy Anywhere

Docker Compose or Kubernetes — one command for the full stack with monitoring.

$ volra deploy               # Docker
$ volra deploy --target k8s  # Kubernetes

[1/5] Validating Agentfile...    OK
[2/5] Building image...          OK
[3/5] Generating stack...        OK
[4/5] Starting services...       OK
[5/5] Health check...            OK
03

Manage at Scale

Control plane with web console, RBAC, federation, and Grafana dashboards.

$ volra server --port 4441

Console: http://localhost:4441
API:     /api/agents
Auth:    RBAC (admin/operator/viewer)
Peers:   3 federated servers

24 Templates to Get Started

Production-ready starters for every use case. Pick one and deploy in seconds.

basic FastAPI

Minimal FastAPI agent with health + ask endpoints

custom-agent Custom

Blank canvas with TODO stubs for your own agent logic

fastapi-bot FastAPI

SSE streaming chatbot with session memory

rag FastAPI+ChromaDB

RAG agent with ChromaDB + Redis cache

conversational FastAPI+OpenAI

Conversational agent with LLM, Redis + PostgreSQL

api-agent OpenAI SDK

Function-calling agent without any framework

mcp-server MCP Protocol

MCP-compatible tool server

openai-assistant OpenAI SDK

OpenAI Assistants API with threads and code interpreter

pgvector-rag pgvector

Hybrid search (vector + keyword) with pgvector

langgraph LangGraph

LangGraph ReAct agent with tool-calling loop

langchain-chatbot LangChain

LangChain chatbot with ConversationBufferWindowMemory

langchain-agent LangChain

LangChain AgentExecutor with ReAct tools

langchain-rag LangChain

LangChain RAG with ChromaDB and OpenAI embeddings

crewai CrewAI

CrewAI multi-agent research crew

crewai-team CrewAI

3-agent dev team (PM, Dev, QA) with CrewAI

crewai-researcher CrewAI

Single research agent with web scraping tools

openai-agents OpenAI Agents

OpenAI Agents SDK with tools and handoffs

openai-swarm OpenAI SDK

Multi-agent handoffs via function calling

smolagents HuggingFace

HuggingFace code agent with tool use

autogen-duo AutoGen

Two-agent coder + reviewer with AutoGen

autogen-group AutoGen

3+ agent group chat with approval flow

discord-bot Discord.py

AI-powered Discord bot with slash commands

slack-bot Slack Bolt

AI-powered Slack bot with event handling

web-chat WebSocket

Full-stack chat UI with WebSocket

What Volra Generates

One Agentfile in, a full production stack out. Deploy to Docker or Kubernetes.

Your Agentfile

# Agentfile
version: "1"
name: research-agent
framework: crewai
port: 8000
health_path: /health
env:
  OPENAI_API_KEY: $OPENAI_API_KEY
observability:
  level: 2
  metrics_port: 9090
alerts:
  slack:
    webhook_url_env: SLACK_WEBHOOK

Generated Stack

Your Agent

Docker container or K8s Deployment from your code

Prometheus

Metrics collection, alerting rules & ServiceMonitor

Grafana

Pre-built dashboards & visualizations

Alertmanager

Slack, email & webhook notifications

Control Plane

REST API + Web Console on port 4441

K8s Manifests

Deployment, Service, ConfigMap, PVC (with --target k8s)

Built in the Open

From CLI tool to production platform — every version ships real value.

v1.2

Smart Sidecar

Latest
  • Go reverse proxy replaces nginx (volra-proxy)
  • A2A Tasks/send → agent HTTP translation
  • Three modes: default, declarative, passthrough
  • Agentfile a2a section with skill mapping
  • Card enrichment with declarative skills
v1.1

Agent Mesh

  • A2A v0.3 agent cards with skills & auth
  • Federated capability discovery
  • Cross-server gateway routing (three-tier namespacing)
  • A2A task lifecycle (Tasks/send, get, cancel)
  • volra agents — unified mesh view
v1.0

Production Platform

  • Control Plane with REST API & SQLite
  • Web Console (htmx + Alpine.js)
  • Kubernetes manifests & kubectl apply
  • RBAC with API key authentication
  • Federation for multi-server visibility
v0.7

Governance Lite

  • Alertmanager (Slack, email, webhook)
  • Agent Marketplace with GitHub registry
  • Append-only audit trail
  • EU AI Act compliance docs
v0.6

Composition & MCP Gateway

  • MCP Gateway with tool routing
  • OpenTelemetry auto-instrumentation
  • Langfuse integration
  • A2A agent cards
v0.5

Observability & Evaluation

  • volra eval — local, framework-agnostic
  • Agent Hub for multi-agent dashboards
  • CrewAI framework detection
v0.4

Developer Loop

  • volra dev — hot-reload with Docker Compose watch
  • Homebrew tap & auto-update
  • --dry-run diff preview
v0.3

TUI Quickstart

  • Interactive template selector (Bubbletea)
  • 24 production-ready templates

Why Self-Hosted?

Keep your data, your costs, and your decisions under your control.

                    Volra (Self-Hosted)                   SaaS Platforms
Data Privacy        100% on your infra                    Shared cloud
Vendor Lock-in      None — open source                    Platform-specific
Cost Model          Your servers only                     Per-request pricing
Framework Support   Any Python/Node.js                    Limited selection
Observability       Prometheus + Grafana + Alertmanager   Proprietary dashboards
Deploy Targets      Docker & Kubernetes                   Managed containers
Access Control      RBAC with API keys                    Platform accounts
Multi-Server        Federation built-in                   Enterprise tier only
Customization       Full source access                    Config options only

Ready to Own Your Agent Infra?

Join the community building the future of self-hosted AI agent deployment.