Deploy AI Agents
On Your Infrastructure
Open-source platform to deploy, monitor, and connect AI agents on your infrastructure. Agent mesh, cross-server routing, Docker & Kubernetes. Production-ready in 60 seconds.
$ curl -fsSL https://raw.githubusercontent.com/romerox3/volra/main/install.sh | sh
$ brew install romerox3/volra/volra

Or build from source: git clone, then make build.

$ volra deploy
[1/5] Validating Agentfile... OK
[2/5] Building Docker image... OK
[3/5] Generating Compose stack... OK
[4/5] Starting containers... OK
[5/5] Waiting for health check... OK

Agent deployed successfully!
───────────────────────────
Agent:        http://localhost:8000
Prometheus:   http://localhost:9090
Grafana:      http://localhost:3001
Alertmanager: http://localhost:9093
Everything You Need to Ship Agents
Production-grade tooling for the entire agent lifecycle. From init to monitoring.
Self-Hosted & Private
Your agents run on your infrastructure. No data leaves your servers. Full control over security, compliance, and costs.
Framework Agnostic
Works with LangChain, CrewAI, AutoGen, OpenAI Agents, and any Python/Node.js agent. No vendor lock-in.
One Command Deploy
From zero to production in under 60 seconds. Auto-generates Docker Compose, Prometheus, and Grafana from a simple Agentfile.
Control Plane
Centralized REST API with SQLite persistence. Manage all agents, view metrics, trigger deploys — all from one endpoint.
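The console output later on this page shows the control plane serving `/api/agents` on port 4441 with Bearer-token auth; anything beyond that (response schema, the `vk_` key format) is an assumption here. A minimal sketch of building an authenticated request from Python:

```python
import urllib.request

BASE_URL = "http://localhost:4441"  # the control plane's default port on this page

def list_agents_request(api_key: str) -> urllib.request.Request:
    """Build an authenticated GET for the control plane's /api/agents endpoint.

    Bearer-token auth matches the RBAC/API-key description; the response
    schema is not documented here, so we only build the request.
    """
    return urllib.request.Request(
        f"{BASE_URL}/api/agents",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )

req = list_agents_request("vk_example_key")
# Against a running server: urllib.request.urlopen(req) returns the agent list.
```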
Web Console
Built-in dark-themed dashboard with real-time agent grid, search, status badges, and deploy/stop actions. No build step required.
Docker & Kubernetes
Deploy to Docker Compose or Kubernetes with a single flag. Auto-generated manifests with health probes and ServiceMonitors.
RBAC & API Keys
Role-based access control with admin, operator, and viewer roles. Bcrypt-hashed API keys with Bearer token authentication.
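Volra stores API keys bcrypt-hashed; the sketch below substitutes SHA-256 so it runs on the standard library alone, but the Bearer-header parsing and constant-time check follow the same pattern:

```python
import hashlib
import hmac

# Volra uses bcrypt for stored keys; SHA-256 stands in here so the sketch
# needs no third-party dependency. The authorization flow is the same.
def hash_key(api_key: str) -> str:
    return hashlib.sha256(api_key.encode()).hexdigest()

def authorize(header: str, stored_hash: str) -> bool:
    """Validate an `Authorization: Bearer <key>` header against a stored hash."""
    scheme, _, token = header.partition(" ")
    if scheme != "Bearer" or not token:
        return False
    # Constant-time comparison avoids leaking the hash through timing.
    return hmac.compare_digest(hash_key(token), stored_hash)

stored = hash_key("vk_operator_key")
ok = authorize("Bearer vk_operator_key", stored)
```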
Built-in Observability
Monitor latency, token usage, error rates, and custom metrics. Pre-configured Grafana dashboards and Alertmanager integration.
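How custom metrics are registered with Volra is not shown on this page, so here is a framework-agnostic sketch of the counters such a setup typically exposes for Prometheus to scrape; the metric names and `observed` decorator are illustrative, not Volra's API:

```python
import time
from collections import defaultdict
from functools import wraps

# Toy in-process metrics store standing in for a Prometheus client;
# the real stack scrapes values like these from the agent's metrics port.
METRICS = defaultdict(float)

def observed(name):
    """Record call count, error count, and cumulative latency for a handler."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            except Exception:
                METRICS[f"{name}_errors_total"] += 1
                raise
            finally:
                # Counted for successes and errors alike.
                METRICS[f"{name}_requests_total"] += 1
                METRICS[f"{name}_latency_seconds_sum"] += time.perf_counter() - start
        return wrapper
    return decorator

@observed("ask")
def ask(question: str) -> str:
    return f"answer to {question!r}"

ask("what is volra?")
```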
Federation
Connect multiple Volra instances for cross-server agent visibility. Aggregated views with cached parallel queries.
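The "cached parallel queries" idea can be sketched in a few lines: fan out to every peer concurrently, merge, and cache the merged view. `fetch_agents` is a stand-in for the HTTP call a real server would make, not Volra's actual client:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_agents(peer: str) -> list[str]:
    """Stand-in for the HTTP call a federated server makes to a peer."""
    return {"alpha": ["researcher"], "beta": ["coder", "reviewer"]}.get(peer, [])

_cache: dict[tuple[str, ...], dict[str, list[str]]] = {}

def aggregate(peers: list[str]) -> dict[str, list[str]]:
    """Query every peer in parallel, then cache the merged view."""
    key = tuple(sorted(peers))
    if key not in _cache:
        with ThreadPoolExecutor(max_workers=max(1, len(peers))) as pool:
            results = pool.map(fetch_agents, peers)
        _cache[key] = dict(zip(peers, results))
    return _cache[key]

view = aggregate(["alpha", "beta"])
```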
Agent Mesh
A2A v0.3 agent cards, federated capability discovery, and cross-server tool routing. Your agents find and call each other across servers.
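An A2A agent card is the JSON document peers use to discover what an agent can do. The sketch below follows the A2A spec's card shape in broad strokes (name, URL, skills); treat the exact fields as illustrative rather than a complete v0.3 card:

```python
import json

# Minimal A2A-style agent card; in the A2A protocol, cards are
# conventionally served at /.well-known/agent.json for discovery.
card = {
    "name": "research-agent",
    "description": "Crew that researches a topic and drafts a summary",
    "url": "http://localhost:8000",
    "version": "0.3",
    "skills": [
        {
            "id": "research",
            "name": "Research a topic",
            "description": "Searches the web and summarizes findings",
        }
    ],
}

card_json = json.dumps(card, indent=2)
```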
Smart Sidecar
Go reverse proxy replaces nginx for A2A task execution. Three modes: zero-config, declarative skill mapping, or full passthrough. Agents receive calls without code changes.
Three Steps to Production
No YAML engineering. No infra knowledge required. Just your agent code.
Describe Your Agent
Create an Agentfile or let Volra auto-detect your framework.
version: "1"
name: my-agent
framework: langchain
port: 8000
health_path: /health
alerts:
  slack:
    webhook_url_env: SLACK_WEBHOOK

Deploy Anywhere
Docker Compose or Kubernetes — one command for the full stack with monitoring.
$ volra deploy               # Docker
$ volra deploy --target k8s  # Kubernetes

[1/5] Validating Agentfile... OK
[2/5] Building image... OK
[3/5] Generating stack... OK
[4/5] Starting services... OK
[5/5] Health check... OK
Manage at Scale
Control plane with web console, RBAC, federation, and Grafana dashboards.
$ volra server --port 4441

Console: http://localhost:4441
API:     /api/agents
Auth:    RBAC (admin/operator/viewer)
Peers:   3 federated servers
24 Templates to Get Started
Production-ready starters for every use case. Pick one and deploy in seconds.
| Template | Stack | Description |
|---|---|---|
| basic | FastAPI | Minimal FastAPI agent with health + ask endpoints |
| custom-agent | Custom | Blank canvas with TODO stubs for your own agent logic |
| fastapi-bot | FastAPI | SSE streaming chatbot with session memory |
| rag | FastAPI + ChromaDB | RAG agent with ChromaDB + Redis cache |
| conversational | FastAPI + OpenAI | Conversational agent with LLM, Redis + PostgreSQL |
| api-agent | OpenAI SDK | Function-calling agent without any framework |
| mcp-server | MCP Protocol | MCP-compatible tool server |
| openai-assistant | OpenAI SDK | OpenAI Assistants API with threads and code interpreter |
| pgvector-rag | pgvector | Hybrid search (vector + keyword) with pgvector |
| langgraph | LangGraph | LangGraph ReAct agent with tool-calling loop |
| langchain-chatbot | LangChain | LangChain chatbot with ConversationBufferWindowMemory |
| langchain-agent | LangChain | LangChain AgentExecutor with ReAct tools |
| langchain-rag | LangChain | LangChain RAG with ChromaDB and OpenAI embeddings |
| crewai | CrewAI | CrewAI multi-agent research crew |
| crewai-team | CrewAI | 3-agent dev team (PM, Dev, QA) with CrewAI |
| crewai-researcher | CrewAI | Single research agent with web scraping tools |
| openai-agents | OpenAI Agents | OpenAI Agents SDK with tools and handoffs |
| openai-swarm | OpenAI SDK | Multi-agent handoffs via function calling |
| smolagents | HuggingFace | HuggingFace code agent with tool use |
| autogen-duo | AutoGen | Two-agent coder + reviewer with AutoGen |
| autogen-group | AutoGen | 3+ agent group chat with approval flow |
| discord-bot | Discord.py | AI-powered Discord bot with slash commands |
| slack-bot | Slack Bolt | AI-powered Slack bot with event handling |
| web-chat | WebSocket | Full-stack chat UI with WebSocket |
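The `basic` template ships a FastAPI app with health and ask endpoints. As a dependency-free illustration of the same two routes, here is a standard-library sketch (the echo "agent" and endpoint payloads are placeholders, not the template's actual code):

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class AgentHandler(BaseHTTPRequestHandler):
    """Stdlib sketch of the basic template's /health and /ask endpoints."""

    def do_GET(self):
        if self.path == "/health":
            self._reply(200, {"status": "ok"})
        else:
            self._reply(404, {"error": "not found"})

    def do_POST(self):
        if self.path == "/ask":
            length = int(self.headers.get("Content-Length", 0))
            body = json.loads(self.rfile.read(length) or b"{}")
            # A real agent would call an LLM here; we just echo.
            self._reply(200, {"answer": f"echo: {body.get('question', '')}"})
        else:
            self._reply(404, {"error": "not found"})

    def _reply(self, code, payload):
        data = json.dumps(payload).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):
        pass  # keep the demo quiet

server = HTTPServer(("127.0.0.1", 0), AgentHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

with urllib.request.urlopen(f"http://127.0.0.1:{port}/health") as resp:
    health = json.load(resp)
server.shutdown()
```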
What Volra Generates
One Agentfile in, a full production stack out. Deploy to Docker or Kubernetes.
Your Agentfile
# Agentfile
version: "1"
name: research-agent
framework: crewai
port: 8000
health_path: /health
env:
  OPENAI_API_KEY: $OPENAI_API_KEY
observability:
  level: 2
  metrics_port: 9090
alerts:
  slack:
    webhook_url_env: SLACK_WEBHOOK
Generated Stack
Your Agent
Docker container or K8s Deployment from your code
Prometheus
Metrics collection, alerting rules & ServiceMonitor
Grafana
Pre-built dashboards & visualizations
Alertmanager
Slack, email & webhook notifications
Control Plane
REST API + Web Console on port 4441
K8s Manifests
Deployment, Service, ConfigMap, PVC (with --target k8s)
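Volra generates these manifests for you, so this is only a rough illustration of the shape: the fields are standard Kubernetes, but the mapping from Agentfile values (`port`, `health_path`) to probes is an assumption about what the generator emits:

```python
def deployment_manifest(name: str, image: str, port: int, health_path: str) -> dict:
    """Sketch of a Deployment like the one `volra deploy --target k8s` produces."""
    container = {
        "name": name,
        "image": image,
        "ports": [{"containerPort": port}],
        # Probes wired from the Agentfile's health_path and port.
        "livenessProbe": {"httpGet": {"path": health_path, "port": port}},
        "readinessProbe": {"httpGet": {"path": health_path, "port": port}},
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [container]},
            },
        },
    }

manifest = deployment_manifest("my-agent", "my-agent:latest", 8000, "/health")
```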
Built in the Open
From CLI tool to production platform — every version ships real value.
Smart Sidecar
Latest
- Go reverse proxy replaces nginx (volra-proxy)
- A2A Tasks/send → agent HTTP translation
- Three modes: default, declarative, passthrough
- Agentfile a2a section with skill mapping
- Card enrichment with declarative skills
Agent Mesh
- A2A v0.3 agent cards with skills & auth
- Federated capability discovery
- Cross-server gateway routing (three-tier namespacing)
- A2A task lifecycle (Tasks/send, get, cancel)
- volra agents — unified mesh view
Production Platform
- Control Plane with REST API & SQLite
- Web Console (htmx + Alpine.js)
- Kubernetes manifests & kubectl apply
- RBAC with API key authentication
- Federation for multi-server visibility
Governance Lite
- Alertmanager (Slack, email, webhook)
- Agent Marketplace with GitHub registry
- Append-only audit trail
- EU AI Act compliance docs
Composition & MCP Gateway
- MCP Gateway with tool routing
- OpenTelemetry auto-instrumentation
- Langfuse integration
- A2A agent cards
Observability & Evaluation
- volra eval — local, framework-agnostic
- Agent Hub for multi-agent dashboards
- CrewAI framework detection
Developer Loop
- volra dev — hot-reload with Docker Compose watch
- Homebrew tap & auto-update
- --dry-run diff preview
TUI Quickstart
- Interactive template selector (Bubbletea)
- 24 production-ready templates
Why Self-Hosted?
Keep your data, your costs, and your decisions under your control.
| | Volra (Self-Hosted) | SaaS Platforms |
|---|---|---|
| Data Privacy | 100% on your infra | Shared cloud |
| Vendor Lock-in | None — open source | Platform-specific |
| Cost Model | Your servers only | Per-request pricing |
| Framework Support | Any Python/Node.js | Limited selection |
| Observability | Prometheus + Grafana + Alertmanager | Proprietary dashboards |
| Deploy Targets | Docker & Kubernetes | Managed containers |
| Access Control | RBAC with API keys | Platform accounts |
| Multi-Server | Federation built-in | Enterprise tier only |
| Customization | Full source access | Config options only |
Ready to Own Your Agent Infra?
Join the community building the future of self-hosted AI agent deployment.