Agent Deploy

Deploy agents to Cloud Run and configure them with auto-scaling.

Overview

The hyperfold agent deploy command packages and deploys agents to Google Cloud Run. Agents are automatically configured with Firestore connections, ACP endpoints, and scaling policies.

Deployments are serverless: you pay only for actual usage. Agents scale to zero when idle (provided the minimum instance count is 0) and scale out automatically as demand grows.

Quick Deploy

Deploy an agent with a single command:

bash

# Deploy a negotiator agent with defaults
$ hyperfold agent deploy --name="sales-bot-01" --type=negotiator
 
> [Build] Packaging agent...
> [Push] Uploading to Container Registry...
> [Deploy] Creating Cloud Run service...
> [Config] Setting up Firestore collections...
> [Network] Configuring VPC connector...
> [ACP] Registering ACP endpoints...
 
✓ Agent deployed successfully!
 
  Name:         sales-bot-01
  URL:          https://sales-bot-01-xyz.run.app
  ACP Manifest: https://sales-bot-01-xyz.run.app/.well-known/acp/manifest.json
  Status:       active
 
# Deploy from a directory with agent code
$ hyperfold agent deploy ./agents/my-negotiator/
 
> [Scan] Found agent.yaml configuration
> [Build] Building from Dockerfile...
> [Deploy] Deploying to us-central1...
 
✓ Agent deployed: my-negotiator
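
After a deploy, it can be useful to confirm that the ACP manifest is reachable. The following is a minimal sketch, assuming the manifest lives at the `/.well-known/acp/manifest.json` path shown in the example output. The function names are hypothetical helpers, not part of the hyperfold CLI.

```shell
# Build the ACP manifest URL from a deployed agent's service URL.
# The /.well-known/acp/manifest.json path is taken from the deploy output above.
acp_manifest_url() {
  # Strip one trailing slash so the result never contains a double slash.
  printf '%s/.well-known/acp/manifest.json\n' "${1%/}"
}

# Return 0 if the manifest endpoint answers with a successful HTTP status.
# curl -f turns HTTP errors (4xx/5xx) into a non-zero exit code.
check_acp_manifest() {
  curl -fsS -o /dev/null "$(acp_manifest_url "$1")"
}

# Usage (URL copied from the example output; yours will differ):
#   check_acp_manifest "https://sales-bot-01-xyz.run.app" && echo "manifest reachable"
```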

Configuration File

Define agent configuration in agent.yaml:

yaml

# agent.yaml - Agent configuration file
 
name: sales-bot-01
type: negotiator
version: "1.2.0"
 
# LLM configuration
model:
  provider: openai
  model: gpt-4o
  temperature: 0.7
  max_tokens: 2048
 
# System prompt (inline or file reference)
system_prompt: |
  You are a sales agent for Acme Sports.
  Be helpful and friendly while protecting margins.
  Never go below the floor price.
  Suggest bundles when appropriate.
 
# Or reference a file
# system_prompt_file: ./prompts/negotiator.txt
 
# Pricing policy
pricing:
  policy_file: ./pricing-policy.yaml
  max_discount_percent: 20
  approval_threshold: 100
 
# Capabilities
capabilities:
  - negotiate
  - bundle
  - recommend
 
# Connections
integrations:
  catalog: default
  payments: stripe
  crm: salesforce
 
# Resource limits
resources:
  memory: 512Mi
  cpu: "1"
  max_instances: 10
  min_instances: 1
 
# Environment variables
env:
  LOG_LEVEL: info
  ENABLE_REASONING_LOG: "true"
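
A quick pre-flight check can catch obvious mistakes in agent.yaml before deploying. This sketch assumes that `name` and `type` are required and that `min_instances` must not exceed `max_instances`. Both rules are this example's assumptions, not documented validation behavior, and `check_agent_yaml` is a hypothetical helper.

```shell
# Pre-flight check for agent.yaml (a sketch; the rules are assumptions).
check_agent_yaml() {
  file=$1
  # Require the top-level name and type keys.
  for key in name type; do
    grep -q "^${key}:" "$file" || { echo "missing key: ${key}" >&2; return 1; }
  done
  # Read the first min_instances / max_instances values, if present.
  min=$(sed -n 's/^ *min_instances: *\([0-9][0-9]*\).*/\1/p' "$file" | head -n 1)
  max=$(sed -n 's/^ *max_instances: *\([0-9][0-9]*\).*/\1/p' "$file" | head -n 1)
  if [ -n "$min" ] && [ -n "$max" ] && [ "$min" -gt "$max" ]; then
    echo "min_instances ($min) exceeds max_instances ($max)" >&2
    return 1
  fi
  return 0
}

# Usage:
#   check_agent_yaml ./agents/sales-bot/agent.yaml && hyperfold agent deploy ./agents/sales-bot/
```

The check uses grep and sed rather than a YAML parser, so it only handles the simple flat layout shown above.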

Deployment Options

Customize deployment with command-line options:

bash

# Deploy with specific configuration
$ hyperfold agent deploy ./agents/sales-bot/ \
  --env=production \
  --region=us-central1
 
# Deploy with custom resource limits
$ hyperfold agent deploy --name="high-traffic-bot" \
  --type=negotiator \
  --memory=1Gi \
  --cpu=2 \
  --max-instances=50
 
# Deploy with environment variables
$ hyperfold agent deploy --name="sales-bot" \
  --type=negotiator \
  --set-env="LOG_LEVEL=debug" \
  --set-env="FEATURE_FLAG=enabled"
 
# Dry run to preview deployment
$ hyperfold agent deploy ./agents/sales-bot/ --dry-run
 
> [Preview] Would deploy agent with:
>   Name: sales-bot-01
>   Region: us-central1
>   Memory: 512Mi
>   Instances: 1-10 (auto-scale)
>   Estimated cost: $0.024/hour (at min scale)
 
# Deploy without waiting
$ hyperfold agent deploy ./agents/sales-bot/ --no-wait
 
> [Started] Deployment initiated. Check status with:
>   hyperfold agent get sales-bot-01

Deployment Environments

Environment	Use Case
`development`	Local testing, verbose logging
`staging`	Pre-production testing with production-like config
`production`	Live traffic, optimized settings
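
The environment names and the `--dry-run` preview can be combined into a small guard script. This is a sketch under two assumptions not stated above: that `--env` and `--dry-run` can be passed together, and that a failed preview exits with a non-zero status. `valid_env` and `deploy_to_env` are hypothetical helper names.

```shell
# Accept only the three environment names from the table above.
valid_env() {
  case "$1" in
    development|staging|production) return 0 ;;
    *) echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Validate the environment, preview with --dry-run, then deploy for real.
# Assumption: a failed preview exits non-zero.
deploy_to_env() {
  dir=$1
  env_name=$2
  valid_env "$env_name" || return 1
  hyperfold agent deploy "$dir" --env="$env_name" --dry-run || {
    echo "preview failed; skipping deploy" >&2
    return 1
  }
  hyperfold agent deploy "$dir" --env="$env_name"
}

# Usage:
#   deploy_to_env ./agents/sales-bot/ staging
```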

Scaling & Resources

Configure auto-scaling and resource allocation:

bash

# Configure auto-scaling
$ hyperfold agent scale sales-bot-01 \
  --min-instances=2 \
  --max-instances=20 \
  --target-cpu=70 \
  --target-memory=80
 
> [Scale] Updating scaling configuration...
✓ Scaling updated
 
  Min Instances:    2
  Max Instances:    20
  Scale-up Target:  CPU >70% or Memory >80%
  Scale-down Delay: 5 minutes
 
# Manual scaling
$ hyperfold agent scale sales-bot-01 --instances=5
 
> [Scale] Setting fixed instance count...
✓ Scaled to 5 instances
 
# View current scaling status
$ hyperfold agent scale sales-bot-01 --status
 
SCALING STATUS: sales-bot-01
  Mode:             auto
  Current:          3 instances
  Min:              2
  Max:              20
  CPU Utilization:  45%
  Memory:           62%
  Pending Scale:    none
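
The scale-up rule in the example output ("CPU >70% or Memory >80%") can be read as a simple either-or threshold test. The sketch below only illustrates that stated rule with integer percentages. It is not the platform's autoscaler, and `should_scale_up` is a hypothetical name.

```shell
# Return 0 (true) when either utilization figure exceeds its target.
# Defaults mirror the --target-cpu=70 and --target-memory=80 example above.
should_scale_up() {
  cpu=$1
  mem=$2
  target_cpu=${3:-70}
  target_mem=${4:-80}
  [ "$cpu" -gt "$target_cpu" ] || [ "$mem" -gt "$target_mem" ]
}

# Usage with the figures from the status output (CPU 45%, memory 62%):
#   should_scale_up 45 62 || echo "within targets, no scale-up"
```

The comparison is strict, so a reading exactly at the target (for example 70% CPU) does not count as exceeding it.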

Resource Presets

bash

# Resource allocation presets
$ hyperfold agent deploy --name="light-bot" --preset=small
  # 256Mi memory, 0.5 CPU, max 5 instances
 
$ hyperfold agent deploy --name="standard-bot" --preset=medium
  # 512Mi memory, 1 CPU, max 10 instances
 
$ hyperfold agent deploy --name="heavy-bot" --preset=large
  # 2Gi memory, 2 CPU, max 50 instances
 
Custom resource configuration in agent.yaml:

yaml

# agent.yaml - custom resource configuration
resources:
  memory: 1Gi
  cpu: "2"
  max_instances: 20
  min_instances: 2
 
  # Auto-scaling triggers
  scaling:
    target_cpu_percent: 70
    target_memory_percent: 80
    scale_down_delay: 300s
 
  # Concurrency per instance
  concurrency:
    max_concurrent_requests: 80
    request_timeout: 60s
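
To see what a preset stands for in terms of explicit options, the preset comments above can be written out as the `--memory`, `--cpu`, and `--max-instances` flags. This sketch assumes a preset is equivalent to those three flags and that `--cpu` accepts a fractional value. Neither is confirmed by the text, and `preset_flags` is a hypothetical helper.

```shell
# Print the explicit resource flags matching each documented preset.
# Values are copied from the preset comments above.
preset_flags() {
  case "$1" in
    small)  echo "--memory=256Mi --cpu=0.5 --max-instances=5" ;;
    medium) echo "--memory=512Mi --cpu=1 --max-instances=10" ;;
    large)  echo "--memory=2Gi --cpu=2 --max-instances=50" ;;
    *) echo "unknown preset: $1" >&2; return 1 ;;
  esac
}

# Usage:
#   preset_flags medium
```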

Rollback & Versioning

Manage deployment versions and rollbacks:

bash

# List deployment versions
$ hyperfold agent versions sales-bot-01
 
VERSION   DEPLOYED           STATUS     TRAFFIC
v3        2025-12-19 10:30   active     100%
v2        2025-12-18 14:00   available  0%
v1        2025-12-15 09:00   available  0%
 
# Rollback to previous version
$ hyperfold agent rollback sales-bot-01 --version=v2
 
> [Traffic] Shifting traffic to v2...
> [Verify] Health check passing...
✓ Rolled back to v2
 
# Gradual rollout (canary deployment)
$ hyperfold agent deploy ./agents/sales-bot/ --canary=10
 
> [Deploy] Deploying as v4...
> [Traffic] Routing 10% of traffic to v4...
✓ Canary deployment active
 
  Version v3: 90% traffic
  Version v4: 10% traffic (canary)
 
# Promote canary to full traffic
$ hyperfold agent promote sales-bot-01
 
> [Traffic] Shifting 100% traffic to v4...
✓ v4 promoted to production
 
# Abort canary and rollback
$ hyperfold agent rollback sales-bot-01 --abort-canary

Always roll out new production deployments as canary releases first. Routing only a small share of traffic to the new version limits the blast radius if issues occur.

See also: agent prompt, for configuring agent behavior.