Billing & Cost Analysis

Track spending, set budgets, and optimize your Hyperfold costs.

Overview

The billing dashboard provides complete visibility into your Hyperfold spending. Track costs by component, set budget alerts, and get AI-powered optimization recommendations.

Costs are updated hourly, so no view is truly real-time. For the earliest warning on spend spikes, set budget alerts with low thresholds.

Cost Breakdown

```bash
# View current billing summary
$ hyperfold billing summary

BILLING SUMMARY: January 2025
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PERIOD: Jan 1 - Jan 20, 2025

TOTAL SPEND               $4,892.50
├─ LLM Inference          $2,450.00  (50%)
├─ Agent Compute          $1,200.00  (25%)
├─ Storage & Database       $480.00  (10%)
├─ Network & Bandwidth      $320.00   (7%)
├─ Integrations             $242.50   (5%)
└─ Support & Services       $200.00   (4%)

PROJECTED MONTH-END       $7,645.00
BUDGET                    $8,000.00
STATUS                    ✓ On track

# Detailed breakdown
$ hyperfold billing breakdown --period=mtd

COST BREAKDOWN (Month to Date)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
LLM INFERENCE                            $2,450.00
  OpenAI GPT-4-Turbo    4.2M tokens      $1,680.00
  OpenAI GPT-4o         1.8M tokens        $540.00
  Embeddings            12M tokens         $230.00

AGENT COMPUTE                            $1,200.00
  sales-negotiator      142 hrs            $710.00
  fulfillment-agent     68 hrs             $340.00
  recommender-agent     30 hrs             $150.00

STORAGE                                    $480.00
  Vector Database       45 GB              $225.00
  Document Storage      120 GB             $180.00
  Logs & Analytics      50 GB               $75.00

INTEGRATIONS                               $242.50
  Shopify API calls     45,000              $90.00
  Stripe API calls      12,000              $72.00
  ShipStation           8,000               $80.50
```
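The PROJECTED MONTH-END figure extrapolates month-to-date spend to the end of the month. As a rough check, a straight-line extrapolation is easy to reproduce (an assumption about the method — the dashboard may weight recent days, so its projection can differ slightly from this sketch):

```python
import calendar
from datetime import date

def project_month_end(mtd_spend: float, as_of: date) -> float:
    """Linearly extrapolate month-to-date spend to the end of the month."""
    days_elapsed = as_of.day
    days_in_month = calendar.monthrange(as_of.year, as_of.month)[1]
    return mtd_spend / days_elapsed * days_in_month

# $4,892.50 spent through Jan 20 of a 31-day month
projection = project_month_end(4892.50, date(2025, 1, 20))
print(f"${projection:,.2f}")  # ~$7,583 — in the same range as the dashboard
```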

Cost Components

| Component     | What's Included                     |
|---------------|-------------------------------------|
| LLM Inference | GPT-4 tokens, embeddings, reasoning |
| Agent Compute | Container runtime, CPU, memory      |
| Storage       | Vector DB, documents, logs          |
| Integrations  | External API calls, webhooks        |

LLM Costs

LLM inference is typically the largest cost component. Analyze token usage to optimize spending:

```bash
# Detailed LLM usage analysis
$ hyperfold billing llm --since=7d

LLM USAGE (7 days)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TOTAL TOKENS: 8.4M    COST: $1,120.00

BY MODEL
  GPT-4-Turbo             5.2M tokens    $832.00  (74%)
  GPT-4o                  2.4M tokens    $216.00  (19%)
  text-embedding-3-large  0.8M tokens     $72.00   (6%)

BY AGENT
  sales-negotiator        6.1M tokens    $890.00  (79%)
  recommender-agent       1.8M tokens    $180.00  (16%)
  fulfillment-agent       0.5M tokens     $50.00   (4%)

BY OPERATION
  Negotiation reasoning   4.2M tokens    $672.00
  Product search          1.5M tokens    $135.00
  Quote generation        1.2M tokens    $168.00
  Customer context        0.9M tokens     $81.00
  Other                   0.6M tokens     $64.00

EFFICIENCY METRICS
  Avg tokens/session:     847
  Avg tokens/conversion:  2,541
  Cost/conversion:        $0.34
  Sessions/dollar:        3.8

# Per-session LLM costs
$ hyperfold billing llm --session=sess_abc123

SESSION: sess_abc123
  Duration: 32.5s
  Tokens:   1,247
  Cost:     $0.20
  Outcome:  conversion ($155.00)
  ROI:      775x
```
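The per-session ROI above is simply conversion revenue divided by LLM cost. A minimal sketch of these efficiency metrics (function names are illustrative, not a Hyperfold API):

```python
def cost_per_conversion(total_llm_cost: float, conversions: int) -> float:
    """Average LLM spend attributable to each conversion."""
    return total_llm_cost / conversions

def session_roi(conversion_revenue: float, llm_cost: float) -> float:
    """Revenue generated per dollar of LLM spend for one session."""
    return conversion_revenue / llm_cost

# The sess_abc123 example: a $155.00 conversion at $0.20 of LLM cost
roi = session_roi(155.00, 0.20)
print(f"{roi:.0f}x")  # 775x
```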

Budget Alerts

```bash
# Configure budget alerts
$ hyperfold billing budget set \
    --monthly=8000 \
    --alert-threshold=80

Budget configured:
  Monthly limit:  $8,000
  Alert at:       80% ($6,400)
  Current spend:  $4,892.50 (61%)

# Set component-specific budgets
$ hyperfold billing budget set \
    --component=llm \
    --monthly=3000 \
    --alert-threshold=90

$ hyperfold billing budget set \
    --component=compute \
    --monthly=2000 \
    --alert-threshold=85

# View budget status
$ hyperfold billing budget status

BUDGET STATUS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
COMPONENT   BUDGET    SPENT     REMAINING   STATUS
Overall     $8,000    $4,893    $3,107      ✓ 61%
LLM         $3,000    $2,450    $550        ⚠ 82%
Compute     $2,000    $1,200    $800        ✓ 60%
Storage     $800      $480      $320        ✓ 60%

ALERTS
⚠ LLM spend at 82% of budget
  Projected month-end: $3,850 (128% of budget)
  Recommendation: Review token usage or increase budget

# Budget alert notification settings
$ hyperfold billing budget alerts \
    --channels="slack:#finance,email:billing@company.com" \
    --frequency=daily
```

Cost Optimization

```bash
# Get cost optimization recommendations
$ hyperfold billing optimize

COST OPTIMIZATION RECOMMENDATIONS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1. SWITCH TO GPT-4O-MINI FOR SIMPLE TASKS
   Current: Using GPT-4-Turbo for all operations
   Analysis: 35% of requests are simple lookups
   Recommendation: Route simple queries to GPT-4o-mini
   Estimated savings: $420/month (17%)

2. ENABLE RESPONSE CACHING
   Current: No caching for similar queries
   Analysis: 12% query similarity detected
   Recommendation: Enable semantic caching
   Estimated savings: $180/month (7%)

3. OPTIMIZE AGENT SCALING
   Current: Min instances = 2 (24/7)
   Analysis: 2 AM - 6 AM traffic < 10 req/min
   Recommendation: Reduce min to 1 during off-hours
   Estimated savings: $150/month (6%)

4. REDUCE EMBEDDING DIMENSIONS
   Current: Using 3072-dimension embeddings
   Analysis: 1536 dimensions sufficient for your data
   Recommendation: Switch to smaller embeddings
   Estimated savings: $60/month (2%)

TOTAL POTENTIAL SAVINGS: $810/month (33%)
Apply all recommendations? [Y/n]

# Compare costs across periods
$ hyperfold billing compare --period1=dec --period2=jan

COST COMPARISON: December vs January
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
COMPONENT   DECEMBER   JANUARY   CHANGE
LLM         $2,100     $2,450    +$350 (+17%)
Compute     $980       $1,200    +$220 (+22%)
Storage     $420       $480      +$60  (+14%)
Total       $3,500     $4,130    +$630 (+18%)

DRIVERS
+ Session volume up 25%
+ Avg tokens/session up 8%
- Compute efficiency improved 5%

REVENUE COMPARISON
December revenue: $156,000
January revenue:  $198,000 (+27%)

Cost as % of revenue:
  December: 2.2%
  January:  2.1% (improved)
```
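The comparison report reduces to per-component percentage deltas plus cost as a share of revenue. A sketch of that arithmetic, using the figures above:

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from one period to the next."""
    return (new - old) / old * 100

def cost_ratio(cost: float, revenue: float) -> float:
    """Cost as a percentage of revenue."""
    return cost / revenue * 100

for name, dec, jan in [("LLM", 2100, 2450),
                       ("Compute", 980, 1200),
                       ("Storage", 420, 480)]:
    print(f"{name:8} {pct_change(dec, jan):+.0f}%")

print(f"Dec: {cost_ratio(3500, 156_000):.1f}% of revenue")  # 2.2%
print(f"Jan: {cost_ratio(4130, 198_000):.1f}% of revenue")  # 2.1%
```

Tracking cost as a percentage of revenue, rather than absolute spend, is what lets an 18% cost increase still count as an improvement here.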

Optimization Strategies

Model Selection

Use smaller, faster models for simple tasks. Route complex reasoning to GPT-4 only when needed.
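A minimal sketch of tiered routing: classify the request, then send only complex reasoning to the expensive model. The model names come from the reports above; the keyword classifier is a deliberately naive stand-in for real intent detection:

```python
# Assumed markers of simple lookups; a production router would classify intent.
SIMPLE_KEYWORDS = {"price", "stock", "status", "lookup", "track"}

def pick_model(query: str) -> str:
    """Route simple lookups to a cheaper model, reasoning to the larger one."""
    words = set(query.lower().split())
    if words & SIMPLE_KEYWORDS:
        return "gpt-4o-mini"
    return "gpt-4-turbo"

print(pick_model("what is the price of SKU-1042"))        # gpt-4o-mini
print(pick_model("negotiate a bulk discount on my quote"))  # gpt-4-turbo
```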

Response Caching

Cache responses for semantically similar queries. Reduces token usage without affecting quality.

Prompt Optimization

Shorter, more focused prompts use fewer tokens. Review verbose system prompts for trimming opportunities.
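To gauge whether trimming is worthwhile, estimate token counts before and after. The sketch below uses the common rough heuristic of ~4 characters per token for English text; for exact counts, use your model's tokenizer:

```python
def rough_tokens(text: str) -> int:
    """Crude token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

verbose = ("You are a helpful, friendly, knowledgeable assistant who always "
           "strives to provide thorough, detailed, comprehensive answers.")
trimmed = "You are a concise sales assistant."

saved = rough_tokens(verbose) - rough_tokens(trimmed)
print(f"~{saved} tokens saved per request")
```

Savings on the system prompt compound, since it is resent with every request.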

Smart Scaling

Reduce minimum instances during off-peak hours. Use scheduled scaling for predictable traffic patterns.
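The off-peak window and instance counts below come from the optimization report above; the schedule helper itself is an illustrative sketch, not a Hyperfold API:

```python
# Off-peak window from the optimization report: 2 AM - 6 AM (< 10 req/min)
OFF_PEAK_HOURS = range(2, 6)

def min_instances(hour: int, peak_min: int = 2, off_peak_min: int = 1) -> int:
    """Scheduled floor for the autoscaler: drop to one instance overnight."""
    return off_peak_min if hour in OFF_PEAK_HOURS else peak_min

print(min_instances(3))   # 1 (overnight floor)
print(min_instances(14))  # 2 (daytime floor)
```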

For infrastructure scaling configuration, see Auto-Scaling.