DC Detailing — Bill of Materials

MANJULAB Ohio Data Center · PersonaPlex + LLM Brain + RAG · 5 TPS Scale
Complete hardware, software, and service inventory with costs and status tracking.

Total CapEx (one-time): $43,400
Monthly OpEx: $880/mo
Annual OpEx: $10,560
Year 1 Grand Total: $53,960
Total Components: 38
Architecture Layers: 10
🖥️ Hardware — Infrastructure
Component | Description | Vendor | Model | Qty | Unit Cost | Total | Status | Notes
GPU Server | NVIDIA A100 Node (48 GB VRAM) | Dell / Supermicro | PowerEdge R750xa + A100 PCIe | 2 | $15,000 | $30,000 | Planned | 2x nodes for redundancy; primary PersonaPlex 7B INT8 inference
Rack Application Server | CPU App / Support Server (32-core, 128 GB RAM) | Dell | PowerEdge R550 | 2 | $3,500 | $7,000 | Planned | Hosts Redis, PostgreSQL, Prometheus, Grafana, Orchestrator
NAS Storage | Network Attached Storage (12-bay, 48 TB raw) | Synology | RS1221+ w/ 12x4TB HDD | 1 | $2,500 | $2,500 | Planned | RAG documents, transcript logs, database backups
Core Switch | SFP+ 10GbE Core Switch | MikroTik | CRS326-24S+2Q+ | 1 | $800 | $800 | Planned | 10GbE fabric; upgrade to 25GbE at >10 TPS
Firewall / Edge Router | Edge Firewall, NAT, VLAN, VPN | Fortinet | FortiGate 60F | 1 | $700 | $700 | Planned | TLS offload, DDoS protection, VLAN segmentation
UPS Power | Battery Backup Unit (per rack) | APC | Smart-UPS 3000VA RM 2U | 2 | $1,200 | $2,400 | Planned | 15-min bridge per rack; auto-shutdown on extended outage
Internet Connectivity | Business Fiber ISP (monthly OpEx) | AT&T / Spectrum Business | Business Fiber 1 Gbps | 1 | $500/mo | $500/mo | Planned | Monthly recurring, not counted in CapEx; upgrade to 10 Gbps at >10 TPS
🎙️ Voice Layer
Component | Description | Vendor | Model | Qty | Unit Cost | Total | Status | Notes
PersonaPlex 7B | INT8 Quantized LLM Voice Model | Meta / Community | PersonaPlex 7B INT8 | 1 | $0 | $0 | Planned | Full-duplex; 70ms speaker switch; runs on A100 GPU node
Mimi Encoder | Speech to Token Encoder (PCM 24kHz input) | Kyutai / Custom | Mimi Encoder v1 | 1 | $0 | $0 | Planned | Converts raw PCM audio frames to discrete speech tokens
Mimi Decoder | Token to Speech Decoder (PCM 24kHz output) | Kyutai / Custom | Mimi Decoder v1 | 1 | $0 | $0 | Planned | Synthesizes PCM speech output from token sequence
Temporal + Depth Transformer | Full-duplex Dual-stream Transformer | Custom | Temporal + Depth Transformer | 1 | $0 | $0 | Planned | Simultaneous listen + speak; 70ms context switch
Text Prompt Injector | LLM Answer to Voice Prompt Injector | Custom | Python Prompt Injector | 1 | $0 | $0 | Planned | Injects the Brain Layer LLM answer dynamically on each dialogue turn
🧠 Brain Layer (LLM API)
Component | Description | Vendor | Model | Qty | Unit Cost | Total | Status | Notes
GPT-4o mini | Primary LLM API (80% of traffic) | OpenAI | GPT-4o mini (2024-07-18) | 1 | $50/mo | $50/mo | Active | $0.15 in / $0.60 out per M tokens; majority of routine calls
Gemini 1.5 Flash | Budget LLM API (overflow / cheapest) | Google DeepMind | Gemini 1.5 Flash | 1 | $30/mo | $30/mo | Active | $0.075 in / $0.30 out per M tokens; lowest cost at volume
Claude Haiku | Quality LLM API (best quality/price ratio) | Anthropic | Claude 3 Haiku | 1 | $50/mo | $50/mo | Active | $0.25 in / $1.25 out per M tokens; nuanced quality tasks
GPT-4o / Claude Sonnet | Premium LLM API (20% complex tasks) | OpenAI / Anthropic | GPT-4o + Claude 3.5 Sonnet | 1 | $200/mo | $200/mo | Active | $2.50+ in / $10+ out per M tokens; complex reasoning
Smart LLM Router | Complexity-based LLM Request Router | Custom | Python Smart Router v1 | 1 | $0 | $0 | Planned | Routes 80% cheap / 20% premium based on prompt complexity score
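
The Smart LLM Router row describes routing by a prompt complexity score. A minimal sketch of that idea follows. The scoring heuristic, the `threshold` value, and the function names are all hypothetical, since the BOM does not specify them; the per-token prices are copied from the table, with the premium tier taken at its listed lower bound.

```python
# Prices in USD per million tokens (input, output), from the Brain Layer table.
# The premium tier is listed as "$2.50+ / $10+", so these are lower bounds.
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),
    "gemini-1.5-flash": (0.075, 0.30),
    "claude-3-haiku": (0.25, 1.25),
    "gpt-4o": (2.50, 10.00),
}

# Hypothetical keyword list; a real router would likely use a trained classifier.
REASONING_KEYWORDS = {"why", "compare", "explain", "analyze", "derive", "tradeoff"}


def complexity_score(prompt: str) -> float:
    """Return a heuristic score in [0, 1]; longer, reasoning-heavy prompts score higher."""
    words = [w.strip("?,.!") for w in prompt.lower().split()]
    length_part = min(len(words) / 200, 1.0)
    keyword_part = min(sum(w in REASONING_KEYWORDS for w in words) / 3, 1.0)
    return 0.5 * length_part + 0.5 * keyword_part


def route(prompt: str, threshold: float = 0.3) -> str:
    """Send prompts at or above the threshold to the premium model, the rest to the cheap one."""
    return "gpt-4o" if complexity_score(prompt) >= threshold else "gpt-4o-mini"


def request_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Cost in USD of one request at the table's per-million-token prices."""
    price_in, price_out = PRICES[model]
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000
```

A fixed threshold does not by itself guarantee the 80/20 split; in practice the threshold would need tuning against observed traffic.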
📚 Knowledge Layer (RAG)
Component | Description | Vendor | Model | Qty | Total | Status | Notes
RAG Pipeline | Retrieval-Augmented Generation Orchestrator | Custom | LangChain / LlamaIndex | 1 | $0 | Planned | Orchestrates embedding, retrieval, re-ranking, answer grounding
BGE-small Embedder | Text Embedding Model (384-dim vectors) | BAAI | BGE-small-en-v1.5 | 1 | $0 | Planned | Self-hosted; 33M params; CPU-inferrable at low latency
Qdrant / FAISS | Vector Database for Embedding Search | Qdrant / Meta FAIR | Qdrant CE / FAISS v1.7 | 1 | $0 | Planned | Stores 384-dim vectors; ANN search <50ms target
Re-ranker | Cross-encoder Result Re-ranker | HuggingFace / Custom | ms-marco-MiniLM-L-6-v2 | 1 | $0 | Planned | Improves top-K retrieval precision before answer injection
Document Store | Source Docs Ingest (PDF / FAQ / CRM / Webhooks) | Custom / MinIO | MinIO OSS (S3-compatible) | 1 | $0 | Planned | Holds raw knowledge corpus; event-triggered re-indexing
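
The Knowledge Layer rows describe a retrieve-then-re-rank flow: embed the query, fetch the nearest vectors, then re-score the candidates before answer injection. The sketch below shows that two-stage shape in plain Python. It is an illustration under stated assumptions, not the planned pipeline: vectors are toy 3-dim lists rather than 384-dim BGE embeddings, the search is brute force rather than ANN, and `overlap_score` is a hypothetical stand-in for the cross-encoder.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, index, top_k=3):
    """Stage 1: brute-force nearest neighbours over (doc_id, vector) pairs."""
    scored = sorted(((cosine(query_vec, vec), doc_id) for doc_id, vec in index), reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]


def overlap_score(query, text):
    """Stand-in for a cross-encoder: fraction of query words present in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def rerank(query, doc_ids, docs, score_fn=overlap_score, top_n=1):
    """Stage 2: re-score the retrieved candidates and keep the best top_n."""
    return sorted(doc_ids, key=lambda d: score_fn(query, docs[d]), reverse=True)[:top_n]


# Toy corpus and index (hypothetical data for illustration only).
docs = {
    "faq-hours": "support hours are 9am to 5pm",
    "faq-billing": "billing runs on the first of the month",
    "faq-reset": "reset your password from the login page",
}
index = [
    ("faq-hours", [0.9, 0.1, 0.0]),
    ("faq-billing", [0.8, 0.6, 0.0]),
    ("faq-reset", [0.0, 0.0, 1.0]),
]

candidates = retrieve([1.0, 0.2, 0.0], index, top_k=2)
best = rerank("when are support hours", candidates, docs)
```

Keeping the two stages as separate functions mirrors the table, where the vector database and the re-ranker are separate components that can be swapped independently.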
🛠️ Support Services
Component | Description | Vendor | Model | Qty | Total | Status | Notes
Redis | In-memory Cache & Session Layer | Redis Labs | Redis CE 7.2 | 1 | $0 | Planned | Session state, pub/sub audio events, rate-limit counters
PostgreSQL | Primary Relational Database | PostgreSQL Global Dev Group | PostgreSQL 16 | 1 | $0 | Planned | Users, transcripts, billing records, system configuration
Prometheus | Metrics Scraping & Alerting | CNCF | Prometheus 2.48 | 1 | $0 | Planned | Scrapes GPU util, latency, token usage; AlertManager integration
Grafana | Metrics Visualization & Ops Dashboard | Grafana Labs | Grafana CE 10.3 | 1 | $0 | Planned | Real-time dashboards: GPU%, p99 latency, LLM cost/min, margin
Transcript Logger | Conversation Analytics & Logging Service | Custom | Python FastAPI Logger | 1 | $0 | Planned | Stores, indexes, and exports all call transcripts to PostgreSQL
Admin Dashboard | Knowledge-Base Management UI | Custom | React + FastAPI Admin | 1 | $0 | Planned | Manage KB docs, model routing config, usage reporting
⚙️ Session Orchestrator
Component | Description | Vendor | Model | Qty | Total | Status | Notes
Session Orchestrator | Audio Routing + GPU/LLM Orchestrator | Custom | Python 3.11 AsyncIO Service | 1 | $0 | Planned | Routes PCM audio, intercepts monologue, injects LLM response dynamically
🌐 Gateway Layer
Component | Description | Vendor | Model | Qty | Total | Status | Notes
nginx Gateway | TLS Termination + WebSocket Reverse Proxy | nginx Inc. | nginx OSS 1.25 | 1 | $0 | Planned | Rate limit: 5 concurrent conns; wss:// proxy; upgrade at scale
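
The gateway row caps traffic at 5 concurrent connections. In nginx itself such a cap would normally be set with the `limit_conn` and `limit_conn_zone` directives; the Python sketch below only illustrates the intended behavior (reject new connections once the cap is reached, rather than queue them). The class and method names are hypothetical.

```python
class ConnectionLimiter:
    """Track active connections and refuse new ones above a fixed cap."""

    def __init__(self, max_conns: int = 5):
        self.max_conns = max_conns
        self.active = 0

    def try_acquire(self) -> bool:
        """Admit a connection if there is capacity; return False otherwise."""
        if self.active >= self.max_conns:
            return False
        self.active += 1
        return True

    def release(self) -> None:
        """Free one slot when a connection closes."""
        if self.active > 0:
            self.active -= 1


limiter = ConnectionLimiter(max_conns=5)
admitted = [limiter.try_acquire() for _ in range(6)]  # sixth attempt exceeds the cap
limiter.release()                                      # one session ends
admitted_after_release = limiter.try_acquire()         # a slot is free again
```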
💻 Client Layer
Component | Description | Vendor | Model | Qty | Total | Status | Notes
WebRTC / WebSocket | Real-time Browser Voice Interface | W3C / Custom JS | Browser WebRTC + WebSocket Stack | 1 | $0 | Active | PCM 24kHz bidirectional; target <100ms glass-to-glass latency
Twilio SIP Trunk | PSTN / Phone-In Integration (monthly OpEx) | Twilio | Elastic SIP Trunking | 1 | $50/mo | Active | Monthly recurring; ~$0.01/min inbound PSTN; ~5k min/mo at 5 TPS
Web Voice Widget | Embeddable Browser Voice Widget | Custom | JS / React Embeddable Widget | 1 | $0 | Planned | Drop-in <script> embed for client websites; mobile-responsive
🔒 Security
Component | Description | Vendor | Model | Qty | Total | Status | Notes
Identity & JWT Auth | Authentication, Authorization & RBAC | Custom | Python-jose + FastAPI Auth | 1 | $0 | Planned | JWT issuance, refresh, RBAC, per-tenant API key management
HashiCorp Vault | Secrets & API Key Manager | HashiCorp | Vault CE 1.15 | 1 | $0 | Planned | Stores LLM API keys, DB creds, TLS private keys securely
TLS Certificates | SSL/TLS Cert Automation (all layers) | Let's Encrypt / EFF | Certbot + ACME v2 | 1 | $0 | Active | Auto-renewing 90-day wildcard certs via DNS challenge; zero cost
🐳 Deployment
Component | Description | Vendor | Model | Qty | Total | Status | Notes
Docker Compose | Dev & Staging Container Orchestration | Docker Inc. | Docker CE + Compose v2 | 1 | $0 | Planned | All services in docker-compose.yml; quick local iteration
Kubernetes (K3s) | Production Multi-node Container Orchestration | CNCF / Rancher Labs | K3s v1.29 / kubeadm | 1 | $0 | Planned | HPA for voice pods; rolling deploys; upgrade to 25GbE at 10 TPS

💰 COST SUMMARY — MANJULAB Ohio DC · PersonaPlex + LLM Brain + RAG · 5 TPS

Cost Category | Budget Type | Amount | Notes
Hardware: servers, switch, firewall, UPS, NAS (excl. monthly ISP) | CapEx (one-time) | $43,400 | One-time hardware; the monthly ISP charge is counted under OpEx
Voice Layer: model weights (open source) | CapEx (one-time) | $0 | Open-source weights; $0 license cost
Knowledge Layer: OSS RAG pipeline + vector DB | CapEx (one-time) | $0 | LangChain / Qdrant / BGE-small, all OSS
Support Services: OSS stack (Redis, PG, Prometheus, Grafana) | CapEx (one-time) | $0 | All open-source; $0 license
Session Orchestrator + Gateway + Client + Security + Deployment | CapEx (one-time) | $0 | All OSS; dev labor cost separate
TOTAL CapEx (one-time hardware + setup) | CapEx | $43,400 | Upfront investment for MANJULAB Ohio
── Monthly Operating Expenses ──
LLM APIs: GPT-4o mini + Gemini Flash + Claude Haiku + Premium | OpEx (monthly) | $330 | Monthly API spend at 5 TPS; scales with volume
Internet / ISP: Business Fiber 1 Gbps | OpEx (monthly) | $500 | Recurring monthly; upgrade to 10 Gbps at >10 TPS
Twilio SIP: PSTN inbound (monthly estimate) | OpEx (monthly) | $50 | ~$0.01/min; ~5,000 min/mo at 5 TPS
TOTAL Monthly OpEx (APIs + ISP + Twilio) | OpEx / month | $880 | Monthly recurring spend
TOTAL Annual OpEx (× 12 months) | OpEx / year | $10,560 | Annual recurring spend estimate
YEAR 1 GRAND TOTAL (CapEx + Annual OpEx) | Year 1 Total | $53,960 | Full Year 1: build-out + 12 months operations
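
The totals above can be re-derived from the line items. The short script below is a sanity check using only quantities and prices listed in this BOM; the dictionary layout and variable names are illustrative choices, not part of any existing tooling.

```python
# One-time hardware: (quantity, unit cost in USD), from the Hardware table.
HARDWARE = {
    "GPU Server": (2, 15_000),
    "Rack Application Server": (2, 3_500),
    "NAS Storage": (1, 2_500),
    "Core Switch": (1, 800),
    "Firewall / Edge Router": (1, 700),
    "UPS Power": (2, 1_200),
}

# Monthly recurring items in USD, from the Brain Layer, Hardware (ISP), and Client tables.
MONTHLY = {
    "GPT-4o mini": 50,
    "Gemini 1.5 Flash": 30,
    "Claude Haiku": 50,
    "GPT-4o / Claude Sonnet": 200,
    "Business Fiber 1 Gbps": 500,
    "Twilio SIP Trunk": 50,
}

capex = sum(qty * unit for qty, unit in HARDWARE.values())
monthly_opex = sum(MONTHLY.values())
annual_opex = monthly_opex * 12
year1_total = capex + annual_opex

print(f"CapEx:        ${capex:,}")         # $43,400
print(f"Monthly OpEx: ${monthly_opex:,}")  # $880
print(f"Annual OpEx:  ${annual_opex:,}")   # $10,560
print(f"Year 1 total: ${year1_total:,}")   # $53,960
```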

🚀 Future Expansion Notes

  1. 25GbE Network Fabric: Upgrade the MikroTik 10GbE switch to a CRS354 or Arista 7050SX3 at >10 TPS or for video workloads.
  2. GPU Redundancy: Add a further A100 node (beyond the two already planned) for active-active GPU redundancy at 10 TPS.
  3. Gateway Scale: Migrate nginx OSS to nginx Plus or HAProxy Enterprise at >500 concurrent sessions.
  4. ISP Upgrade: Upgrade to 10 Gbps symmetric fiber for low-latency multi-tenant streaming.
  5. GPU Operator: Add Kubernetes GPU Operator for dynamic scheduling across GPU nodes.
  6. Global CDN: Add Cloudflare CDN + TURN/STUN relay for global WebRTC client scale.
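
The numeric triggers in the notes above can be collected into a single check. The function below is a sketch only: its name, signature, and the wording of the returned labels are hypothetical, while the thresholds (>10 TPS, 10 TPS, >500 concurrent sessions) come from this document. The ISP threshold is taken from the Hardware table note, and items without a stated trigger (GPU Operator, CDN) are left out.

```python
def recommended_upgrades(tps: float, concurrent_sessions: int) -> list[str]:
    """Return the expansion items whose stated trigger has been reached."""
    upgrades = []
    if tps > 10:
        upgrades.append("25GbE network fabric")
        upgrades.append("10 Gbps ISP link")
    if tps >= 10:
        upgrades.append("additional A100 node")
    if concurrent_sessions > 500:
        upgrades.append("nginx Plus or HAProxy Enterprise gateway")
    return upgrades


# At the current 5 TPS design point nothing is triggered.
current = recommended_upgrades(tps=5, concurrent_sessions=5)
# A heavier load triggers every threshold-based item.
scaled = recommended_upgrades(tps=12, concurrent_sessions=600)
```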