DOC-DC-003 Rev 1.0 May 2026

PersonaPlex on Kubernetes
Lean, Practical AI Infrastructure Design

A real-world Kubernetes + Helm architecture for PersonaPlex-class conversational AI. It starts on a single GPU machine and scales incrementally, with no over-engineering. Authored by ManjuLAB Infrastructure Engineering.

Core Design Philosophy

Start with one GPU machine running K3s. Add complexity only when you have the users to justify it. The GPU is the only expensive component; everything else in the stack (K3s, vLLM, Qdrant, Redis) is open source and free to use. No Kafka, no microservice explosion, no enterprise waste.