Writing
Blog
Production AI systems, backend engineering at scale, and hard lessons from shipping ML infrastructure that actually works.
Featured
RAG, pgvector, AI Infrastructure, AI Generated
Production RAG with pgvector: What Nobody Tells You
Hard lessons about latency, cost, and correctness that most tutorials skip, learned from building RAG systems at scale and processing millions of embeddings with pgvector.
May 2025, 8 min
MLOps, Kubernetes, CI/CD, AI Generated
How We Reduced AI Service Deployment Time by 99.5%
Deploying a new AI inference service took two weeks. We cut that to 30 minutes. Here is the architecture, the tooling, and the lessons learned.
Feb 2025, 12 min
All Posts
More writing on Medium
Articles on AI systems, backend engineering, and software craft.