Senior AI Platform Engineer (Kubernetes & LLMOps)
Pelham Berkeley Search
Job Description
Job Title: Senior AI Platform Engineer (Kubernetes & LLMOps)
Location: New York, NY
The Mandate
We are building the enterprise-grade backbone that powers Generative AI across the firm. We need a Senior Platform Engineer to architect the internal "AI Factory": the self-service platform that enables our quant, product, and tech teams to deploy GenAI agents and LLMs into production.
This is not a data science role. It is a hard-core engineering role focused on infrastructure, scalability, and developer experience within a highly regulated financial environment. You will bridge the gap between raw Kubernetes infrastructure and high-level AI application logic.
The Tech Stack
- Core: Python (FastAPI/Flask), Go.
- Infra: Kubernetes (OpenShift), Helm, Kustomize, Docker.
- AI/Data: LangChain, LangGraph, Vector Databases, Kafka, Redis.
- Ops: GitOps (ArgoCD), Jenkins, Prometheus, Grafana, OpenTelemetry.
What You Will Build
- The AI Platform: Architect reusable, containerized blueprints for deploying GenAI solutions. You will define how the firm deploys Agents, RAG pipelines, and fine-tuned models on OpenShift.
- Agentic Infrastructure: Design the orchestration layer for multi-agent workflows (using frameworks like LangGraph) to ensure they can run securely and reliably in production.
- Developer Experience: Abstract away the complexity of K8s and security compliance. Build the tooling and APIs that allow internal teams to spin up AI environments in minutes, not weeks.
- Enterprise Hardening: Solve for the specific constraints of NYC finance by implementing rigorous OAuth 2.0 flows, secret management, air-gapped model serving, and immutable audit logs.
- Observability: Implement deep tracing for LLM performance (token usage, latency, drift) using OpenTelemetry and Cortex.
The Candidate Profile
- Experience: 10+ years in software engineering with 5+ years dedicated to distributed systems or platform engineering.
- Industry Context: Experience working in Investment Banking, Fintech, or High-Frequency Trading environments. You understand that "speed" cannot come at the expense of "compliance."
- Kubernetes Mastery: You aren't just a user; you understand the internals of K8s, operators, and mesh networking.
- GenAI Reality: 1+ years of hands-on work with LLMs. You have moved beyond "Hello World" tutorials and understand the pain points of state management in LangChain or hosting embeddings at scale.
- Security First: Deep knowledge of RBAC, OIDC, and secure coding practices required for handling sensitive financial data.
Why This Role?
You won't just be building chatbots; you will be defining the engineering standards for how a major financial institution adopts AI for the next decade.
Job Type: Full Time
