LiteLLM is an open-source LLM Gateway with 34K+ stars on GitHub, trusted by companies like NASA, Rocket Money, Samsara,
Lemonade, and Adobe. We're expanding rapidly and seeking our 6th engineer, focused on owning reliability, performance, and
infrastructure stability for the LiteLLM proxy.
LiteLLM provides an open-source Python SDK and a Python FastAPI server that let you call 100+ LLM APIs (Bedrock, Azure,
OpenAI, VertexAI, Cohere, Anthropic) in the OpenAI format.
We just hit $6M ARR and have raised a $1.6M seed round from Y Combinator, Gravity Fund, and Pioneer Fund. You can find more
information on our website, GitHub, and technical documentation.
Companies adopt LiteLLM Enterprise once they put LiteLLM into production and need enterprise features like Prometheus metrics
(production monitoring), or need to give LLM access to a large number of people via SSO (single sign-on) or JWT (JSON Web
Token) authentication.
Skills: Python, FastAPI, PostgreSQL, Redis, Kubernetes, Prometheus, performance profiling
As the SRE, you'll own the reliability and performance of the LiteLLM proxy in production. Our users run LiteLLM as a
critical gateway handling millions of LLM requests — when it goes down, their entire AI stack goes down. You'll work
directly with the CEO and CTO on critical projects including:
Memory growth from buffers in spend log transactions
Database queries running extremely slowly; Prisma connection pool exhaustion
Distributed locks never released (Redis reset required); PodLockManager releasing another pod's lock; in-memory cache increment race conditions
Health check fan-out overloading startup
Caching across in-memory and Redis layers
Improving observability for multi-pod deployments
Implementing proper health checks
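To illustrate one of the failure modes above, an in-memory cache increment race arises whenever an async task awaits I/O between reading and writing a shared counter. The sketch below is a hypothetical illustration in pure asyncio, not LiteLLM's actual code; the `asyncio.sleep(0)` stands in for an awaited call (e.g. to Redis) at the worst possible moment:

```python
import asyncio

async def run_counter(n: int, use_lock: bool) -> int:
    """Increment a shared counter from n concurrent tasks."""
    counter = 0
    lock = asyncio.Lock()

    async def increment():
        nonlocal counter
        if use_lock:
            # Lock covers the whole read-modify-write, so the yield
            # point inside it can no longer interleave two updates.
            async with lock:
                current = counter
                await asyncio.sleep(0)  # stand-in for an awaited I/O call
                counter = current + 1
        else:
            # Racy: every task can read the same stale value before
            # any of them writes back, losing increments.
            current = counter
            await asyncio.sleep(0)  # task yields mid read-modify-write
            counter = current + 1

    await asyncio.gather(*(increment() for _ in range(n)))
    return counter

lost = asyncio.run(run_counter(100, use_lock=False))  # far fewer than 100
safe = asyncio.run(run_counter(100, use_lock=True))   # exactly 100
```

The same reasoning applies across pods: a per-process `asyncio.Lock` only protects one replica's memory, which is why multi-pod counters typically move to atomic Redis operations such as `INCR`.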
The tech stack includes Python, FastAPI, Redis, Postgres, Prisma ORM, Kubernetes, Prometheus, Docker.
1-4 years of experience running Python services in production at scale
Experience debugging OOMs, memory leaks, connection pool issues, and race conditions
Comfortable with PostgreSQL (query optimization, connection pooling, PgBouncer) and Redis
Kubernetes experience — you've dealt with pod restarts, resource limits, health probes, and multi-replica coordination
Familiarity with Prometheus/Grafana for monitoring and alerting
Passion for open source and user engagement
Strong work ethic and ability to thrive in small teams
Eagerness to talk to users and help solve real problems — our GitHub issues are full of production debugging sessions and
you'd be jumping into those directly
As a founding engineer, you'll help migrate key systems to aiohttp, handle LLM provider-specific quirks (like Azure role handling), and standardize LLM responses to the OpenAI spec.
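Standardizing provider responses to the OpenAI spec means mapping each provider's response shape onto the OpenAI chat-completion shape. A minimal sketch for an Anthropic-style response, assuming the public field names of both APIs (this is an illustration, not LiteLLM's actual translation layer):

```python
# Map Anthropic stop reasons onto OpenAI finish reasons (assumed mapping).
STOP_REASON_MAP = {"end_turn": "stop", "max_tokens": "length", "tool_use": "tool_calls"}

def to_openai_format(resp: dict, model: str) -> dict:
    """Convert an Anthropic-style Messages response into the OpenAI
    chat-completion shape."""
    # Anthropic returns a list of content blocks; keep the text ones.
    text = "".join(b["text"] for b in resp["content"] if b.get("type") == "text")
    usage = resp.get("usage", {})
    return {
        "object": "chat.completion",
        "model": model,
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": text},
            "finish_reason": STOP_REASON_MAP.get(resp.get("stop_reason"), "stop"),
        }],
        # Rename token counters to the OpenAI usage field names.
        "usage": {
            "prompt_tokens": usage.get("input_tokens", 0),
            "completion_tokens": usage.get("output_tokens", 0),
            "total_tokens": usage.get("input_tokens", 0) + usage.get("output_tokens", 0),
        },
    }

example = {
    "content": [{"type": "text", "text": "Hello!"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 9, "output_tokens": 3},
}
out = to_openai_format(example, "claude-3")
```

Callers can then consume any provider through the one `choices[0].message.content` path, which is the point of the OpenAI-format gateway.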
Full-time | San Francisco, CA, US | Backend | $120K - $180K | 0.25% - 0.75% equity | 1+ years
Full-time | San Francisco, CA, US | $80K - $100K | Any experience (new grads ok)
Full-time | San Francisco, CA, US | Backend | $120K - $180K | 0.25% - 0.75% equity | 1+ years
Full-time | San Francisco, CA, US | Backend | $150K - $200K | 0.25% - 0.75% equity | 1+ years
Full-time | San Francisco, CA, US | $80K - $120K | Any experience (new grads ok)
Full-time | San Francisco, CA, US | $100K - $200K | 0.05% - 0.50% equity | 3+ years