LiteLLM (https://github.com/BerriAI/litellm) is an open-source LLM Gateway with 28K+ stars on GitHub, trusted by companies like NASA, Rocket Money, Samsara, Lemonade, Adobe, Twilio, and Siemens. It provides a Python SDK and a Python FastAPI proxy server that let you call 100+ LLM APIs (Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic) in the OpenAI format.

We just hit $2.5M ARR and have raised a $1.6M seed round from Y Combinator, Gravity Fund, and Pioneer Fund. You can find more information on our website, GitHub, and technical documentation.

We're based in San Francisco and hiring a Python performance engineer to own maximizing throughput, minimizing latency, and keeping the platform reliable in production as we scale it to handle 5K RPS (requests per second).
Roadmap for Performance Engineer:
As a founding engineer, you will help migrate key systems to aiohttp, handle LLM provider-specific quirks (like Azure's role handling), and standardize provider APIs to the OpenAI spec.
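Standardizing providers to the OpenAI spec mostly means mapping each provider's response payload onto the OpenAI chat-completion schema. A minimal sketch of that idea — all function and field names here are illustrative assumptions, not LiteLLM's actual internals:

```python
# Hypothetical sketch: normalize a provider-specific chat response into the
# OpenAI chat-completion shape. Field names on the provider side are assumed.

def to_openai_format(provider_response: dict) -> dict:
    """Map an assumed provider payload onto the OpenAI chat-completion schema."""
    return {
        "object": "chat.completion",
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    # Providers name the text field differently
                    # ("output_text", "completion", ...); take the first present.
                    "content": provider_response.get("output_text")
                    or provider_response.get("completion", ""),
                },
                "finish_reason": provider_response.get("stop_reason", "stop"),
            }
        ],
        "usage": {
            "prompt_tokens": provider_response.get("input_tokens", 0),
            "completion_tokens": provider_response.get("output_tokens", 0),
        },
    }

# Example: an Anthropic-style payload normalized to the OpenAI shape.
raw = {"completion": "Hello!", "stop_reason": "end_turn",
       "input_tokens": 5, "output_tokens": 3}
print(to_openai_format(raw)["choices"][0]["message"]["content"])  # -> Hello!
```

Callers can then consume every provider through one response shape, which is the core of the OpenAI-spec standardization work described above.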
Open roles:
- Full-time · San Francisco, CA, US / Remote (US) · Backend · $150 - $200 · 0.50% - 3.00% · Any (new grads ok)
- Full-time · San Francisco, CA, US · Backend · $160K - $220K · 0.50% - 3.00% · 1+ years
- Contract · San Francisco, CA, US / Remote · $40K - $60K · 1+ years
- Full-time · San Francisco, CA, US · $100K - $200K · 0.05% - 0.50% · 3+ years
- Full-time · San Francisco, CA, US · Full stack · $160K - $220K · 0.50% - 1.50% · 1+ years