Machine learning can now do some extraordinary things: it can understand the world, drive cars, write code, make art.
But it is still extremely hard to use. Research is typically published as a PDF, with scraps of code on GitHub and weights on Google Drive (if you’re lucky!). It is near-impossible to take that work and apply it to a real-world problem unless you are an expert.
We’re making machine learning accessible to everyone. People creating machine learning models should be able to share them in a way that other people can use, and people who want to use machine learning should be able to do it without getting a PhD.
With great power also comes great responsibility. We believe that with better tools and safeguards, we will make this powerful technology safer and easier to understand.
We're a bunch of hackers, engineers, researchers, and artists.
We obsess about the details of API design and the right words for things. We're defining how AI works so we'd better get it right.
We make fast and reliable infrastructure. That's what a good infrastructure product is. We're not afraid to build things from scratch to make it the fastest.
We use AI for work. We use AI for play. We find unexplored parts of the map and create new techniques ourselves. We open-source it all.
We build in public, for the community. We want AI to work like open-source software so everyone benefits from it.
We're led by engineers. We all write code. (Or, we get ChatGPT to help.) There aren’t any meetings about meetings.
We've worked at places like Docker, Dropbox, GitHub, Heroku, NVIDIA, Scale AI, and Spotify. We've created technologies like Docker Compose and OpenAPI.
We're here to build a big company. We're ambitious and hard-working. We're not here to just build nice things.
At Replicate, we believe AI shouldn’t be exclusive to tech giants — it should be accessible to every software developer. Our goal is straightforward: build the best platform for creating, deploying, and running machine learning models. As an Infrastructure Engineer on the Platform team, you’ll play a key role in making generative AI available to everyone.
The Platform team at Replicate oversees the entire lifecycle of models, from packaging and deployment to serving, scaling, and monitoring. You’ll be developing the infrastructure that supports thousands of models and powers millions of predictions daily. This is a chance to build something truly innovative, where each decision you make has a tangible impact and allows your creativity to shine.
What you’ll be doing:
We're looking for the right person, not just someone who checks boxes, but it’s likely you have…
These aren’t hard requirements, but we definitely want to talk with you if…
This role can be remote (anywhere in the United States) or in-person. We have a strong preference for people in the Pacific time zone. If possible, we like people to come into our San Francisco office at least 3 days a week.
We have a web product (currently React + Django), an open-source CLI (Go + Python), and Kubernetes ML serving infrastructure.