Site Reliability Engineer
PostHog is open source product analytics, built for developers. Automate the collection of every event on your website or app, with no need to send data to third parties. It takes one click to deploy on your own infrastructure, with full API/SQL access to the underlying data.
We're growing rapidly, and always looking for people who can get things done.
PostHog is here to increase the number of successful products that exist in the world. Right now we are focusing on helping product creators understand how people are using their product: where things go well, and where things don't go so well.
We have built a great service for understanding how users use products. Now we need to scale it and make sure it is rock solid. You will be joining a rapidly growing team, helping us support the services that allow us to absorb billions of events without loss, even during crazy peaks. You will also help keep our query engine up under high load. We also need to build out the infrastructure needed to deploy this stack in customers' VPCs and equivalents, so that they can maintain control over their data without any loss of quality of experience on PostHog.
PostHog is a well-funded, Y Combinator and GV backed startup from the W20 batch and is growing rapidly. We love open source, data, and crafting beautiful products that are easy to use and provide clear value.
If you are interested in software infrastructure on Kubernetes and excited to work with an experienced software development team, this job is for you.
Here is what you will be working on:
- Building AWS, GCP, Azure (and more!) deployment automation for delivering PostHog to ourselves and customers
- Troubleshooting networking, compute, and Kubernetes failures.
- Improving performance and robustness of cloud and customer deployments.
- Hardening security all around.
- Working on large-scale OLAP databases with huge datasets (we use ClickHouse and love it)
- Automating the scaling of ClickHouse for ourselves and our clients
- Improving how we manage and scale Kafka
What you'll bring:
- Strong experience in Linux systems administration, networking, and performance troubleshooting.
- Running and troubleshooting Kubernetes clusters, containerized networking and applications.
- Intermediate knowledge of Golang, Rust, or Python.
- Infrastructure and application security engineering experience is a plus.
- Cloud infrastructure automation using Kubernetes
- ClickHouse experience is a plus
What to expect once you apply:
- We will send you a 30-minute SRE quiz
- You will join a 30-minute intro call, where we walk you through our culture, compensation, the interview process, and requirements.
- Technical interview with the hiring team. This is usually 2 PostHog team members spending 45-60 minutes in conversation with you.
- PostHog SuperDay - this is a paid day of working with us, which we will fit around your schedule and have you work on something SRE related
What we offer in return:
- Generous compensation
- Unlimited, permissionless vacation with a 25 day minimum
- Health insurance including dental and vision provided in the US and the UK
- Generous parental leave
- Visa sponsorship if needed, for you and your loved ones
- Training budget
- $200/month budget towards coworking or café working
- Carbon offsetting for work travel with Project Wren
- Free books
Please note that benefits vary slightly by country. If you have any questions, please don't hesitate to ask our team.
Sold? Apply now
- Drop us a line and tell us:
  - In a few sentences, how you can achieve the above
  - Why you're drawn to us
  - Your resumé and/or LinkedIn
Not sold? Learn more first
- How we hire
- We ask for your best work, and in return pay generously and have exceptional benefits
- Learn about the team you'd be working with
- Getting hiring right is key to diversity. Learn about how we think about this.
Completely open source, Python/Django/React/ClickHouse.