Sr. Site Reliability Engineer (Remote NA - East) at authzed (W21)

$150K - $195K

Cloud Infrastructure for Authorization

US / CA / Remote (US; CA)

Full-time

US citizen/visa only

3+ years

Apply now

About authzed

We’re pioneering open-source authorization solutions for scaling businesses tackling complex end-user permissions in zero-trust architectures. Our focus is on providing SpiceDB—the most mature open-source permissions database inspired by Google’s Zanzibar system—and building managed services that enable planet-scale production authorization services.

Our strategic approach to capital-raising has empowered us to efficiently utilize our $3.9M seed fund and recently secure a $12M Series A. This funding has allowed us to further develop SpiceDB, now the open-source standard in authorization database technology, fortify our reputation as authorization experts, accelerate our open-source community growth, and scale revenue with robust enterprise products.

AuthZed is a fully remote company with employees across the US and Europe. We’re a hardworking group with a software-driven culture; even our sales team understands and loves our technology! We bring integrity to all our interactions, fostering confidence in decision making - trusting and respecting each voice on our team, every day.

Company Values

Agency
- Everyone should have the capability, freedom, and confidence to bring about changes to our business and product. Organizational processes exist to clearly define our goals, but not restrict how progress is made.
Collaboration
- Success is defined in various dimensions and no single person can be an expert in all of them. Without valuing the opinions of others, finding compromises, and sharing mutual trust and respect, you cannot arrive at the best possible solution.
Open-mindedness
- Without asking questions, testing assumptions, and questioning our pre-existing biases, we risk operating within an echo chamber. We celebrate the representation of diverse perspectives and backgrounds as a catalyst for creating an inclusive work environment that everyone can appreciate.

About the role

Skills: Git, Kubernetes, SQL, Distributed Systems

Job Summary:

We are seeking a Site Reliability Engineer to join our tech startup in the infrastructure and authorization space. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, availability, and performance of our systems. You will be responsible for designing, implementing, and maintaining scalable infrastructure solutions to support our growing customer base. This is an exciting opportunity to work in a fast-paced environment and contribute to the success of a company bringing a Google-inspired authorization system to companies around the globe.

Responsibilities:

Design, implement, and maintain highly available and scalable infrastructure solutions for our projects, products, and customers.
Monitor and analyze system performance, identifying and resolving bottlenecks and issues to ensure optimal performance and reliability.
Automate infrastructure deployment and configuration management processes.
Continuously improve system reliability, security, and efficiency through proactive monitoring, capacity planning, and performance tuning.
Troubleshoot and resolve complex infrastructure and application issues in production and test environments.
Collaborate with software engineering teams to design and implement systems that are resilient, scalable, and secure.
Participate in on-call rotation and respond to production incidents in a timely manner.
Document system configurations, troubleshooting procedures, and operational guidelines.

Requirements:

Proven experience as a Site Reliability Engineer or in a similar role.
Strong understanding of networking, operating systems, and cloud infrastructure.
Experience with Site Reliability Engineering, System Design, and Distributed Computing.
Experience in various programming languages — we currently have SDKs for NodeJS, Java, Python, Ruby, and Go.
Experience with containerization technologies such as Docker and Kubernetes.
Knowledge of infrastructure-as-code tools like Terraform and Pulumi.
Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
Experience with lower-level implementation details of relational databases (bonus if you have experience with distributed SQL databases like Google Cloud Spanner or CockroachDB).
Experience working with Git and GitHub.
Experience with continuous integration and deployment systems.
Strong problem-solving and troubleshooting skills.
Excellent communication and collaboration abilities.

Technology

Given our background, we build upon a foundation of using open source, cloud-native solutions to deliver our products.

We've given some webinars discussing parts of our stack:

Here are some keywords:

Go
TypeScript
Kubernetes
Kubernetes Operators
NextJS
Pulumi
CockroachDB
Cloud Spanner
PostgreSQL
Prometheus
Thanos
ArgoCD
