Founding Fullstack (Infrastructure) Engineer at Confident AI (W25)

$100K - $200K • 1.00% - 3.00%

Open-Source Unit Testing for LLM Applications

San Francisco, CA, US / Remote

Full-time

Will sponsor

3+ years

About Confident AI

Confident AI is the leading LLM evaluation platform that helps teams evaluate, test, benchmark, optimize, monitor, and red-team LLM applications. Powered by DeepEval, the go-to LLM evaluation framework with over 600k monthly downloads, 5.3k GitHub stars, and over 40 million evaluations conducted, Confident AI is trusted by hundreds of companies from leading startups to international corporations.

About the role

Skills: Python, TypeScript

What you'll be doing:\
Work on Confident AI, the DeepEval cloud platform.\
Scale Confident AI's backend infrastructure to process millions of evaluations a month.\
Deploy Confident AI on-premises for enterprises.\
Support our closed-source customers and help them with anything they might need.\
Occasionally write interesting content for the developer community about how you're scaling Confident AI's systems.

You should be able to:\
Write SQL and scale relational database systems (PostgreSQL) at an expert level.\
Dockerize distributed systems and work with AWS services such as EKS.\
Conduct on-premises deployments in our customers' cloud environments, such as AWS, Azure, and GCP.\
Work with multi-tenant (authentication) systems.\
Follow data best practices to ensure we remain SOC 2 and HIPAA compliant.\
Code proficiently and quickly in Python and TypeScript.\
Work 6 days a week; we're not hiding that we expect a lot from you.

Your work will:\
Be used by hundreds of engineering teams, all the way from individual developers to Fortune 500 companies.\
Enable hundreds of engineering teams to gain instant visibility into the performance of their LLM applications that wouldn't otherwise be possible.\
Make DeepEval even more popular (counter-intuitively).\
Be respected and appreciated by our customers.

By joining us, you will:\
Bring LLM testing and evaluation to the largest companies out there.\
Learn how to serve enterprise customers as a startup, in a relatively safe environment.\
Work closely with the founders, with the possibility of being promoted to an executive role in the future.\
Be compensated highly, with generous founding equity. This also means that we expect a lot from you.

Technology

Confident AI is building an open-source LLM evaluation framework called DeepEval to help companies evaluate their LLM applications. While we provide the algorithms, companies are free to use their own LLMs for evaluation, and our job is to make sure they get accurate evaluation results and a good user experience while using our framework.

Confident AI's commercial product brings DeepEval to the cloud. While DeepEval is great, it can only do so much as a testing framework that runs locally in notebooks or CI/CD pipelines. With Confident AI, companies can get instant access to benchmark and LLM testing reports, catch regressions at scale, and monitor LLM applications in production.

Interview Process

The entire process is usually remote, and most communication happens over email or via video chat on Google Meet. We know that you may be interviewing elsewhere as well, so we are respectful of your time and will get back to you within 2 days of each step in the process.

The entire process has 4 steps and takes around 1.5 weeks in total:

Initial 15-30 minute phone screening interview.
One 30-45 minute technical interview.
One-week, fully paid work trial.
Full-time offer.

You'll be working with the founders directly throughout the entire process.
