Founding Open-Source Growth Engineer at Confident AI (W25)
$100K - $200K  •  1.00% - 3.00%
Open-Source Unit Testing for LLM Applications
San Francisco, CA, US / Remote
Full-time
Will sponsor
6+ years
About Confident AI

Confident AI is the leading LLM evaluation platform that helps teams evaluate, test, benchmark, optimize, monitor, and red-team LLM applications. Powered by DeepEval, the go-to LLM evaluation framework with over 600k monthly downloads, 5.3k GitHub stars, and over 40 million evaluations conducted, Confident AI is trusted by hundreds of companies from leading startups to international corporations.

About the role
Skills: Growth design, Marketing design, Python

What is Confident AI?

We’re building 1) an open-source package called DeepEval to unit-test LLM applications such as chatbots, agents, and RAG pipelines, and 2) the cloud platform for DeepEval. 

Think of the relationship between Next.js and Vercel.

The founding team is a small group of exceptional engineers and researchers from top colleges and companies such as Google, Microsoft, and Princeton.

Our Values and Morals

Things we value:

  • Earnest and hardworking, the most important trait.
  • No excuses or BS—if something is wrong, surface it so someone can help.
  • Openness and transparency—hiding a problem won’t make it go away.
  • No politics, micromanagement, or bureaucracy, even in controversial discussions.
  • Autonomy, ownership, and responsibility—just as expected from any grown adult.
  • No ghosting—respect others’ time and effort.
  • Doers, not yappers; function over form. This means we're OK with remote work as long as you deliver.

Job Description

This role is 50% engineering and 50% developer marketing (the split may vary slightly depending on the time of the month). It is important that you understand the growth role isn’t a traditional SWE role where writing code is the majority of your day-to-day.

What you'll be doing:

  • Work on DeepEval (one of the most used packages for LLM evaluation in the world), covering both LLM evaluation features and LLM red-teaming features for DeepTeam.
  • Write high-quality content about what you've built, in the form of documentation and blog articles for the open-source community.
  • Post and repurpose content on platforms such as Reddit, Twitter, community articles, LinkedIn, and other relevant channels.
  • Define and measure growth, including what it means for a growth channel to be successful, and find new ways to increase DeepEval's and DeepTeam’s distribution.
  • Take charge of the open-source community on both Discord and GitHub.
  • Support integrations with other open-source projects and form relationships with them.
  • Support our open-source community with any questions and help they might need.

You should be someone who:

  • Enjoys writing and posting content on developer marketing channels.
  • Has a taste for good documentation and design, and constantly finds ways to improve how our packages are presented to the world.
  • Is already proficient in open-source contributions; your GitHub profile should be as green as a carpet.
  • Is a quick learner; you will be picking up SEO, GEO, and other existing strategies within the company.
  • Enjoys reading papers and has a natural curiosity for new research.
  • Codes extremely proficiently and quickly in Python and TypeScript.
  • Communicates well.
  • Works 6 days a week; we're not hiding that we expect a lot from you.

Your work will:

  • Be used by hundreds of thousands of open-source users, all the way from individual hobbyists to AI leaders at Fortune 500 companies to companies such as OpenAI and Google.
  • Educate hundreds of thousands of people who wouldn't otherwise know how to quality-assure their LLM applications.
  • Help grow DeepEval into the Next.js of evals.

By joining us, you will:

  • Shape the future of LLM testing and evaluation.
  • Learn how to build and run a startup in a relatively safe environment.
  • Work closely with the founders, with the possibility of being promoted to an executive role in the future.
  • Be compensated well, with generous founding equity. This also means that we expect a lot from you.
Technology

Confident AI is building an open-source LLM evaluation framework called DeepEval to help companies evaluate their LLM applications. We provide the evaluation algorithms, and companies are free to use their own LLMs to run them. Our job is to make sure they get accurate evaluation results and a good user experience while using our framework.

Confident AI's commercial product brings DeepEval to the cloud. While DeepEval is great, it can only do so much as a testing framework that runs locally in notebooks or CI/CD pipelines. With Confident AI, companies can get instant access to benchmark and LLM testing reports, catch regressions at scale, and monitor LLM applications in production.

Interview Process

Our Hiring Process

The entire process is usually remote, and most communication happens over email or via video chat in Google Meet. We know that you may be interviewing elsewhere as well, so we are respectful of your time and will get back to you within 2 days of each step in the process.

The entire process has 4 steps and takes around 1.5 weeks in total:

  • Initial 15-30 minute phone screening interview.
  • One 30-45 minute technical interview.
  • One-week, fully paid work trial.
  • Full-time offer.

You'll be working with the founders directly throughout the entire process.
