Maitai manages the LLM stack for enterprise companies, enabling the fastest and most reliable inference. The future of enterprise AI revolves around mosaics of small, domain-specific models powering responsive, capable agents, and Maitai is well positioned to capture that market. If you're looking to get in early with a company redefining how large companies build with AI, let's talk.
Join Maitai to reshape how enterprise companies build with open-source LLMs. You’ll be at the forefront, driving cutting-edge innovations in model fine-tuning, distillation, and automation to continuously enhance LLM performance. You’ll collaborate directly with founders, engineers, and enterprise customers, building the core management layer that defines enterprise AI infrastructure. We're scaling rapidly and looking for engineers who deeply understand open-source LLM ecosystems and can confidently automate and optimize model improvements at scale.
You will lead the fine-tuning, distillation, and deployment of open-source LLMs tailored for enterprise customers.
Maitai ensures LLMs never fail by optimizing for reliability, speed, and resilience. Acting as an intelligent proxy, we apply real-time autocorrections, route requests intelligently, and fine-tune models for maximum performance. We're experiencing explosive growth, are well-capitalized, and are seizing a massive opportunity to redefine how enterprises build with AI. Our platform delivers AI models that significantly outperform closed-source alternatives in speed and accuracy, supported by robust online guardrails. Leading YC startups and public enterprises trust Maitai to manage their LLM infrastructure.
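To give a flavor of the proxy pattern described above, here is a minimal sketch of an autocorrection loop in Python. It is an illustration under our own assumptions, not Maitai's actual implementation: `call_model` and `passes_guardrails` are hypothetical stand-ins for an upstream provider call and a real guardrail suite.

```python
import asyncio

async def call_model(prompt: str) -> str:
    # Hypothetical stand-in for a request to an upstream LLM provider.
    await asyncio.sleep(0)  # simulate network I/O
    return f"model output for {prompt!r}"

def passes_guardrails(response: str) -> bool:
    # Hypothetical stand-in for real-time output validation
    # (schema checks, safety filters, business rules, ...).
    return "output" in response

async def complete_with_autocorrect(prompt: str, max_retries: int = 2) -> str:
    """Call the model, validate the response, and re-prompt with a
    correction hint when a guardrail check fails."""
    response = await call_model(prompt)
    for _ in range(max_retries):
        if passes_guardrails(response):
            return response
        # Repair in-line instead of surfacing the failure to the caller.
        response = await call_model(
            prompt + "\n\nThe previous answer failed validation; fix it."
        )
    return response

if __name__ == "__main__":
    print(asyncio.run(complete_with_autocorrect("Summarize our uptime SLA.")))
```

The point of the pattern is that validation failures are repaired before the response reaches the caller, rather than turning into errors in the customer's product.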
As LLMs are core to our customers' products, resiliency and uptime are our top priorities. Since we act as a proxy, our uptime must exceed that of the providers themselves. We’re multi-cloud, multi-region, and built for seamless failover. Our infrastructure runs on Kubernetes, managed with Terraform, and deployed across AWS and GCP. We use GitHub Actions for CI/CD, with Datadog for monitoring, tracing, and performance insights.
Infra stack: Kubernetes, Terraform, AWS, GCP, GitHub Actions, PostgreSQL, Redis, Datadog.
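As a sketch of what provider failover can look like at the application layer, here is one way to try regions in priority order using httpx. The endpoints below are hypothetical, and in practice failover also happens at the DNS and load-balancer layers, not just in code.

```python
import httpx

# Hypothetical region-specific endpoints, tried in priority order.
PROVIDERS = [
    "https://inference.us-west.example.com/v1/complete",
    "https://inference.us-east.example.com/v1/complete",
]

async def complete_with_failover(payload: dict) -> dict:
    """Try each provider in priority order, failing over on timeouts,
    connection errors, and non-2xx responses."""
    last_error: Exception | None = None
    async with httpx.AsyncClient(timeout=10.0) as client:
        for url in PROVIDERS:
            try:
                resp = await client.post(url, json=payload)
                resp.raise_for_status()  # raises HTTPStatusError on 4xx/5xx
                return resp.json()
            except (httpx.TransportError, httpx.HTTPStatusError) as exc:
                last_error = exc  # record and fall through to the next region
    raise RuntimeError("all providers failed") from last_error
```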
Our backend is a set of Python microservices: Quart powers our web services, and separate Python jobs handle fine-tuning optimized for speed, cost, and accuracy. We use PostgreSQL for both conventional data persistence and vector storage. Go is being introduced where performance gains are critical.
Backend tech stack: Python (Quart), Go (in transition), PostgreSQL.
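For a feel of the backend style, here is a minimal Quart service with an async JSON endpoint. The route paths and response shape are illustrative assumptions, not our production API.

```python
from quart import Quart, jsonify, request

app = Quart(__name__)

@app.route("/v1/complete", methods=["POST"])
async def complete():
    # Handlers are native coroutines, so slow model calls don't block
    # the event loop.
    body = await request.get_json()
    prompt = body.get("prompt", "")
    # A real handler would dispatch to a fine-tuned model here.
    return jsonify({"prompt": prompt, "completion": "placeholder"})

@app.route("/healthz", methods=["GET"])
async def healthz():
    # Liveness probe endpoint for Kubernetes.
    return jsonify({"status": "ok"})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```

Because handlers are coroutines, a single worker can keep many slow upstream model calls in flight at once.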
Frontend tech stack: React (TypeScript).
Quick Chat (15-minute Video Call)
Let’s discuss your experience, interests, and ambitions.
Tech Discussion
Get on a call and talk tech: what's going on in the industry, what you've worked with recently, the latest model you've fine-tuned, the last meetup you attended, and so on.
Hands-On Technical
Join us at our office to work through a problem with our team.
In-person Meetup
Coffee or lunch with our team. Assess fit from both sides and move quickly to a decision.
Type | Location | Role | Salary | Equity | Experience
Full-time | Redwood City, CA, US | Full stack | $100K - $225K | 0.10% - 0.75% | Any (new grads ok)
Full-time | Redwood City, CA, US | Machine learning | $100K - $225K | 0.10% - 0.75% | Any (new grads ok)
Full-time | Redwood City, CA, US | Full stack | $150K - $225K | 1.00% - 4.00% | 3+ years
Full-time | Redwood City, CA, US | | $60K - $100K | 0.10% - 0.50% | Any (new grads ok)