Machine Learning Engineer at Beam (W22)

$120K - $200K • 0.20% - 0.75%

Ultrafast Inference for AI Products

New York, NY, US / Remote

Full-time

Will sponsor

1+ years

About Beam

Beam is a tool to quickly build machine learning-powered applications. Our platform helps developers run their code on serverless GPUs, deploy highly performant APIs, and rapidly prototype ML models — without managing any infrastructure.

Machine learning is eating software, but it’s still difficult for developers to leverage ML in their products. Today, companies are spending months building their own ML platforms, or relying on outdated tools that were originally designed for academics.

We believe that for ML to reach widespread adoption, the underlying infrastructure needs to be hidden from the user. We're building the fastest way for developers to go from an ML prototype to a production service.

About the role

Skills: Torch/PyTorch, GPU Programming, CUDA

Beam is an ultrafast AI inference platform. We built a serverless runtime that launches GPU-backed containers in less than 1 second and quickly scales out to thousands of GPUs. Developers use our platform to serve apps to millions of users around the globe. We're backed by Y Combinator, Tiger Global, and prominent developer-tool founders, including the founder of Snyk and the former CTO of GitHub.

Our team works in-person in New York City, but we welcome remote applicants who are exceptionally qualified.

About the Role

In this role, you'll optimize inference performance for a wide range of models running on our platform. You will minimize latency, maximize throughput, and continuously experiment to achieve industry-leading performance.

Your work will directly impact millions of users worldwide.

Skills & Experience

Experience with the state-of-the-art inference stack (e.g., PyTorch, TensorRT, vLLM)
Familiarity with modern AI workflows, such as ComfyUI and LoRA adapters for fine-tuning
Deep understanding of model compilation, quantization, and serving architectures
Familiarity with GPU architectures and comfort diving into kernel-level optimizations to resolve performance bottlenecks
Experience programming with CUDA, Triton, or similar low-level accelerator frameworks

Benefits

Work on challenging and impactful engineering problems
Competitive salary and meaningful equity
Join a fast-growing pre-Series A company at the ground floor
Health, dental, and vision benefits with 90% coverage for employees and 50% for dependents
Opportunities to participate in events across the cloud-native and AI communities
Fitness stipend, learning budget, and much more

Technology

Our infrastructure code is mostly in Go, and our backend APIs are in Python. We work extremely closely with our customers – they’re all in a communal Slack channel with us, so it’s important that you’re interested in interacting with them (luckily, our customers are all developers).
