Machine Learning Engineer at Beam (W22)
$120K - $200K  •  0.20% - 0.75%
Ultrafast Inference for AI Products
New York, NY, US / Remote
Full-time
Will sponsor
1+ years
About Beam

Beam is a tool to quickly build machine learning-powered applications. Our platform helps developers run their code on serverless GPUs, deploy highly performant APIs, and rapidly prototype ML models — without managing any infrastructure.

Machine learning is eating software, but it’s still difficult for developers to leverage ML in their products. Today, companies are spending months building their own ML platforms, or relying on outdated tools that were originally designed for academics.

We believe that for ML to reach widespread adoption, the underlying infrastructure needs to be hidden from the user. We're building the fastest way for developers to go from an ML prototype to a production service.

About the role
Skills: Torch/PyTorch, GPU Programming, CUDA

Beam is an ultrafast AI inference platform. We built a serverless runtime that launches GPU-backed containers in less than 1 second and quickly scales out to thousands of GPUs. Developers use our platform to serve apps to millions of users around the globe. We're backed by Y Combinator, Tiger Global, and prominent developer-tool founders, including the founder of Snyk and the former CTO of GitHub.

Our team works in-person in New York City, but we welcome remote applicants who are exceptionally qualified.

In this role, you'll optimize inference performance for a wide range of models running on our platform. You will minimize latency, maximize throughput, and continuously experiment to achieve industry-leading performance.

Your work will directly impact millions of users worldwide.

Skills & Experience

  • Experience with the state-of-the-art inference stack (e.g., PyTorch, TensorRT, vLLM)
  • Familiarity with modern AI workflows, like ComfyUI and LoRA adapters for fine-tuning
  • Deep understanding of model compilation, quantization, and serving architectures
  • Familiarity with GPU architectures and comfort diving into kernel-level optimizations to resolve performance bottlenecks
  • Experience programming with CUDA, Triton, or similar low-level accelerator frameworks

Benefits

  • Work on challenging and impactful engineering problems
  • Competitive salary and meaningful equity
  • Join a fast-growing pre-Series A company at the ground floor
  • Health, dental, and vision benefits with 90% coverage for employees and 50% for dependents
  • Opportunities to participate in events across the cloud-native and AI communities
  • Fitness stipend, learning budget, and much more
Technology

Our infra code is mostly in Go, but our backend APIs are in Python. We work extremely closely with our customers – they’re all in a communal Slack channel with us, so it’s important that you’re interested in interacting with them (luckily, our customers are all developers).

Other jobs at Beam

Full stack  •  Full-time  •  New York, NY, US / Remote  •  $72.5K - $155K  •  0.20% - 0.70%  •  1+ years
Backend  •  Full-time  •  New York, NY, US / Remote  •  $173K - $202K  •  0.25% - 1.00%  •  3+ years
Full stack  •  Full-time  •  New York, NY, US  •  $106K - $166K  •  0.20% - 0.90%  •  1+ years
Backend  •  Full-time  •  New York, NY, US  •  $103K - $155K  •  0.25% - 1.00%  •  Any (new grads ok)
Machine learning  •  Full-time  •  New York, NY, US / Remote  •  $120K - $200K  •  0.20% - 0.75%  •  1+ years