Member of Technical Staff at Herdora (S25)
$180K - $230K  •  2.00% - 4.00%
Fast AI Inference
San Francisco, CA, US
Full-time
Will sponsor
Any (new grads ok)
About the role

Join our team to build the future of inference, GPU optimization, and AI infrastructure. You'll help define our technical direction and build the core systems that power our GPU optimization platform.

What You'll Do

  • Build scalable infrastructure for AI model training and inference
  • Lead technical decisions and architecture choices

What We Look For

Core Technical Expertise

  • GPU Fundamentals: Deep understanding of GPU architectures, CUDA programming, and parallel computing patterns.
  • Deep Learning Frameworks: Proficiency in PyTorch, TensorFlow, or JAX, particularly for GPU-accelerated workloads.
  • LLM/AI Knowledge: Strong grounding in large language models (training, fine-tuning, prompting, evaluation).
  • Systems Engineering: Proficiency in C++, Python, and possibly Rust/Go for building tooling around CUDA.

Ideal Background

  • Publications or open-source contributions in inference, GPU computing, or ML/AI for code are a plus.
  • Hands-on experience with large-scale experiments, benchmarking, and performance tuning.

Other jobs at Herdora

Full-time  •  San Francisco, CA, US  •  Machine learning  •  $180K - $230K  •  2.00% - 4.00%  •  Any (new grads ok)

Intern  •  San Francisco, CA, US  •  Machine learning  •  $6K - $10K / month  •  Any

Intern  •  San Francisco, CA, US  •  Machine learning  •  $6K - $10K / month  •  Any
