Senior Deep Learning Engineer at NanoNets (W17)
₹40 - ₹65 INR
Automatic Data Extraction
Bengaluru, Karnataka, IN
Full-time
US citizen/visa only
3+ years
About NanoNets

Nanonets is automating document information extraction using AI. We are headquartered in San Francisco and backed by prestigious Bay Area investors such as Y Combinator, SV Angel, and Ashton Kutcher's Sound Ventures. We are currently profitable, growing at a fast pace, and looking to expand our team.

We are building a product that lets companies automatically extract key information from invoices, receipts, and virtually any other kind of document, and integrate it into their workflows to save manual work. We need to keep building features that let users automate millions of documents of different kinds every day, feed them to our AI for learning, and plug our API into external systems like Salesforce, QuickBooks, RPA providers, etc.

You should check it out at https://app.nanonets.com

About the role

Join Nanonets to push the boundaries of what's possible with deep learning. We're not just implementing models – we're setting new benchmarks in document AI, with our open-source models achieving nearly 1 million downloads on Hugging Face and recognition from global AI leaders.

Backed by $40M+ in total funding including our recent $29M Series B from Accel, alongside Elevation Capital and Y Combinator, we're scaling our deep learning capabilities to serve enterprise clients including Toyota, Boston Scientific, and Bill.com. You'll work on challenging problems at the intersection of computer vision, NLP, and generative AI.

What You'll Build

Core Technical Challenges:

  • Train & Fine-tune SOTA Architectures: Adapt and optimize transformer-based models, vision-language models, and custom architectures for document understanding at scale
  • Production ML Infrastructure: Design high-performance serving systems handling millions of requests daily using frameworks like TorchServe, Triton Inference Server, and vLLM
  • Agentic AI Systems: Build reasoning-capable OCR that goes beyond extraction – models that understand context, chain operations, and provide confidence-grounded outputs
  • Optimization at Scale: Implement quantization, distillation, and hardware acceleration techniques to achieve fast inference while maintaining accuracy (see the quantization sketch after this list)
  • Multi-modal Innovation: Tackle alignment challenges between vision and language models, reduce hallucinations, and improve cross-modal understanding using techniques like RLHF and PEFT
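
To give a flavor of the optimization bullet above, here is a minimal sketch of post-training dynamic quantization in PyTorch. The toy model, layer choices, and shapes are illustrative assumptions, not our production pipeline.

```python
# A minimal sketch (not Nanonets' production code): post-training dynamic
# quantization of the Linear layers in a toy feed-forward block.
import torch
import torch.nn as nn

class TinyDocEncoder(nn.Module):
    """Toy stand-in for a transformer feed-forward block."""
    def __init__(self, dim: int = 768, hidden: int = 3072):
        super().__init__()
        self.ff = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.ff(x)

model = TinyDocEncoder().eval()

# Weights become int8; activations are quantized on the fly at inference time.
# This typically shrinks the model and speeds up CPU inference at a small accuracy cost.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128, 768)
with torch.no_grad():
    print("max abs diff:", (model(x) - quantized(x)).abs().max().item())
```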

Engineering Responsibilities:

  • Design distributed training pipelines for models with billions of parameters using PyTorch FSDP/DeepSpeed (a simplified FSDP sketch follows this list)
  • Build comprehensive evaluation frameworks benchmarking against GPT-4V, Claude, and specialized document AI models
  • Implement A/B testing infrastructure for gradual model rollouts in production
  • Create reproducible training pipelines with experiment tracking 
  • Optimize inference costs through dynamic batching, model pruning, and selective computation
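
As a rough illustration of the distributed-training bullet, here is a heavily simplified FSDP sketch; the toy model, hyperparameters, and launch command are assumptions for illustration only.

```python
# A simplified sketch (illustrative only) of sharded data-parallel training with
# PyTorch FSDP. Launch with: torchrun --nproc_per_node=<num_gpus> fsdp_sketch.py
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    dist.init_process_group("nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model; in practice this would be a billion-parameter transformer.
    model = nn.Sequential(
        nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)
    ).cuda(local_rank)

    # FSDP shards parameters, gradients, and optimizer state across ranks,
    # which is what makes very large models trainable on a GPU cluster.
    model = FSDP(model)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):
        x = torch.randn(8, 1024, device=local_rank)
        loss = model(x).pow(2).mean()   # dummy objective
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```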

We’re on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity.

Technical Requirements

Must-Have:

  • 4+ years of hands-on deep learning experience with production deployments
  • Strong PyTorch expertise – ability to implement custom architectures, loss functions, and training loops from scratch (a toy example follows this list)
  • Experience with distributed training and large-scale model optimization
  • Proven track record of taking models from research to production
  • Solid understanding of transformer architectures, attention mechanisms, and modern training techniques
  • B.E./B.Tech from top-tier engineering colleges
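
By "from scratch" we mean the kind of fluency sketched below: a custom loss and a bare training loop. The loss, model, and synthetic data are placeholders, not anything we actually train.

```python
# A toy sketch of "from scratch" PyTorch: a custom loss and a bare training loop.
import torch
import torch.nn as nn

def confidence_weighted_smooth_l1(pred, target, conf, beta: float = 1.0):
    """Smooth-L1 regression loss weighted by a per-sample confidence."""
    diff = (pred - target).abs()
    loss = torch.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta)
    return (conf * loss).mean()

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

for step in range(100):
    x = torch.randn(32, 16)
    y = x.sum(dim=1, keepdim=True)   # synthetic regression target
    conf = torch.rand(32, 1)         # stand-in confidence weights
    loss = confidence_weighted_smooth_l1(model(x), y, conf)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```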

Highly Valued:

  • Experience with model serving frameworks (TorchServe, Triton, Ray Serve, vLLM)
  • Knowledge of efficient inference techniques (ONNX, TensorRT, quantization)
  • Contributions to open-source ML projects
  • Experience with vision-language models and document understanding
  • Familiarity with LLM fine-tuning techniques (LoRA, QLoRA, PEFT); a minimal LoRA sketch follows this list
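
For the LoRA/PEFT point, a minimal Hugging Face peft sketch looks roughly like this; the base model ID and hyperparameters are illustrative assumptions, not our actual setup.

```python
# A minimal sketch (illustrative only) of attaching LoRA adapters with Hugging Face peft.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_id = "Qwen/Qwen2.5-0.5B"  # assumed small base model for illustration

model = AutoModelForCausalLM.from_pretrained(base_id)

lora_cfg = LoraConfig(
    r=16,                                  # rank of the low-rank update matrices
    lora_alpha=32,                         # scaling factor for the update
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Freezes the base weights and injects trainable low-rank adapters, so only a
# small fraction of parameters is updated during fine-tuning.
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()
```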

Why This Role is Exceptional

  • Proven Impact: Our models are approaching 1 million downloads – your work will have global reach
  • Real Scale: Your models will process millions of documents daily for Fortune 500 companies
  • Well-Funded Innovation: $40M+ in funding means significant GPU resources and freedom to experiment
  • Open Source Leadership: Publish your work and contribute to models already trusted by nearly a million developers
  • Research-Driven Culture: Regular paper reading sessions, collaboration with research community
  • Rapid Growth: Strong financial backing and Series B momentum mean ambitious projects and fast career progression

Our Recent Achievements

  • Nanonets-OCR model: ~1 million downloads on Hugging Face – one of the most adopted document AI models globally
  • Launched industry-first Automation Benchmark defining new standards for AI reliability
  • Published research recognized by leading AI researchers
  • Built agentic OCR systems that reason and adapt, not just extract
  • Secured $40M+ in total funding from Accel, Elevation Capital, and Y Combinator

Technology

Some of the interesting things our backend team has shipped

  • Compiling Python code into C that can be imported into Golang and shipped as a binary for on-premise systems
  • Autoscaling GPU-dependent services on Kubernetes using a custom metric
  • Displaying machine learning metrics in simplified ways to end users so they can act on those metrics
  • Building a large number and variety of integrations behind a relatively generic interface, e.g. Salesforce, QuickBooks, RPA providers, and external databases
  • Processing large numbers of files in a highly distributed manner in Golang

Some of the interesting things our frontend team has shipped

  • Ability for users to annotate documents so the AI can learn which fields to extract
  • Displaying machine learning metrics in simplified ways to end users so they can act on those metrics
  • Letting users build complex visual workflows around our API in our product
  • Letting users visualize complex ML metrics in a simple and intuitive way

Our stack:

  • Databases
    • Cassandra DB
    • Postgres/MySQL
  • Backend
    • Golang for API and other microservices
    • Python for machine learning (TensorFlow, PyTorch)
  • Frontend
    • React, Typescript
    • Mobx
  • Cloud Providers
    • AWS
    • GCP for ML heavy workload
  • Monitoring/Alerting
    • ELK for logging
    • Prometheus for Monitoring
    • Grafana for dashboards
  • Orchestration
    • Kubernetes
  • DevOps
    • Jenkins for CI/CD

Interview Process

  1. Introduction Call with Founder (30-45 min)
  2. Deep Learning Knowledge Round - Technical discussion on ML concepts and architectures (60 min)
  3. Deep Learning Coding - Hands-on implementation challenge (take-home or live)
  4. In-Person Interview at Bangalore office - Meet the team and dive deeper into technical and cultural fit

