At Mem0, we are charting new territory that will fundamentally reshape how AI systems understand and interact with users over time. Our proprietary memory engine allows AI models to dynamically build context, remember past interactions, and tailor their responses to each individual. This represents a seismic leap beyond the current stateless limitations of AI.
Role Summary:
Own the end-to-end lifecycle of memory features—from research to production. You’ll fine-tune models for extraction, updates, consolidation/forgetting, and conflict resolution; turn customer pain points into research hypotheses; implement and benchmark ideas from papers; and ship with Engineering at SOTA latency, reliability, and cost. You’ll also build evaluation at scale (offline metrics + online A/Bs) and close the loop with real-world feedback to continuously improve quality.
What You'll Do:
Fine-tune and train models for memory extraction, updates, consolidation/forgetting, and conflict resolution; iterate based on data and outcomes.
Read, reproduce, and implement research: quickly prototype paper ideas, benchmark against baselines, and productionize what wins.
Build evaluation at scale: automated relevance/accuracy/consistency metrics, gold sets, online A/B & interleaving, and clear dashboards.
Work closely with customers to uncover pain points, turn them into research hypotheses, and validate solutions through field trials.
Partner with Engineering to ship: design APIs and data contracts, plan safe rollouts, and maintain SOTA latency, reliability, and cost at scale.
Minimum Qualifications
Experience in RAG or information retrieval (retrieval, ranking, query understanding) for real products.
Model training/fine-tuning experience (LLMs/encoders) with a strong footing in experimental design and iteration.
Strong Python; deep experience with PyTorch and familiarity with vLLM and modern serving frameworks.
Built evaluation for complex vision-and-language tasks (gold sets, offline metrics, online tests).
Able to orchestrate data pipelines to run these models in production with low-latency SLAs (batch + streaming).
Clear, concise communication with stakeholders (engineering, product, GTM, and customers).
Nice to Have:
Publications at venues like CVPR, NeurIPS, ICML, ACL, etc.
Experience with privacy-preserving ML (redaction, differential privacy, data governance).
Deep familiarity with memory/retrieval literature or prior work on memory systems.
Expertise with embeddings, vector-DB internals, deduplication, and contradiction detection.
We use state-of-the-art GenAI technologies and are inventing novel algorithms that help us model information the way the human brain does.