Cua is building the infrastructure that lets general AI agents safely and scalably use computers and apps the way humans do.

With 9k+ GitHub stars in just 4 months and a seed round closed, we’re providing:

An open-source framework for building and evaluating general-purpose AI agents.
A cloud container platform for sandboxed, scalable agent execution environments.
A research-backed blueprint for what production-grade general agent systems should look like.

Overview

Cua is building the infrastructure that enables general-purpose AI agents to safely and scalably use real computers and applications.

We’re a small team backed by Y Combinator and top-tier investors, and our open-source tools are already used by thousands of developers. As a Research Intern, you’ll help prototype, test, and benchmark multi-modal LLM-based agents, working across the stack from data pipelines to orchestration systems.

You’ll collaborate with engineers and researchers to turn cutting-edge ideas into real systems and benchmarks that can be shared with the community. This is a chance to contribute to open-source research, design experiments, and explore the frontiers of agentic AI.

Responsibilities

Generate and curate large-scale, high-quality multi-modal data (GUIs, browsers, system UIs)
Design and test single- and multi-agent systems for data generation and computer use
Automate benchmarking of agent orchestration (with or without human-in-the-loop)
Explore new training and inference techniques to boost reasoning and action-taking (e.g., RL-based agents)
Develop benchmarks, tools, and datasets to evaluate agentic capabilities on Cua
Collaborate with the founding team and contribute to research publications, open-source tools, and the broader community

Qualifications

Required:

Currently a PhD student in Computer Science or related field (strong Master’s considered)
Experience in applied research with a solid publication record
Familiarity with modern multi-modal or reasoning agents (e.g., OS-Atlas, Qwen, GUI-R1)
Hands-on experience with PyTorch, Python, and cloud compute (AWS, GCP, etc.)
Comfortable designing experiments, evaluating models, and working with multi-modal data
Excited by generative AI, agent systems, and pushing the boundaries of what’s possible

Preferred:

Experience with reinforcement learning or agent-based training methods
Prior contributions to open-source projects or benchmark design
Familiarity with large-scale dataset construction and evaluation pipelines
Interest in bridging research and engineering for real-world applications
Based in, or able to spend time in, the SF Bay Area (remote is OK)

What We Offer

Research impact – Opportunity to publish, open-source, and influence open agent research
Hands-on projects – Work directly with engineers and researchers on cutting-edge systems
Open-source visibility – Contribute benchmarks and datasets used by the community
Flexible setup – Remote-friendly; SF-based team
Learning environment – Collaborate on projects at the intersection of infrastructure and AI research

How to Apply

Please include:

Your CV and GitHub/portfolio
A short note on a research problem you’d like to tackle
Bonus: try building something with Cua or suggest a benchmark idea; we notice contributors

This is a paid internship (3-month full-time preferred; part-time considered). Compensation will depend on location and experience.

Cua AI, Inc. is committed to fair and transparent opportunities. We encourage applicants from all backgrounds, identities, and walks of life to apply.

Personal data will be handled in accordance with the GDPR (EU Regulation 2016/679) and other applicable data privacy laws.

We’re hiring for several roles to help us push this vision forward by turning cutting-edge research prototypes into real, deployable systems.

If you’re obsessed with developer tools, infrastructure, and taking AI agents from toy demos to robust, real-world tools, we want to talk.