The Mission
We're building an AI-powered insurance brokerage that's transforming the $900 billion commercial insurance market by automating processes that currently run on pre-internet systems. Fresh off our $8M seed round, we're looking for an exceptional AI Context Engineer who can architect and develop durable, scalable data pipelines that power our AI systems with high-quality context.
We believe the best context leads to the best decisions and outcomes. You'll build data engineering pipelines that pull information from various sources and push it into memory stores for AI agents, while simultaneously feeding our data warehouses for analytics. While your primary focus will be on context engineering, you'll also have the opportunity to lean into building AI agents themselves and engage in prompt engineering. Your work will directly enable our agents to make better decisions through richer context, while providing our growth and data teams with the insights they need to drive campaigns and optimize performance.
We're committed to "Staying REAL" with our AI systems - building agents that are Reliable, Experience-focused, Accurate, and have Low latency. You will work directly with the CTO, our applied AI engineers, the CEO, growth team, and sales team to execute on our AI vision with a bias toward action. We live by core principles: "There is no try, there is just do," "Actions lead to information, always default to action," and "Strong opinions lead to information." We need engineers who build and ship, not just plan and strategize.
Outcomes You'll Drive
- Build durable, scalable data pipelines that pull information from diverse sources into context/memory stores for AI agents
- Design and implement event sourcing architecture with distributed systems to ensure data integrity and reliability
- Create data infrastructure that feeds both AI context systems and analytics data warehouses
- Partner with applied AI engineers to develop optimal context systems for agent performance
- Participate in AI agent development and prompt engineering to better understand context needs
- Collaborate with the CTO on architectural decisions for data and AI systems
- Partner with growth, sales, and data science teams to define the right data events and metrics for capturing high-quality data
- Develop integrations with PostHog, ClickHouse, Turntable, and potentially Snowflake
- Implement vector databases like Qdrant for efficient AI context retrieval
- Design data schemas and models that optimize for both AI agent context and analytics use cases
You're Our Person If
- You're passionate about building data infrastructure that powers AI systems
- You have deep expertise with distributed systems, event sourcing, and data pipeline architecture
- You have some experience with or interest in AI agent development and prompt engineering
- You understand CAP theorem tradeoffs and can make appropriate architectural decisions for data systems
- You have experience with ELT/ETL tools like Apache Airflow, Temporal, Airbyte, or N8N
- You're experienced with vector databases and embedding models for AI context
- You have worked with analytics warehouses like ClickHouse, Snowflake, or BigQuery
- You can balance technical excellence with pragmatic solutions that deliver business value
- You collaborate effectively with applied AI engineers and understand their context needs
- You ship features daily and take immediate action instead of overthinking
- You embrace "there is no try, there is just do" as your engineering mantra
Hard Requirements
- Strong experience with Python and data engineering frameworks
- Deep understanding of distributed systems principles, CAP theorem tradeoffs, and event sourcing architecture
- Experience designing and implementing data pipelines using tools like Apache Airflow, Temporal, Airbyte, or N8N
- Proven track record building production data systems that power AI/ML applications
- Experience with vector databases (like Qdrant, Pinecone, or Weaviate)
- Basic understanding of AI agent development and prompt engineering concepts
- Familiarity with analytics tools like PostHog, ClickHouse, and data warehousing solutions
- Knowledge of data modeling and schema design for both operational and analytical purposes
- Strong problem-solving skills and ability to work in a fast-paced startup environment
- Previous experience at an early-stage startup preferred
- Must be based in San Francisco and work in-office 5.5 days per week (relocation assistance provided)
Our Tech Stack
We're building a modern, AI-native data infrastructure to power our growth:
Data & AI Infrastructure:
- Event sourcing architecture with distributed systems design for reliability
- Apache Airflow, Temporal, Airbyte, and N8N for data pipeline orchestration
- Qdrant and other vector databases for AI context storage and retrieval (a minimal pipeline sketch follows this list)
- PostHog for product analytics and event tracking
- ClickHouse for high-performance analytics queries
- Turntable for data visualization and dashboarding
- Potential Snowflake integration for enterprise data warehousing
- Redis streams and PostgreSQL for operational data storage
- Logfire for comprehensive observability and analytics
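To make the shape of this concrete, here is a minimal sketch of one event-to-context hop: consuming events from a Redis stream, embedding them, and upserting the result into a Qdrant collection that agents can query. The stream name, collection name, event fields, and the embed() helper are illustrative assumptions, not details of our actual stack.

```python
# Sketch only: Redis stream -> embedding -> Qdrant context store.
import json
import uuid

import redis
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

EMBED_DIM = 384  # depends on the embedding model you choose

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
qdrant = QdrantClient(url="http://localhost:6333")


def embed(text: str) -> list[float]:
    """Placeholder for whatever embedding model is in use."""
    raise NotImplementedError


def ensure_collection() -> None:
    # Create the context collection once; cosine distance is a common default.
    if not qdrant.collection_exists("agent_context"):
        qdrant.create_collection(
            collection_name="agent_context",
            vectors_config=VectorParams(size=EMBED_DIM, distance=Distance.COSINE),
        )


def run_once(last_id: str = "0-0") -> str:
    """Read one batch from the 'events' stream and push it into the context store."""
    for _stream, entries in r.xread({"events": last_id}, count=100, block=1000):
        for entry_id, fields in entries:
            event = json.loads(fields["payload"])
            qdrant.upsert(
                collection_name="agent_context",
                points=[
                    PointStruct(
                        id=str(uuid.uuid4()),
                        vector=embed(event["summary"]),
                        payload={
                            "entity_id": event["entity_id"],
                            "type": event["type"],
                            "summary": event["summary"],
                        },
                    )
                ],
            )
            # A separate consumer group would ship the same event to the
            # analytics warehouse (e.g. ClickHouse) so both paths stay in sync.
            last_id = entry_id
    return last_id
```

In production this loop would run under an orchestrator (Airflow, Temporal, or a long-lived consumer) with consumer groups, retries, and dead-lettering; the sketch only shows the data flow.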
AI Integration:
- Context-aware AI agents powered by your data pipelines
- Claude (Anthropic), GPT-4.1 (OpenAI), and select open source models
- RAG systems utilizing the context data you provide (see the retrieval sketch below)
- Custom embedding models optimized for our insurance domain
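On the read side, a minimal retrieval sketch looks like the following: embed the agent's question, pull the closest context points from Qdrant, and format them for inclusion in a prompt. It reuses the illustrative "agent_context" collection and placeholder embed() helper from the pipeline sketch above.

```python
# Sketch only: query the context store and build a prompt-ready block.
from qdrant_client import QdrantClient

qdrant = QdrantClient(url="http://localhost:6333")


def embed(text: str) -> list[float]:
    """Same placeholder embedding helper as in the pipeline sketch above."""
    raise NotImplementedError


def build_context(question: str, top_k: int = 5) -> str:
    hits = qdrant.search(
        collection_name="agent_context",  # illustrative collection name
        query_vector=embed(question),     # must match the write-time model
        limit=top_k,
    )
    snippets = [
        f"- [{hit.payload.get('type', 'event')}] {hit.payload.get('summary', '')}"
        for hit in hits
    ]
    return "Relevant context:\n" + "\n".join(snippets)


# The returned block would be prepended to the system or user prompt sent to
# Claude or GPT-4.1 along with the agent's task instructions.
```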
What You'll Build in Your First 90 Days
First Month:
- Set up core data infrastructure and implement event sourcing architecture
- Build initial data pipelines connecting our primary data sources to AI context stores
- Implement basic vector storage for AI agent context
- Design schema and event tracking plan with the growth and data science teams
- Establish data quality monitoring and alerting
- Shadow applied AI engineers to understand their context needs for agents
Second Month:
- Expand data pipeline coverage to include additional sources and destinations
- Implement more sophisticated context retrieval systems for AI agents
- Build integrations with PostHog and ClickHouse for analytics use cases
- Work with applied AI engineers to optimize context for specific agent tasks
- Participate in basic prompt engineering to better understand context requirements
- Create automated testing for data pipelines to ensure reliability
Third Month:
- Optimize performance and reliability of data pipelines
- Implement advanced context features like temporal awareness and entity relationships
- Build dashboards and reporting tools for monitoring AI context quality
- Assist in development of simple AI agents leveraging your context systems
- Collaborate with applied AI engineers to tune context systems for agent performance
- Scale the data infrastructure to handle increasing volumes and velocity
Our Data Philosophy
- Context is King: The quality of AI decisions directly correlates with the quality of context available
- Event-Driven Architecture: Design systems that capture, process, and react to events for maximum data fidelity
- Single Source of Truth: Maintain consistency across operational and analytical data systems
- Data-Informed Growth: Enable growth and sales teams with the right metrics and insights
- CAP Theorem Understanding: Make intelligent tradeoffs between consistency, availability, and partition tolerance
- Lambda Architecture Approach: Combine event streaming for real-time processing with batch processing for complete analytics (a toy illustration follows this list)
- Action Orientation: Always default to action - ship code, gather data, and iterate rather than overthink or overplan
- Execution Focus: There is no try, there is just do - we value engineers who build and ship, not just plan and strategize
- Strong Opinions: Form and express clear viewpoints that can be tested against reality to generate valuable information
- Observable & Accountable: Ensure comprehensive monitoring of all data systems and pipelines
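The lambda-style point above boils down to a simple serving-layer merge: a complete batch view recomputed on a schedule, plus a real-time delta accumulated from the event stream since the last batch run. The toy below uses dict-backed stand-ins purely for illustration; metric names and numbers are made up.

```python
# Toy illustration of a lambda-style serving layer.

# Batch layer output (e.g. recomputed hourly in the warehouse).
batch_view: dict[str, int] = {"quotes_sent": 1200}

# Speed layer output: increments observed on the stream since the batch ran.
realtime_delta: dict[str, int] = {"quotes_sent": 37}


def serve_metric(name: str) -> int:
    """Batch view provides completeness; the delta provides freshness."""
    return batch_view.get(name, 0) + realtime_delta.get(name, 0)


print(serve_metric("quotes_sent"))  # 1237
```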
Join Us To Transform the $900B Insurance Market
This is an early-stage role at a fast-moving startup, and you'll often experience the crawl-walk-run approach to building. You'll quickly prototype data pipelines and then push them into production systems that can scale. We're looking for people who deliver impact first, then feed what they learn back into the system.
You should ideally have worked in an early-stage startup environment and understand the pacing. This is a fast-paced environment where we value ownership and rapid feedback loops within the team. You'll work directly with the CTO, our applied AI engineers, the CEO, growth team, and sales team to execute on our vision with a bias toward action.
We require you to be in San Francisco and work from our office 5.5 days per week. We'll cover relocation costs and believe the best teams collaborate intensively in person.
Skills
Python, Event Sourcing, Distributed Systems, CAP Theorem, Lambda Architecture, Data Engineering, Apache Airflow, Temporal, Airbyte, N8N, Qdrant, Vector Databases, PostHog, ClickHouse, Turntable, Snowflake, ETL/ELT, Data Modeling, Event-Driven Architecture, AI Agent Development, Prompt Engineering