Your mission
Our AI agent already handles a meaningful share of conversations between resellers and buyers. Our listing-generation pipeline runs at a controlled cost. But we're still far from where we want to be.
Your mission: take this foundation, push it past 90% of conversations automated and 95% of auto-generated listings accepted as-is, and cut the per-listing AI cost by 3×. In parallel, you'll build the internal agents that make every team faster.
What you'll have done in 12 months
- A production LLM pipeline that improves continuously, with retraining on our data at least once per quarter.
- AI cost per listing cut by 3×, measured and tracked week over week.
- Continuous eval in place: test datasets, regression detection, LLM observability (Langfuse or equivalent) on every critical flow.
- One autonomous agent shipped for each team (sales, support, ops, dev) by M+9.
- Customer support on recurring topics handled mostly without humans by M+12.
- A RAG knowledge base over our docs, conversations and code, available internally.
Your first 6 months
Month 1. You map the existing AI pipelines. You pull out a dataset of 1,000 conversations where the agent failed and extract the error typologies. You ship a first measurable improvement.
Month 3. Continuous eval is in place. You've shipped a first internal agent (likely for support). You've already done two targeted retrainings.
Month 6. You own the entire AI stack autonomously. You drive model choice, infra, cost and observability. You co-recruit the second AI hire.
Who we're looking for
- At least 4 to 5 years of experience, with both LLM and ML systems shipped to production (not just notebooks, not just hackathons). Precise metrics on your projects, with proof.
- You can tell us about four or five papers, posts or videos in AI that made an impression on you recently, and why. Concrete, not name-dropping.
- You know how to optimize a production LLM pipeline: context caching, prompt compression, model routing by difficulty, targeted fine-tuning, batch processing.
- You're fluent in LLM observability (Langfuse or equivalent), test datasets, continuous eval, regression detection.
- You can explain a complex LLM concept to a non-technical person in two minutes without jargon. You challenge with data, not with ego.
- You use Claude Code or Cursor heavily and review every output.
Nice to have. Prior experience with DPO or RLHF fine-tuning. RAG in production (not a weekend POC). A model trained from scratch on proprietary data.
This role isn't for you if
- You want a Research Scientist position. We don't do blue-sky R&D.
- You only work in Jupyter notebooks and you've never shipped to production.
- You're still stuck on 2023-era ChatGPT and you haven't tried Claude Code, Cursor or recent models.
- You refuse to share code or concrete projects.
Perks
- Salary €110K–€140K gross/year
- BSPCE 0.5% to 1.5%, 4-year vesting, 1-year cliff
- ControlResell house 30 min from Paris by RER
- Chef-prepared lunch and dinner on the days you're in
- Unlimited vacation
- MacBook Pro M-series and €2,000 hardware budget
- Claude Code Max subscription and premium tools for your stack
- Quarterly team trips
- 3 months in the US
- €5,000/year conference budget
- €20,000/year compute/LLM API budget
Process
- Call 1 with Lyes (30 min). Vision, mission, ability to explain.
- Call 2 with Nathan (60 min). Technical test on a real dataset: 24 hours before the call, we send you 10,000 conversations where the agent failed. During the call, you analyze them, surface the error typologies and propose a 30/60/90-day strategy.
- Call 3 with Lyes and Nathan (45 min). Debrief, package, questions.
- Dinner at the CR house.
- Formal offer within 48h.