How to hire your first ML engineer (without lighting cash on fire)

The first ML hire shapes every later one. Here's the framework we use with founders making this decision for the first time — what to look for, what to ignore, and how to avoid the three most common failure modes.

HeadHunted Editorial · Pre-publication draft

The first ML engineer at any company anchors everything that follows — culture, technical bar, hiring screen, even what kinds of customers you can credibly serve. Get it wrong and the next two hires are spent fixing the first one's decisions. Get it right and the trajectory compounds for years.

We've placed over 30 first-ML hires in Melbourne over the last four years. This is the framework we walk first-time hiring managers through.

Step one: decide what you actually need

We see three patterns. All are branded "ML engineer", but each requires a fundamentally different person.

Pattern A — Applied AI engineer. You have a product, you want to add an AI feature, you don't intend to train models from scratch. You need someone who can reach for OpenAI / Anthropic / Cohere APIs, build RAG, manage prompt eval, integrate guardrails. Often a strong full-stack engineer with 12 months of GenAI side-project experience is the right hire here, not a PhD.

Pattern B — ML systems engineer. You have data, you want predictions, you're going to train models on your own data. You need someone with classical ML breadth (regression, ranking, recommender systems, classification) and the operational chops to ship them — feature pipelines, batch + online inference, monitoring. This person bridges data engineering and ML.

Pattern C — Research-leaning ML engineer. You're solving a problem where state-of-the-art matters and the easy answers are taken. You need someone who reads papers, runs experiments, and ships. Rare profile, expensive, often comes from FAANG research orgs or PhD programs.

If you can't pick one, you don't yet have a clear enough problem statement to hire for. That's normal. Spend two weeks scoping before you spend three months recruiting.

Step two: write the job ad like a recruiter, not an engineer

Engineering-written ads are full of stack lists and required years. Recruiter-written ads describe the work, the impact, and the specific candidate profile. Three things every first-ML ad should include:

  1. What the model is being built for. Not "you'll work on ML" — "you'll be the technical owner of our churn prediction model, which currently drives our retention team's outreach list and saves an estimated $X / month."
  2. Who they'll work with. Solo? On a small team? Reporting to a CTO who codes? Reporting to a product manager who doesn't?
  3. The first six months. Concrete deliverables. "Ship a v1 model in production by month 3, deliver feature pipeline by month 4, add monitoring + retraining cadence by month 6."

Vague ads attract vague applicants. Specific ads attract people who match the specifics.

Step three: design a screen that selects on the right axis

Common mistake: importing the FAANG-style coding interview wholesale. It selects on raw algorithmic ability. Most first-ML hires don't fail on that axis — they fail on production engineering, ambiguity tolerance, or product sense.

Better screen for a first-ML hire:

  • 30-min call — discuss a project they shipped end-to-end. Probe on tradeoffs, not implementation details.
  • Take-home (90 minutes max, paid) — a small modelling exercise on synthetic data, evaluated on the writeup as much as the code.
  • Pair-programming session (60 mins) — extend their take-home in a direction they didn't anticipate. Tests how they handle scope change, not whether they memorised numpy.
  • Architecture conversation (45 mins) — present a real problem from your business, ask how they'd approach it, including data, model, infra, monitoring.

Skip whiteboard algorithm questions for this hire. They carry almost no role-relevant signal.

Step four: avoid the three classic failure modes

1. Hiring for the model you wish you had, not the data you actually have. If your data is messy spreadsheets, you need a senior data engineer who happens to do ML, not a research scientist.

2. Hiring a researcher when you needed a builder. Researchers excel when SOTA matters. If your problem is solvable with a well-tuned XGBoost model, a researcher will be bored, frustrated, and gone in 14 months.

3. Hiring on prestige. "Ex-FAANG" reads great in a press release but says nothing about whether the person can ship from zero. The best first-ML hires we've placed often have zero brand-name companies on their resume — they've shipped end-to-end at smaller orgs, where every decision was theirs.

Step five: know when to bring in a recruiter

Internal hiring works fine if (a) your network is rich in this profile, (b) you can write a sharp ad, and (c) you can move from first conversation to offer in under three weeks. If any of those isn't true, you'll lose your top candidates to faster-moving companies.

The economics of using a recruiter for a first-ML hire usually pencil out: a wrong hire costs you 6 to 12 months of lost opportunity, plus the salary you paid, plus the morale tax. A specialist recruiter's fee pays for itself the first time it saves you from a wrong hire.
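
To make that arithmetic concrete, here is a minimal Python sketch. Every figure in it (salary, months before the hire leaves, monthly opportunity cost, morale tax, recruiter fee) is a hypothetical placeholder, not a benchmark from our placements, so swap in your own numbers. The names `WrongHireCost` and `break_even_risk_reduction` are ours, made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class WrongHireCost:
    """Rough cost of a first-ML hire that doesn't work out (all inputs are assumptions)."""
    annual_salary: float             # what you pay the hire per year
    months_before_exit: int          # how long until you part ways (the 6 to 12 month window)
    monthly_opportunity_cost: float  # value of the work that didn't ship, per month
    morale_tax: float                # lump-sum estimate for team disruption and re-hiring effort

    def total(self) -> float:
        salary_paid = self.annual_salary * self.months_before_exit / 12
        opportunity = self.monthly_opportunity_cost * self.months_before_exit
        return salary_paid + opportunity + self.morale_tax


def break_even_risk_reduction(recruiter_fee: float, wrong_hire_cost: float) -> float:
    """Smallest drop in wrong-hire probability at which the fee pays for itself."""
    return recruiter_fee / wrong_hire_cost


# Hypothetical placeholder inputs: replace with your own estimates.
example = WrongHireCost(
    annual_salary=180_000,
    months_before_exit=9,
    monthly_opportunity_cost=20_000,
    morale_tax=15_000,
)
example_fee = 36_000  # hypothetical recruiter fee

if __name__ == "__main__":
    cost = example.total()
    needed = break_even_risk_reduction(example_fee, cost)
    print(f"Estimated cost of a wrong hire: {cost:,.0f}")
    print(f"Fee breaks even if wrong-hire risk drops by {needed:.1%}")
```

With these placeholder inputs, the wrong hire costs 330,000 and the fee breaks even if the recruiter lowers your chance of a wrong hire by roughly 11 percentage points. The point of the sketch is the shape of the comparison, not the specific figures.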

Tell us about the role — we'll either help you, point you to someone better, or honestly tell you it's a hire you should do internally.
