Lab

Inside the Lab.

The Lab is where we de-risk the hard stuff — fast experiments on new models, agent patterns and evals, so the systems we ship to clients are already battle-tested. It’s the reason our pilots don’t die in a slide deck.

We’d rather break it here than break it on your customers.

Most AI fails in the gap between a convincing demo and a system people rely on every day. The Lab exists to close that gap before a project ever starts.

Every pattern we use in production — an agent, a retrieval pipeline, a voice flow — earned its place by surviving a deliberately hostile test here first. That’s what lets us be honest on the first call about what will and won’t work.

How it works

01

Spike

A throwaway prototype in days, not weeks. We get a rough version of the hardest part working first, so we’re arguing about something real instead of a slide.

02

Pressure-test

We build a task-specific eval set and try to break it — adversarial inputs, edge cases, the queries that embarrass a demo. If it can’t survive the Lab, it doesn’t leave.

03

Productionise

Only what passes gets hardened: monitoring, human-in-the-loop gates, fallbacks and documentation. The thing your team uses on Monday has already been through the wringer.

What’s on the bench

Model research

We benchmark and fine-tune frontier and in-house models — including our own Lux foundation model.

Agent prototyping

Rapid spikes on agent architectures, tool-use and human-in-the-loop patterns before they hit production.

Evals & safety

Task-specific eval suites and guardrails so we can prove a system works before you trust it.

Voice & realtime

Low-latency phone agents tuned for natural turn-taking and a Kiwi ear.

Retrieval

Hybrid search and citation pipelines that keep answers grounded in your sources.

On-device & on-prem

Capable models that run inside a customer’s own boundary, no data leaving the country.

Read what we’ve learned, or put us to work.

Dig into the research notes, browse the blog, or bring us a problem worth pressure-testing.

Talk to us