Responsible AI

"Responsible AI" has become a phrase you can stamp onto a marketing deck without it meaning anything specific, which is why we have written down what it actually means to us – in concrete commitments we would be embarrassed to break. The reason this document is long and dense rather than a tidy list of slogans is that the slogans have proven too easy to ignore. We would rather write the boring, specific version than the one that wins design awards. This page describes how we make decisions when building agents, retrieval pipelines, voice systems, classification models, automation workflows and the integration code that wires them into a business. It is not a regulatory disclosure and it is not legal advice; it is a public statement of practice that we hold ourselves to, that you are entitled to hold us to, and that we will update as we learn.

1. A human signs off on anything consequential

We do not ship AI systems that take consequential action on a business's behalf without a human in the approval loop. The line we draw is straightforward: if a mistake costs money, damages a customer relationship, alters a clinical record, sends a legally binding message, transfers funds, or commits the organisation to anything, a person reviews and approves the action before it goes out. Reversible, low-stakes operations – classifying an inbound email, suggesting a draft reply, surfacing a related document, querying a public dataset – can run autonomously. The shape of "consequential" is decided at the start of every engagement and documented in the system's runbook, not left implicit. This is the principle we have least patience for cutting corners on, because the externalities of an unsupervised model getting it wrong land on the worker who is later blamed for not catching the error.
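As a rough illustration of how such a gate might look in code, here is a minimal sketch. The `RUNBOOK` table, the action names and the `dispatch` function are all hypothetical, invented for this example; a real deployment would take its tiers from the runbook agreed at the start of the engagement. The one design choice worth noting is that unknown action types are treated as consequential, so the gate fails closed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Risk(Enum):
    LOW = "low"                      # reversible, low-stakes: may run autonomously
    CONSEQUENTIAL = "consequential"  # needs a human sign-off before it goes out


# Hypothetical runbook table: action type -> risk tier agreed at scoping.
RUNBOOK = {
    "classify_email": Risk.LOW,
    "suggest_draft_reply": Risk.LOW,
    "send_contract": Risk.CONSEQUENTIAL,
    "transfer_funds": Risk.CONSEQUENTIAL,
}


@dataclass
class Action:
    kind: str
    payload: dict


def dispatch(action: Action, approved_by: Optional[str] = None) -> str:
    """Return 'executed' or 'pending_approval' for a proposed action."""
    # Action types missing from the runbook default to consequential.
    risk = RUNBOOK.get(action.kind, Risk.CONSEQUENTIAL)
    if risk is Risk.CONSEQUENTIAL and approved_by is None:
        return "pending_approval"
    return "executed"
```

In this sketch a funds transfer with no named approver stays in `pending_approval`, while an email classification executes immediately.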

2. Answers are cited, or they are refused

Every retrieval-augmented system we build is configured to refuse rather than confabulate when the underlying source material does not contain a clean answer. We instrument retrieval confidence at the chunk level, set per-deployment thresholds in collaboration with the client, and route low-confidence questions to a human or to a "we do not know" response. Every cited answer surfaces the actual source document and, where possible, the exact passage – not just a vague pointer to a folder. We do not consider it acceptable for an AI assistant to invent a policy clause, fabricate a case reference, manufacture a statistic, or paper over a knowledge gap with prose that sounds authoritative. When we discover a hallucination in a system we have built, that incident gets a post-mortem and the eval suite is updated to catch its shape going forward.
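The refuse-or-cite behaviour can be sketched as below. This is an illustration under assumptions, not a description of any particular deployment: the `Chunk` and `Answer` types, the 0-to-1 score scale, the refusal wording and the `answer_or_refuse` function are all hypothetical, and `draft` stands in for whatever answer the model produced from the retrieved passages.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    source: str    # document the passage came from
    passage: str   # the exact passage text shown to the user
    score: float   # retrieval confidence, assumed here to lie in [0, 1]


@dataclass
class Answer:
    text: str
    citations: List[Chunk]
    refused: bool


# Hypothetical refusal wording; real wording would be agreed per deployment.
REFUSAL_TEXT = "We do not know. This question has been passed to a person."


def answer_or_refuse(chunks: List[Chunk], threshold: float, draft: str) -> Answer:
    """Cite only chunks at or above the threshold; refuse if none qualify."""
    supported = [c for c in chunks if c.score >= threshold]
    if not supported:
        return Answer(REFUSAL_TEXT, [], refused=True)
    return Answer(draft, supported, refused=False)
```

The threshold is a parameter rather than a constant because it is set per deployment; raising it trades more refusals for fewer weakly supported answers.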

3. Your data stays under your control

Wherever architecture allows it, we run inference inside your own cloud account or against model endpoints under your own contract, so that the data never leaves a perimeter you control. Where we use shared model providers, we choose ones with contractually binding commitments not to train on customer inputs or outputs, and we surface those commitments to you in writing during scoping. We can produce a data-flow diagram on request that shows every byte of your information at rest and in transit, including where it is encrypted, who has key access, and how long it lives. We do not warehouse client data in shadow accounts and we do not "anonymise" it for our own use; the only patterns we keep from an engagement are generic ones that have no relationship to your business.

4. Evaluation is part of the build, not an afterthought

Every system we ship comes with a written evaluation suite covering the failure modes that would matter to you – not just the happy-path demos that look good in a slide. The suite includes adversarial inputs, edge cases drawn from real production logs, fairness checks where relevant to the use case, and a representative sample of cases the model is expected to refuse. We share the evaluation results with you at handover, including the cases we failed, and we set up automated re-runs on every model update so that regressions caused by an upstream provider changing their model are detected before they reach your users. Evaluation cost is a line item in our quotes; we will not strip it out to win price-sensitive deals, because doing so shifts the risk of a fragile system onto you.
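One way to picture the automated re-run is a comparison between the results recorded at handover and the results from the updated model. The sketch below is illustrative only: `run_suite`, `find_regressions` and the shape of the case table (case id mapped to a prompt and a check function) are assumptions made for this example, and the model is represented as a plain callable from prompt to text.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical case format: case id -> (prompt, check on the model's output).
Case = Tuple[str, Callable[[str], bool]]


def run_suite(model: Callable[[str], str], cases: Dict[str, Case]) -> Dict[str, bool]:
    """Run every case against the model and record pass/fail per case id."""
    return {
        case_id: bool(check(model(prompt)))
        for case_id, (prompt, check) in cases.items()
    }


def find_regressions(baseline: Dict[str, bool], current: Dict[str, bool]) -> List[str]:
    """Return sorted ids of cases that passed at baseline but fail now."""
    # A case missing from the current run counts as a failure.
    return sorted(
        case_id
        for case_id, passed in baseline.items()
        if passed and not current.get(case_id, False)
    )
```

A non-empty result from `find_regressions` would be the signal to hold the update back until the failing cases have been looked at.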

5. Te Tiriti o Waitangi is part of the brief

When a project touches te reo Māori, mātauranga Māori, or data about Māori communities, we engage with appropriate kaitiaki before, during and after the build – not as a tick-box review at the end. We respect data sovereignty as a substantive constraint, not a courtesy, which sometimes means a system's architecture differs from what we would build elsewhere, with data flows or storage locations chosen specifically to honour those commitments. We will name our limits honestly: we are not a Māori-led organisation and we do not pretend to authority we have not earned. Where a project needs guidance that sits outside our experience, we say so up front and bring in collaborators who can lead that conversation properly.

6. The user always knows when they are talking to AI

Every conversational system we deploy identifies itself as an AI within the first interaction. A voice agent answering a phone says so within the opening sentence. A chatbot's interface labels itself unambiguously and does not impersonate a named staff member. An email-drafting assistant does not sign messages from a person who did not write them. We do not build systems whose value depends on deception, and we will refuse work that requires it, even when the requesting party considers the deception harmless. The rule we hold to is simple: the user gets to know what they are interacting with, and our refusal to obscure that is not negotiable.

7. There are uses we will not build

We do not build autonomous weapons systems or targeting infrastructure for them. We do not build mass surveillance systems that collect or analyse identifying data on people who have not consented and are not subject to a court-supervised legal process. We do not build engagement-maximising systems whose business case relies on exploiting attention or amplifying addictive behaviours. We do not build social scoring systems that affect people's access to credit, housing, employment or services based on opaque models. We do not build deceptive marketing tooling. Some of these refusals will look obvious; others will cost us work that other shops are willing to do, and that is acceptable to us.

8. The productivity dividend belongs to the workers

The reason we are in this industry is to move New Zealand toward a shorter working week with no loss of income for the people doing the work, and every engagement we take is supposed to nudge the dial in that direction. When a system we build means a team needs fewer hours to do the same volume of work, our default recommendation is that those hours are given back to the team, not used to justify a headcount cut. This is a strong preference rather than a hard refusal – business circumstances are rarely simple – but we will not work on engagements whose explicit, written objective is to replace a workforce with software and pocket the difference. We are also happy to be questioned about how a specific build squares with this principle, and we expect you to push back if our answer is hand-wavy.

9. When we get it wrong

The commitments above are not aspirational. If you discover that a system we built or operated has broken one of them, email us at [email protected]. We will acknowledge the report within one New Zealand working day, investigate within a week, fix the underlying issue at our cost regardless of whose contractual responsibility it is, and publish a short public write-up at the bottom of this page describing what happened and what we changed. We are early enough in this work to make mistakes; we are not so early that we get to hide them.

10. Updates to this page

This is a living document. We expect to add principles, sharpen the wording on existing ones, and occasionally retire commitments that have been superseded by a stronger one. Every meaningful change updates the date at the top of this page; substantial rewrites get a short note in this section describing what shifted and why, so that the history of our thinking is visible rather than quietly edited away.