NU·WRIGHT

A chatbot trained for fluency is not, by default, trained for care. The defaults of a frontier model are confident, capable, and emotionally flat. Which is fine for code review and unhelpful for the moments that actually matter in advising, support, and clinical-adjacent work. relationalAI is the toolkit we wished we had: a small set of instruments for taking an off-the-shelf model and teaching it, carefully, to behave more like the practitioners whose work it is meant to support.

Fig. 1Five interlocking tools across two registers: three Builder, two Analyzer, with the LLM Evaluator threaded through.

1·Background

Project Evident commissioned the work after watching three pilot deployments of conversational AI in support contexts under-perform in identical ways. The bots were fluent. They knew their material. They were tonally wrong: too quick to solve, too eager to please, too prone to filling a silence the user needed.

The fix is not a better model. It is better scaffolding around the model, and a way of measuring whether the scaffolding is working.

2·Method

relationalAI ships five interlocking tools across two registers. Three Builder tools (a Prompt Builder, a RAG Package, and a Fine-Tuning data prep pipeline) graded from beginner to advanced so a team can adopt the depth that matches its capability. Two Analyzer tools (a Human Review surface and a Training Data exporter) close the loop, turning real conversations back into the corpus the model is taught from.

A custom LLM Evaluator sits alongside, scoring any conversation against a user-defined rubric on a 1–5 scale with reasoning and evidence quotes per dimension. The whole thing is open-source under MIT.

3·Findings

Across pilots, the practitioner-scoped pacing, sequencing, and repair protocols moved the bots from "polite but unhelpful" to "noticeably more like talking to one of us", by the assessment of the practitioners themselves. The artefact that mattered most was not the prompt; it was the 215 few-shot examples assembled across eight relational domains.

4·Status

The repository has just completed its pre-public hygiene pass. Lint clean, OSS scaffolding in, defensive patterns in the evaluator. A data-scientist client review is the next milestone, followed by public release. Open-sourcing partners and pilot organisations are welcome.

References

↗Project sitepe-rai.vercel.app