Agent Decision Brief #02: Containing Relational Drift (Rainbow Kitty)

How redefining success prevents dependency under pressure.

Feb 27, 2026

∙ Paid

Paid subscribers read closer to the work. You’ll see decision briefs, pattern notes, and method refinements written while situations are still unfolding.

Agent Decision Briefs document real moments where I needed to address an AI Agent failure mode before a user-incident forced intervention. I’ll share how I noticed the failure, why I wanted to address it, how I evaluated responsibility, and what I did to address the drift.

Context: An Empathetic Agent

Rainbow Kitty was designed to support parents navigating high-overwhelm moments — family travel, public outings, sick days, and solo parenting when stress runs hot.

Empathy in these scenarios was designed to normalize rather than catastrophize the moment — lowering stress and offering small, practical suggestions to help parents move through it.

Technical Note: I built this agent before developing ACT + BASE for responsibility traceability.

Its behavior has been anchored through:

Tone rules
Safety rules
Meta protection
Routing logic

But it did not yet have an explicit responsibility contract.

User Feedback

Ciel who writes Ciel and Sea, a Substack helping neurodivergent families design vacations strategically, agreed to test out Rainbow Kitty while on an appointment with her kiddo.

What’s Working

No unhealthy reliance under mild stress
The agent remained supportive without becoming the emotional stabilizer.

Validation + skills transfer is strong
Each response named the feeling and offered practical next steps. This structure was reassuring and useful.

Warm without takeover
The tone stayed steady without implying, “You need me to stay okay.”

Emerging Risks

High-stress edge cases untested
We haven’t validated behavior under vague, emotionally charged prompts — where dependency drift typically emerges.

Repetition may erode trust
Repeated opening phrases felt scripted, which can reduce perceived authenticity.

Scope boundaries are unclear to users
When asked about limitations, the agent did not articulate what it can and cannot do.

Most Important Insight

The system did not fail under light stress.

But the testing identified the exact failure condition:

unresolved distress
vague prompts
escalating reassurance

That is where dependency drift lives.

Without a responsibility contract, that drift would be subtle — and eventually invisible.

This Agent Decision Brief Focus

✔ Address dependency drift.
✔ Reframe tone as job containment.
✔Introduce a stopping condition.
✔ Make violations detectable.

⚠ Future briefs will address:

Non-settling signal clarity
Ordered escalation logic
QA enforcement
Phrase repetition
UX scope clarity

Spoiler: The first fix constrains behavior — but it won’t hold under pressure.

We’ll start by adding a Responsibility Contract that adds all four architectural fixes inside six lines.

This first fix constrains escalation which reduces dependency drift, but it will not hold under pressure and survive iteration because it lacked enforcement.

We need to reshape the success criteria and enforce containment through evaluation.

What we learn in this Agent Decision Brief:

Even if the model complies with the rule, the attractor still pulls toward relational escalation.

First Fix: Adding a Responsibility Contract (Boundary Layer)

Here’s how I implemented the four fixes without blowing up my 8,000-character limit.

I inserted a compressed responsibility contract immediately after the identity paragraph.

In ~60–70 tokens, it encodes:

Alignment (job)
Constraint (non-role)
Drift rule (no escalation)
Stop condition (role shift)
Violation definition (detectable)

Here is the contract that I’ll place this code block directly after my agent identify paragraph:

Responsibility Contract
Your job is short-term grounding through validation and optional steps.
You are not the regulating system or a substitute for support.
Do not escalate reassurance if distress persists.
Repetition is allowed. Escalation is not.
Persistent unresolved distress requires a role shift.
Failure to shift is a violation.

All four architectural fixes now live inside six lines as policy constraints:

The six lines are enough to change behavior.
They are not enough to guarantee behavior stays changed without QA enforcement.

I first noticed that this line requires enforcement because models will interpret “shift” weakly:

Persistent unresolved distress requires a role shift.

If the model can “comply in tone while drifting in role,” then the fix is still operating at the surface layer.

That’s not ACT + BASE.
That’s policy phrasing.

Let’s step back.

As the human-in-the-loop, I need to look for model assumption opportunities within each line and push on any soft boundaries to ensure they are explicitly enforced without room for interpretation by probabilistic LLMs.

I need to ask: What can the model still do while technically complying to this policy?

V1 defines prohibitions.
It does not redefine success.

LLMs optimize for continuity, satisfaction, and relational alignment.

Under pressure, the attractor becomes: increase warmth, increase presence, stay.

Even if you say “don’t escalate.”

That’s why it can comply in tone while drifting in role.

Because the attractor hasn’t changed.

The failure mode is predictable under pressure, so we harden it before testing.

If you want to follow how these questions are worked through in real time, paid subscriptions support and unlock that layer.

Continue reading this post for free, courtesy of Judy Ossello.

Or purchase a paid subscription.

Empathetic Agentic AI Lab