How can LLMs learn to ‘read the room’?
Posted on Friday 27 February 2026
Have you ever tried to explain a complex technical problem to a colleague, only to realise halfway through that their eyes have glazed over? You probably stopped, adjusted your tone, skipped the jargon, and tried a different analogy. You "read the room". As we integrate more Artificial Intelligence (AI) and robots into our workplaces, from construction sites to remote agricultural settings, we face a similar challenge. These systems generate complex plans to achieve their goals, but if they can’t explain why they are doing what they are doing, we can’t fully trust them.
As a research associate at the Institute for Safety Autonomy, I work on the task planning problem for multi-robot, multi-human systems. In simple terms: how to split up a group of tasks and decide who does what and when, so the whole team can finish the mission successfully. Across my research projects, this includes:
- Writing mission requirements in formal languages that avoid the ambiguity inherent in natural language.
- Leveraging multiple AI planners and probabilistic model checking tools to generate correct, verified task plans for the agents to follow.
- Developing frameworks to manage changes in the mission and its requirements at runtime.
- Explaining the rationale behind the automatically generated task plans to human operators.
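To make the "who does what and when" part concrete, here is a minimal greedy allocation sketch. It is an illustration only, with invented task and agent names, and is far simpler than the verified planners used in the research:

```python
# Toy greedy task allocation: assign each task to the capable agent
# with the least accumulated busy time. Illustrative only -- not the
# verified planning pipeline described in the post.

def allocate(tasks, agents):
    """tasks: list of (name, duration, required_skill);
    agents: dict of agent_name -> set of skills.
    Returns dict agent_name -> ordered list of assigned task names."""
    load = {a: 0 for a in agents}        # accumulated busy time per agent
    schedule = {a: [] for a in agents}
    for name, duration, skill in tasks:
        # candidates are the agents whose skill set covers the task
        capable = [a for a in agents if skill in agents[a]]
        if not capable:
            raise ValueError(f"no agent can perform {name}")
        chosen = min(capable, key=lambda a: load[a])  # least-loaded wins
        schedule[chosen].append(name)
        load[chosen] += duration
    return schedule

tasks = [("lay_foundation", 4, "heavy_lift"),
         ("install_wiring", 3, "electrical"),
         ("inspect_site", 1, "inspection"),
         ("pour_concrete", 2, "heavy_lift")]
agents = {"robot1": {"heavy_lift"},
          "robot2": {"heavy_lift", "inspection"},
          "electrician": {"electrical"}}
plan = allocate(tasks, agents)
```

Real missions also carry ordering constraints, deadlines, and probabilities of failure, which is why the actual work leans on formal specifications and model checking rather than a greedy loop.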
In my most recent work, together with fellow researchers from CfAA and AI4Work, we used Generative AI to produce smart explanations for complex task planning problems. Much of the reliability of the generated text depends on effective prompt engineering (the art of asking the right question), but we can't expect every user to be a prompt engineer. We needed a system that does it for them. We developed a proof-of-concept approach called COMPASS (COgnitive Modelling for Prompt Automated Synthesis), a translator that sits between the user and the AI planner.
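The translator idea can be sketched as a function that builds the prompt on the user's behalf. Everything below (the role profiles, the template wording) is a hypothetical illustration of the pattern, not the actual COMPASS internals:

```python
# Hypothetical prompt-synthesis sketch: the user asks a plain
# question, and a profile-driven template turns it into a structured
# prompt for the LLM. Profiles and wording are invented for
# illustration; this is not the real COMPASS implementation.

ROLE_PROFILES = {
    "ai_expert": {
        "focus": "mission success probability and Pareto trade-offs",
        "style": "technical and quantitative; jargon is acceptable",
    },
    "site_manager": {
        "focus": "costs, resources, and deadlines",
        "style": "concise summary with concrete numbers",
    },
    "worker": {
        "focus": "the next steps assigned to this person",
        "style": "plain language, short numbered instructions, no jargon",
    },
}

def synthesise_prompt(role, question, plan_summary):
    """Build a role-tailored prompt from a plain user question."""
    profile = ROLE_PROFILES[role]
    return (
        "You are explaining a robot task plan.\n"
        f"Audience focus: {profile['focus']}.\n"
        f"Style: {profile['style']}.\n"
        f"Plan: {plan_summary}\n"
        f"Question: {question}"
    )

prompt = synthesise_prompt(
    "worker", "What do I do next?",
    "robot1 lays foundations; you install wiring at bay 3")
```

The point of the pattern is that the user only supplies the question; the system supplies the engineering.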
The core problem we are addressing is that a static prompt does not fit everyone. Imagine a busy construction site where robots are laying foundations while humans install electrical wiring and coordinate. You have a diverse group of people who need to understand the robot's plan. An AI Expert wants to know about the mission success probability and "Pareto optimality" (a trade-off point, when balancing several goals at once, where improving one goal necessarily worsens another). The Site Manager cares about costs, resources, and deadlines. The Worker just needs clear, step-by-step instructions on what to do next. If the system gives a simple summary to the expert, it might hide critical risks; if it buries the worker in technical detail, the instructions get lost. Furthermore, a person's ability to absorb information changes throughout the day. An explanation that makes sense when you are fresh at 9 AM might be too complex when you are tired and distracted at 6 PM. Hence, COMPASS also incorporates a cognitive model to estimate each user's current level of attention and understanding.
At the Centre, our vision is to assure the safety and reliability of autonomous systems. But safety is not just about the robot avoiding collisions; it is about the human operator correctly understanding the robot's intent. If a site manager misunderstands a safety-critical plan because the explanation was too jargon-heavy, that is a failure of the system. However, flexibility brings a new question: how do we ensure these explanations are accurate, faithful, and trustworthy?
Emerging evaluation strategies, such as using LLMs-as-a-judge, can help assess faithfulness and consistency before explanations reach human operators. However, many of these metrics rely on a reference text for comparison. This raises a challenge: how do we evaluate a new explanation generated at runtime for a task plan that has never been seen before? Additionally, explanations can be generated at multiple levels of the automated task planning pipeline, from clarifying the overall goals in the problem definition, to justifying why Plan A was chosen over Plan B.
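When no reference text exists, one pragmatic stopgap is a reference-free consistency check: verify that every agent-task pairing the explanation asserts actually appears in the plan it claims to explain. A toy version, with an invented plan format and deliberately naive string matching, might look like this:

```python
# Toy reference-free faithfulness check: flag any (agent, task) pair
# that the explanation mentions in the same sentence but that the
# plan does not actually contain. The plan format and the naive
# substring matching are illustrative assumptions only.

def unsupported_claims(plan, explanation):
    """plan: dict agent -> set of tasks assigned to that agent.
    Returns a list of (agent, task) pairs asserted by the text
    but absent from the plan."""
    all_tasks = set().union(*plan.values())
    flagged = []
    for sentence in explanation.split("."):
        agents = [a for a in plan if a in sentence]
        tasks = [t for t in all_tasks if t in sentence]
        for agent in agents:
            for task in tasks:
                if task not in plan[agent]:
                    flagged.append((agent, task))
    return flagged

plan = {"robot1": {"lay_foundation"},
        "electrician": {"install_wiring"}}
good = "robot1 will lay_foundation first. Then electrician will install_wiring."
bad = "robot1 will install_wiring first."
ok = unsupported_claims(plan, good)       # no unsupported pairings
hallucinated = unsupported_claims(plan, bad)
```

A real evaluation would need semantic matching rather than substrings, but the shape of the check is the same: ground every claim in the artefact being explained, rather than in a reference answer that may not exist.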
We are still learning, of course. But the path forward is clearer: for AI to be a true teammate, it must ensure that the correct unbiased information reaches the right person in the right way, while adapting to every teammate’s “language”.
Interested in Dr Vazquez's work?