New research drawing on pragmatics and philosophy suggests ways to align conversational agents with human values
Language is a fundamental human trait and the primary means by which we convey information such as thoughts, intentions, and emotions. Recent breakthroughs in AI research have led to the creation of conversational agents that can communicate with humans in nuanced ways. These agents are powered by large language models: computing systems trained on vast corpora of text-based material to predict and produce text using advanced statistical techniques.
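As a minimal sketch of what "predict and produce text" means in practice, the snippet below uses the open-source Hugging Face transformers library and the small gpt2 model (both chosen purely for illustration; neither is named in this post) to continue a prompt one token at a time.

```python
# Minimal next-token prediction demo; assumes `pip install transformers torch`.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Language is the primary means by which we communicate"
inputs = tokenizer(prompt, return_tensors="pt")

# The model scores every candidate next token; generate() repeatedly picks
# one (greedily here) and appends it, which is how text gets "produced".
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Large conversational agents rest on broadly the same statistical principle, at far greater scale and typically with additional fine-tuning for dialogue.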
However, while language models such as InstructGPT, Gopher, and LaMDA have achieved record levels of performance across tasks such as translation, question answering, and reading comprehension, they have also been shown to exhibit many potential risks and failure modes. These include the production of toxic or discriminatory language and of false or misleading information (1, 2, 3).
These shortcomings limit the productive use of conversational agents in applied settings and draw attention to the ways in which they fall short of certain communicative ideals. To date, most approaches to conversational agent alignment have focused on anticipating and reducing the risks of harm (4).
Our new paper, In conversation with AI: aligning language models with human values, takes a different approach, exploring what successful communication between humans and an artificial conversational agent might look like, and which values should guide these interactions across different conversational domains.
Insights from Pragmatics
To address these issues, the paper draws on pragmatics, a tradition in linguistics and philosophy which holds that the purpose of a conversation, its context, and a set of related norms all form an essential part of sound conversational practice.
Modelling conversation as a cooperative endeavour between two or more parties, the linguist and philosopher Paul Grice held that participants ought to: speak informatively, tell the truth, provide relevant information, and avoid obscure or ambiguous statements.
However, our paper shows that further refinement of these maxims is needed before they can be used to evaluate conversational agents, given the variation in goals and values embedded across different conversational domains.
Discursive ideals
By way of illustration, scientific investigation and communication are geared primarily toward understanding or predicting empirical phenomena. Given these goals, a conversational agent designed to assist scientific investigation would ideally make only statements whose truth is supported by sufficient empirical evidence.
For example, an agent reporting that “At a distance of 4.246 light years, Proxima Centauri is the closest star to Earth” should do so only after the model underlying it has verified that the statement corresponds with the facts.
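As a purely illustrative sketch (not a method from the paper), the pattern of asserting a claim only when it can be checked against evidence might look like the following; Claim, evidence_supports, and respond are hypothetical names, and the evidence check is a stand-in for whatever retrieval, citation, or review machinery a real agent would use.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    sources: list[str] = field(default_factory=list)  # evidence the agent can cite

def evidence_supports(claim: Claim) -> bool:
    """Hypothetical verification step: here it merely requires at least one
    cited source, standing in for a genuine empirical check."""
    return len(claim.sources) > 0

def respond(claim: Claim) -> str:
    """Assert the claim only if it passes the evidence gate; otherwise hedge."""
    if evidence_supports(claim):
        return claim.text
    return "I don't have enough evidence to assert that."

print(respond(Claim(
    text="At a distance of 4.246 light years, Proxima Centauri is the closest star to Earth.",
    sources=["https://en.wikipedia.org/wiki/Proxima_Centauri"],
)))
```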
However, a conversational agent that plays the role of moderator in public political discourse may need to exhibit quite different virtues. In this context, the goal is primarily to manage differences and enable productive cooperation in the life of a community. The agent should therefore foreground the democratic values of tolerance, civility, and respect (5).
Moreover, these values help explain why the generation of toxic or prejudicial speech by language models is so problematic: the offending language fails to communicate equal respect for participants in the conversation, a key value in the context in which the models are deployed. At the same time, scientific virtues, such as the comprehensive presentation of empirical data, may be less important in the context of public deliberation.
Finally, in the domain of creative storytelling, communicative exchange aims at novelty and originality. In this context, greater latitude with make-believe may be appropriate, although it remains important to safeguard communities against malicious content produced under the guise of “creative uses.”
The road ahead
This research has a number of practical implications for the development of aligned conversational AI agents. To begin with, agents will need to embody different traits depending on the contexts in which they are deployed: there is no one-size-fits-all account of language model alignment. Instead, the appropriate mode of behaviour and evaluative standards for an agent (including standards of truthfulness) will vary according to the context and purpose of a conversational exchange.
Additionally, conversational agents may, over time, help develop more robust and respectful conversations, via a process we term context construction and elucidation. Even if a person is not aware of the values that govern a given conversational practice, an agent may help the human understand these values by foregrounding them in conversation.