FCA research finds financial services consumers keen on AI support

Pilots demonstrate potential of LLMs, but signal need for human intervention and the importance of good design and testing.

The UK regulator has published the results of two pilot projects exploring the application of LLMs in consumer-facing financial services.

In the first project, researchers at the regulator asked OpenAI’s GPT-3.5 and GPT-4 to generate simplified definitions of complex financial terms, tailored to lower reading ages and supported by relevant examples. FCA experts found the LLM responses were of high quality. However, testing revealed significant divergence between human operators and automated tools in their ability to evaluate the output.

One of the key lessons the regulator drew from this exercise was that validating outputs would require “a robust evaluation framework that combines human judgment with automated tools”, and that such tools are not a “perfect substitute for testing with consumers.” These seem uncontroversial conclusions.

The second pilot project was more interesting as it involved testing LLM-generated guidance with consumers. According to the FCA, one of the key conclusions of the project was that “how LLMs are rolled out matters” (FCA’s italics).

Customer journey

In this pilot, a customer journey simulation for the selection of a savings account was created that included a very limited chatbot interface. The availability of the chatbot did not improve results: those who had access to the chatbot only were “worse at choosing the right savings account” than users with access to the accompanying Q&A alone or the Q&A in addition to the chatbot.

Engagement with the chatbot was also low, with the FCA admitting that this may have been a consequence of users not being able to pose their own questions, having instead to choose from a selection of pre-generated prompts.

The study did find that “exposing people to a LLM chatbot meant they were more likely to want to use AI to help them make financial decisions in the future”, which suggests that widespread adoption of such tools could swiftly follow once they become available.

Information gathering

While responses to the FCA’s questions to this group about potential chatbot usage suggested that consumers were “keen for some type of automated support”, they also confirmed that such support would be deemed most useful where it was perceived as aiding decision making, with value placed on more efficient gathering of information from diverse sources, effective product comparisons, and clear responses to complex questions. This is, of course, predicated on such tools delivering consistently error-free results that end users deem reliable.

According to the regulator, “there are important opportunities to provide effective and tailored support with AI”, but good product design and robust testing remain critical.

The FCA states that it “will be conducting pilots with firms of AI models,” which probably means that an exercise assessing consumer-facing AI usage by firms is either under way or imminent. Firms using AI, or planning to use it, in this context should prepare to field queries from the FCA on the subject in the near future.