AI might not dream, but it does hallucinate: What wrong answers can tell you about choosing the right LLM
Sacha Cody, chief operating officer at BrandComms.AI, Forethought, addresses the pressing issue of AI “hallucinations”.
Who knew the Wachowskis’ dystopian sci-fi The Matrix would actually set the scene for the AI landscape some 25 years on?
As AI becomes an ever-present part of our daily lives, questioning and scepticism of its abilities – and more recently, of whether it all exists in a bubble – have become commonplace. But perhaps the most pressing issue is the phenomenon of AI “hallucinations.”
These are more than just the same cat passing by twice: they are instances where AI models generate outputs that sound plausible but are, at best, incorrect and, at worst, downright nonsensical. Understanding and mitigating AI hallucinations is crucial for ensuring the reliability and credibility of a brand’s or business’s AI-generated content.
For brands relying on AI to drive their marketing strategies, AI hallucinations can pose significant risks. Incorrect or misleading information can damage brand credibility, lead to ineffective campaigns, and erode consumer trust. Mitigating that risk means spending time upfront understanding what hallucinations are and what causes them.
Hallucinations occur when a GenAI model produces information that is not grounded in its training data or is logically inconsistent. This can happen due to the model’s tendency to generate responses based on patterns and probabilities rather than factual accuracy.
At their core, LLMs are designed to predict the next word in a sequence based on the context provided, and they are often optimised to provide responses that appear coherent and satisfactory to users. In their quest to “please” users, they may generate answers even when they lack sufficient information, leading to hallucinations.
Similarly, AI models trained on vast and varied datasets can sometimes overgeneralise, applying learned patterns inappropriately across different contexts. All of this is to say: without proper safeguards and high-quality training data, these models can generate outputs that are misleading or entirely false.
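To see why fluency and accuracy can pull apart, consider a deliberately simplified sketch. The probability table and prompt below are hypothetical and not drawn from any real model, but the sampling step mirrors what an LLM does at each word: pick a likely-sounding continuation, with no built-in check for truth.

```python
# A toy illustration (not any real model): an LLM-style generator picks the
# next word from a probability distribution learned from patterns in text.
# Nothing in this process checks whether the sentence it builds is true.
import random

# Hypothetical next-word probabilities for the prompt
# "The brand was founded in ..." -- every option sounds fluent, at most one is correct.
next_word_probs = {
    "1998": 0.40,
    "2003": 0.35,
    "Melbourne": 0.20,
    "tomorrow": 0.05,
}

def pick_next_word(probs: dict[str, float]) -> str:
    """Sample a continuation weighted by probability -- a stand-in for how a
    language model generates text, with no notion of factual accuracy."""
    words = list(probs)
    weights = list(probs.values())
    return random.choices(words, weights=weights, k=1)[0]

prompt = "The brand was founded in"
print(prompt, pick_next_word(next_word_probs))
# Every output reads smoothly; whether it is *correct* is a question the
# model never asks -- which is exactly how hallucinations slip through.
```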
So, what green flags should you look for to help mitigate hallucinations and other flaws in an AI product?
Custom content stores: Unlike generic AI models, an AI model that draws on a custom content store built from proprietary data sets relevant to your specific brand and industry helps ensure the generated content is both accurate and contextually appropriate (see the sketch after this list).
Rigorous testing and validation: GenAI platforms should implement extensive testing and validation processes to ensure that outputs meet high standards of accuracy and relevance. This includes pre-testing content to detect and correct any anomalies.
Expert oversight: Platforms should be monitored by a team of experts who provide human oversight through prompt engineering, quality assurance on input data, and output validation.
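As a rough illustration of the custom content store idea, here is a minimal sketch. The store, documents and keyword-matching retrieval are hypothetical placeholders; production systems typically use embeddings and a vector database, but the principle is the same: answer only from approved brand material, and decline when nothing relevant is found rather than letting the model fill the gap.

```python
# A minimal sketch of grounding generation in a custom content store.
# The store, documents and retrieval logic here are hypothetical placeholders.

# Hypothetical proprietary content store: approved, brand-specific facts.
CONTENT_STORE = [
    "Acme Co. was founded in 2003 in Melbourne.",
    "Acme Co.'s flagship product is the Acme Widget, launched in 2019.",
]

def retrieve(query: str, store: list[str]) -> list[str]:
    """Naive keyword retrieval: return documents sharing a word with the query."""
    query_words = set(query.lower().split())
    return [doc for doc in store if query_words & set(doc.lower().split())]

def grounded_answer(query: str) -> str:
    """Only answer when supporting material exists; otherwise decline,
    rather than let the model improvise and hallucinate."""
    sources = retrieve(query, CONTENT_STORE)
    if not sources:
        return "I don't have approved brand content to answer that."
    # In a real pipeline the retrieved sources would be passed to the LLM as
    # context; here we return them directly to show what grounds the answer.
    return " ".join(sources)

print(grounded_answer("When was Acme founded?"))
print(grounded_answer("What is Acme's share price today?"))
```

The second query deliberately has no match in the store, so the sketch declines instead of inventing an answer – the behaviour a well-designed content store, combined with output validation, is meant to enforce.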
Over the next year, the AI storm will continue to transform the marketing landscape, so understanding where platforms go wrong and fall short is essential to ensuring your brand is not only protected from GenAI’s pitfalls but also positioned to capitalise on it. Ensuring accurate and relevant outputs not only enhances marketing efficacy but also builds greater trust and confidence in AI technologies.