Hello,

This is Simon with the latest edition of The Weekly. In these updates, I share key AI-related stories from this week's news, list upcoming events, and share any longer-form articles posted on the website.

How much time are we spending checking that AI output is factual and accurate?

There is growing talk about a recent report by McKinsey which states that while 80% of companies now use generative AI, the same proportion say they have yet to see notable bottom-line results from those investments.

This observation is echoed by S&P Global, which suggests that 70–90% of enterprise AI initiatives fail to scale, and over 80% of AI projects never deliver their promised outcomes.

While the percentages are high, the overall sentiment is not surprising. From my time at Dataiku working with several insurance companies, I know there was a huge motivation to use generative AI, but it was matched with significant caution. In a highly regulated industry, getting anything wrong could lead to a major PR disaster and multi-million pound fines.

One of the main concerns was the accuracy of the output from generative AI solutions. It's well documented that these tools are prone to inventing information, which they present with complete confidence. This behaviour is known as 'hallucination'.

As an example, I gave a model the following prompt:

"when liverpool won the champions league final in 2024, who were the goal scorers?"

The response I received was factually incorrect:

Liverpool won the 2024 UEFA Champions League final with goals from Alexis Mac Allister and Cody Gakpo... The final score was Liverpool 2, Real Madrid 0.

Anyone who follows European football would instantly spot that Liverpool did not win the Champions League in 2024; therefore, the answer is an invention. Although I was complicit in this mistake by using a flawed prompt, it highlights how easily this could happen in a work setting where someone might ask a leading question.

This has been a real problem. In a now somewhat infamous case, two US attorneys and their firm were fined $5,000 after filing a legal brief that cited six non-existent court cases supplied by ChatGPT.

Methods for Reducing Hallucinations

Several methods can be used to reduce hallucinations, though none offer a perfect solution on their own:

  • Retrieval-Augmented Generation (RAG): This technique grounds the AI model by having it retrieve information from a pre-defined, trusted knowledge base—like your internal company documents—before generating an answer.

  • Fine-tuning: This process further trains a general-purpose model on your own specific data, adapting it to your company's context and terminology.

  • Grounding in Documentation: A simpler approach where the AI is instructed to base its answers only on the context provided in the documents supplied with the prompt.

  • Prompt Design: The process of carefully structuring the input question to guide the AI towards a more accurate and constrained response.
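
To make the retrieval and grounding ideas above a little more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a production approach: the DOCUMENTS dictionary, the helper names, and the simple word-overlap scoring are all hypothetical stand-ins (real RAG systems typically use embeddings and a vector store), and no actual model is called. It only shows how a prompt can be restricted to trusted context before it is sent to a model.

```python
import re

# Hypothetical in-memory "trusted knowledge base" (illustrative content only).
DOCUMENTS = {
    "claims-policy": "Claims must be reported within 30 days of the incident.",
    "refund-policy": "Refunds are issued within 14 days of a cancelled policy.",
    "contact": "The claims team can be reached on weekdays between 9am and 5pm.",
}


def tokenize(text):
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Return up to top_k document ids, ranked by words shared with the question.

    Word overlap is a deliberately naive stand-in for real retrieval.
    """
    question_words = tokenize(question)
    scored = []
    for doc_id, body in documents.items():
        overlap = len(question_words & tokenize(body))
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def build_grounded_prompt(question, documents, top_k=2):
    """Build a prompt that tells the model to answer only from retrieved context."""
    doc_ids = retrieve(question, documents, top_k)
    if doc_ids:
        context = "\n".join(f"[{d}] {documents[d]}" for d in doc_ids)
    else:
        context = "(no relevant documents found)"
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, reply \"I don't know\".\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_grounded_prompt("When must claims be reported?", DOCUMENTS, top_k=1))
```

The explicit "I don't know" instruction is the prompt-design part: it gives the model a way out instead of inviting it to accept a false premise, as in the Liverpool example earlier. It reduces the risk rather than removing it, so the output still needs checking.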

In reality, organisations use a combination of most of these methods, alongside manual fact-checking. It might be this very complexity, the significant effort required to create, manage, and verify these productionised systems, that is slowing down adoption and confidence. It raises the question of whether the return is worth the investment, at least for now.

What steps do you take to minimise hallucinations in your work? Do you have a process to fact-check the output?

A Message From Our Sponsor

The future of AI customer service is at Pioneer

There’s only one place where CS leaders at the cutting edge will gather to explore the incredible opportunities presented by AI Agents: Pioneer.

Pioneer is a summit for AI customer service leaders to come together and discuss the trends and trajectory of AI and customer service. You’ll hear from innovators at Anthropic, Toast, Rocket Money, Boston Consulting Group, and more—plus a special guest keynote delivered by Gary Vaynerchuk.

You’ll also get the chance to meet the team behind Fin, the #1 AI Agent for customer service. The whole team will be on site, from Intercom’s PhD AI engineers, to product executives and leaders, and the solutions engineers deploying Fin in the market.

Real World Use Case

Exclusive for subscribers.

In this new section, I’m going to bring you a real-world example of AI in use.

Curated News

Google & PayPal Team Up on AI-Powered Commerce

Google and PayPal announced a multi-year partnership to infuse AI into checkout, payouts, and fraud prevention across Google properties and PayPal rails. Expect tighter integration of PayPal services inside Google Cloud, Play and more.

Why it matters: Merchants could see higher conversion and fewer chargebacks without stitching together extra tools—AI is being baked into the payments stack many businesses already use.

Zoom’s AI Companion Gets Serious

At Zoomtopia, Zoom rolled out AI Companion 3.0 with cross-app notes, action tracking, scheduling help, and customer-facing upgrades—plus options to build workflow-specific agents inside Zoom Workplace.

Why it matters: The value isn’t just meeting summaries—it’s shrinking the gap between “we discussed it” and “it’s done,” which means fewer status emails and faster cycle times.

Monday.com Brings AI Agents into Daily Work Hubs

monday.com announced new AI-powered agents across its Work OS and CRM suite aimed at chasing tasks, keeping boards current, and coordinating campaigns—within a tool many teams already live in.

Why it matters: Adoption jumps when AI sits where work happens; embedded agents can improve data hygiene, reduce status meetings, and keep projects moving without adding another app.

Thanks for reading, and see you next Friday.

Simon,

Was this email shared with you? If so, subscribe here to get your own edition every Friday.

Enjoying Plain AI? Share it and get a free gift!

If you find this newsletter useful, why not share it with a couple of friends or colleagues who would also benefit? As a special thank you, when two people subscribe using your unique referral link, I’ll send you my "Exclusive Guide: Supercharge Your Work with NotebookLM." It’s a practical, no-nonsense guide to help you turn information overload into your secret weapon.
