
Managing the Challenges of Using the Contact Center Agent as the Human in the Loop

Assessing the Use of Contact Center Agents to Serve as Truth Arbiters for Generative AI

The News:

A lack of trust among both company executives and consumers in the output of generative AI represents a speed bump in its deployment, according to a recent Salesforce briefing to industry analysts. A key strategy Salesforce is deploying to mitigate these trust issues is ensuring that any generative AI content is reviewed by a live human before it is put in front of a customer, so that there is always a human in the loop. Combined with the use of Salesforce’s Einstein GPT Trust Layer, which grounds large language model (LLM) responses in pre-vetted company data held in Salesforce Cloud, the process is designed to make certain that generative AI content meets organizational standards for accuracy, completeness, and lack of bias before it reaches a customer.


Analyst Take:

Like many customer experience software providers, Salesforce is pushing ahead with the development of generative AI models to improve workflows and efficiencies in the contact center. However, neither organizational leaders nor consumers fully trust generative AI to return complete, accurate, and unbiased responses to prompts, so Salesforce is taking the approach of having a live human review all responses created by generative AI systems before they are presented to customers.

Grounding Models to Vetted and Constrained Data Sources

Salesforce will utilize a specific approach called grounding, which ensures that the generative AI model responding to a prompt uses only the information contained within Salesforce as the basis for its responses, limiting hallucinations and other incorrect answers. Grounding is designed to ensure that the generative AI models use only previous ticket data, customer data, approved product or service information, and company knowledge bases as sources for generating responses.
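In practice, grounding follows a familiar pattern: retrieve only pre-vetted content, then constrain the prompt to that content. The sketch below is purely illustrative and assumes a generic retrieval step with hypothetical names (VettedSource, build_grounded_prompt); it is not the Einstein GPT Trust Layer API.

```python
# Illustrative sketch of the grounding pattern: the prompt sent to the LLM is
# assembled only from pre-vetted sources (previous tickets, customer record,
# approved product data, knowledge base). Names are hypothetical, not Salesforce APIs.
from dataclasses import dataclass

@dataclass
class VettedSource:
    name: str        # e.g. "knowledge_base", "product_catalog"
    content: str     # approved text retrieved for this case

def build_grounded_prompt(customer_question: str, sources: list[VettedSource]) -> str:
    """Constrain the model to the supplied vetted sources and nothing else."""
    context = "\n\n".join(f"[{s.name}]\n{s.content}" for s in sources)
    return (
        "Answer the customer using ONLY the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Customer question: {customer_question}\n"
        "Draft reply:"
    )

# The resulting draft is never sent directly to the customer; it goes to the
# agent review step described later in this article.
```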

One example comes from Gucci, one of roughly 30 customers that have been piloting generative AI features within Salesforce. Gucci was a key pilot for Salesforce’s contact center use case, which aimed not only to use generative AI to resolve the issue at hand, but also to uncover specific upsell or cross-sell opportunities, with the goal of adding value to a customer’s shopping cart.

By grounding the generative AI model with all Gucci product information, not just knowledge articles, the contact center agent was able to offer product recommendations provided by the model while solving the case. According to Salesforce, Gucci saw a 14% increase globally in productivity from a cross-sell/upsell standpoint. Further, by grounding the model with customer data, product data, knowledge base content, and previous trouble ticket data, accuracy improved dramatically, with very few model hallucinations (Salesforce claims roughly two instances of hallucination out of thousands of generated responses).

Choosing the Right Contact Center Agents to Serve as Truth Arbiters for GenAI

The Gucci pilot was also successful because once a response is generated by the generative AI model, a contact center agent serves as the live human in the loop, ensuring that only approved content is sent to a customer. The agent receives the output from the generative AI model and can send it along as is, edit it, or ignore it and send their own response instead.
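The review step reduces to three possible agent actions on every draft. A minimal sketch of that decision flow is shown below, using hypothetical names (AgentAction, ReviewDecision, review_draft); Salesforce’s actual agent console implements this within its own UI and APIs.

```python
# Illustrative sketch (not Salesforce's implementation) of the three agent
# actions described above: send as-is, edit, or discard and write a new reply.
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class AgentAction(Enum):
    SEND_AS_IS = "send_as_is"
    EDIT = "edit"
    DISCARD = "discard"

@dataclass
class ReviewDecision:
    action: AgentAction
    final_text: str   # what the customer actually receives
    ai_draft: str     # retained for auditing and later review

def review_draft(ai_draft: str, action: AgentAction,
                 agent_text: Optional[str] = None) -> ReviewDecision:
    if action is AgentAction.SEND_AS_IS:
        return ReviewDecision(action, ai_draft, ai_draft)
    if action is AgentAction.EDIT:
        # Agent's edited text replaces the draft; fall back to the draft if none given
        return ReviewDecision(action, agent_text or ai_draft, ai_draft)
    # DISCARD: the agent's own reply replaces the draft entirely
    return ReviewDecision(action, agent_text or "", ai_draft)
```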

Salesforce says that its pilot customers are currently routing generative AI-created responses only to their more experienced agents, who have the depth of experience and knowledge to quickly judge whether a response is accurate and complete.

The question of which live human will assume responsibility for validating generative AI responses will be key to successful deployment in the contact center. While Gucci focused on using more experienced agents to run the generative AI pilot, there is still risk involved. Even experienced agents may not be completely familiar with all facets of a company’s products, services, or policies, and given the time constraints agents face when responding to customers, it may be too easy to simply accept the response from the generative AI model as valid.

This could affect the accuracy and validity of company-specific models in the future, which may eventually be used to power fully self-service experiences without a live human in the loop. If an inaccurate or incomplete response is upvoted or approved by contact center agents without review by higher-level staff, those inaccuracies may be inadvertently baked into the model via reinforcement learning.

Organizations need to be very mindful of the time, knowledge, and motivational constraints that come into play when entrusting frontline agents with data verification for generative AI content. Letting agents tune these models in real time must be accompanied by frequent log reviews to ensure that the information and input they provide to the models are accurate and complete.
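Those log reviews are only practical if every agent decision is captured in an auditable form. The snippet below is a hypothetical append-only decision log (log_review is an assumed helper, not a Salesforce feature) that would let supervisors sample drafts accepted as-is before any approval signal is fed back into model tuning.

```python
# Hypothetical decision log supporting the periodic reviews described above:
# every agent action on an AI draft is recorded so supervisors can sample and
# audit what was approved before it influences any model tuning.
import json
import time

def log_review(log_path: str, case_id: str, action: str,
               ai_draft: str, final_text: str) -> None:
    record = {
        "ts": time.time(),
        "case_id": case_id,
        "action": action,                          # "send_as_is", "edit", or "discard"
        "ai_draft": ai_draft,
        "final_text": final_text,
        "accepted_as_is": final_text == ai_draft,  # easy filter for spot checks
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Supervisors can later filter records where accepted_as_is is True and
# spot-check them before any approval signal is used to reinforce the model.
```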

Author Information

Keith Kirkpatrick is VP & Research Director, Enterprise Software & Digital Workflows for The Futurum Group. Keith has over 25 years of experience in research, marketing, and consulting-based fields.

He has authored in-depth reports and market forecast studies covering artificial intelligence, biometrics, data analytics, robotics, high performance computing, and quantum computing, with a specific focus on the use of these technologies within large enterprise organizations and SMBs. He has also established strong working relationships with the international technology vendor community and is a frequent speaker at industry conferences and events.

In his career as a financial and technology journalist, he has written for national and trade publications, including BusinessWeek, CNBC.com, Investment Dealers’ Digest, The Red Herring, The Communications of the ACM, and Mobile Computing & Communications, among others.

He is a member of the Association of Independent Information Professionals (AIIP).

Keith holds dual Bachelor of Arts degrees in Magazine Journalism and Sociology from Syracuse University.
