Managing the Challenges of Using the Contact Center Agent as the Human in the Loop

Assessing the Use of Contact Center Agents to Serve as Truth Arbiters for Generative AI

The News:

A lack of trust among both company executives and consumers in the output of generative AI represents a speed bump in its deployment, according to a recent Salesforce briefing to industry analysts. A key strategy Salesforce is deploying to mitigate these trust issues is to have a live human review any generative AI content before it is put in front of a customer, so that there is always a human in the loop. Combined with Salesforce’s Einstein GPT Trust Layer, which grounds large language model (LLM) output in pre-vetted company data held in Salesforce Cloud, the process is designed to make certain that generative AI content meets organizational standards for accuracy, completeness, and lack of bias before it reaches a customer.

Analyst Take:

Like many customer experience software providers, Salesforce is pushing ahead with the development of generative AI models to improve workflows and efficiencies in the contact center. However, neither organizational leaders nor consumers fully trust generative AI to return complete, accurate, and unbiased responses to prompts, so Salesforce is taking the approach of having a live human review all responses created by generative AI systems before they are presented to customers.

Grounding Models to Vetted and Constrained Data Sources

Salesforce will use an approach called grounding, which restricts the generative AI model responding to a prompt to the information contained within Salesforce as the basis for its responses, limiting the number of hallucinations or otherwise incorrect responses. Grounding is designed to ensure that the generative AI models use only previous ticket data, customer data, approved product or service information, and company knowledge bases as sources for generating responses.

One example comes from Gucci, which is one of about 30 customers that have been piloting generative AI features within Salesforce. Gucci was a key pilot for Salesforce’s contact center use case, which aimed not only to use generative AI to solve the single issue that came in, but also to uncover specific upsell or cross-sell opportunities, with the goal of adding value to a customer’s shopping cart.

By grounding the generative AI model with all Gucci product information, not just knowledge articles, the contact center agent was able to offer product recommendations provided by the generative AI model while solving the case. According to Salesforce, Gucci saw a 14% increase globally in productivity from a cross-sell/upsell standpoint. Further, by grounding the model with customer data, product data, data from the knowledge base, and previous trouble ticket data, accuracy improved dramatically, with very few instances of model hallucinations (Salesforce claims that the model hallucinated roughly twice across the thousands of responses that were generated).

Choosing the Right Contact Center Agents to Serve as Truth Arbiters for GenAI

The Gucci pilot was also successful because once a response had been generated by a generative AI model, a contact center agent served as the live human in the loop to ensure only approved content was sent to a customer. The agent receives the output from the generative AI model and can then send it along as is, edit it, or ignore it and send their own response.
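The three options open to the agent can be pictured as a small decision step that sits between the model and the customer. The following is only an illustrative sketch in Python, not Salesforce's implementation: the names `ReviewAction`, `ReviewDecision`, and `resolve_reply` are hypothetical and do not correspond to any Salesforce API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ReviewAction(Enum):
    """The three choices the article describes (names are hypothetical)."""
    SEND_AS_IS = "send_as_is"   # forward the model's draft unchanged
    EDIT = "edit"               # forward the agent's edited version of the draft
    IGNORE = "ignore"           # discard the draft and send the agent's own reply


@dataclass
class ReviewDecision:
    action: ReviewAction
    agent_text: Optional[str] = None  # needed for EDIT and IGNORE


def resolve_reply(draft: str, decision: ReviewDecision) -> str:
    """Return the text that is actually sent to the customer."""
    if decision.action is ReviewAction.SEND_AS_IS:
        return draft
    # For EDIT and IGNORE the customer only ever sees text the agent wrote.
    if not decision.agent_text:
        raise ValueError("EDIT and IGNORE require text written by the agent")
    return decision.agent_text
```

In this sketch nothing reaches the customer without passing through `resolve_reply`, which is the property the human-in-the-loop approach depends on: the model's draft is only ever a suggestion until an agent makes one of the three choices.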

Salesforce says that its pilot customers are currently having only their more experienced agents work with generative AI-created responses, so that the reviewers have the depth of experience and knowledge to quickly judge whether a response is accurate and complete.

The question of which live human will assume responsibility for validating generative AI responses will be the key to the technology's successful deployment in the contact center. While Gucci focused on using more experienced agents to run the generative AI pilot, there is still risk involved. Some experienced agents still may not be completely familiar with all facets of a company’s products, services, or policies, and given the time constraints agents face for responding to customers, it may be too easy to simply accept the response from the generative AI model as valid.

This could affect the accuracy and validity of company-specific models in the future, which may be used to power fully self-service models without a live human in the loop. If an inaccurate or incomplete response is upvoted or approved by human contact center agents without review by higher-level staff, these inaccuracies may be inadvertently burned into the model via reinforcement learning.

Organizations need to be very mindful of the time, knowledge, and motivational constraints that can occur when entrusting frontline agents to handle the data verification for generative AI content. Letting agents tune these models in real time must be accompanied by frequent log reviews to ensure that the information and input they are providing to the models are accurate and complete.
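One way to picture the log reviews recommended above is a simple filter that holds back suspiciously fast, unedited approvals for a second look before they are treated as trustworthy feedback. This is a minimal sketch under stated assumptions: the `ReviewLogEntry` fields, the `split_for_audit` helper, and the five-second threshold are all hypothetical and are not drawn from Salesforce or from the Gucci pilot.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ReviewLogEntry:
    """One agent decision on one generated response (hypothetical schema)."""
    agent_id: str
    accepted_unchanged: bool   # True if the draft was sent exactly as generated
    seconds_to_decide: float   # time between showing the draft and the decision


def split_for_audit(
    entries: List[ReviewLogEntry],
    min_seconds: float = 5.0,  # illustrative threshold; tune per organization
) -> Tuple[List[ReviewLogEntry], List[ReviewLogEntry]]:
    """Separate entries into (trusted, needs_audit).

    An entry needs audit when the draft was accepted unchanged faster than
    min_seconds, which may indicate the agent did not really read it.
    """
    trusted: List[ReviewLogEntry] = []
    needs_audit: List[ReviewLogEntry] = []
    for entry in entries:
        if entry.accepted_unchanged and entry.seconds_to_decide < min_seconds:
            needs_audit.append(entry)
        else:
            trusted.append(entry)
    return trusted, needs_audit
```

Decision time is only a rough proxy for attention, so a filter like this would supplement, not replace, review by higher-level staff. Its purpose is to keep rubber-stamped approvals from flowing into model tuning unchecked.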

Author Information

Keith has over 25 years of experience in research, marketing, and consulting-based fields.

He has authored in-depth reports and market forecast studies covering artificial intelligence, biometrics, data analytics, robotics, high performance computing, and quantum computing, with a specific focus on the use of these technologies within large enterprise organizations and SMBs. He has also established strong working relationships with the international technology vendor community and is a frequent speaker at industry conferences and events.

In his career as a financial and technology journalist, he has written for national and trade publications, including BusinessWeek, Investment Dealers’ Digest, The Red Herring, The Communications of the ACM, and Mobile Computing & Communications, among others.

He is a member of the Association of Independent Information Professionals (AIIP).

Keith holds dual Bachelor of Arts degrees in Magazine Journalism and Sociology from Syracuse University.

