Google I/O 2023: PaLM 2 Debut Shows Language Model Progress Although Toxicity, Economic and Environmental Concerns Abound

The News: Google introduced PaLM 2, the company’s next-generation language model, at Google I/O 2023. Google promotes PaLM 2 as having improved multilingual, reasoning, and coding capabilities over PaLM 1. Read the blog from Google here.

Analyst Take: Google is emphasizing new multilingual, reasoning, and coding advances aimed at making PaLM 2 more capable, faster, and more efficient than previous models. PaLM 2 played a major role in the overall Google I/O 2023 marketing push. It also comes in a range of sizes, intended to make it easier to deploy across a wide range of use cases.

PaLM 2 is more heavily trained on multilingual text than its predecessor, spanning more than 100 languages, to boost understanding, generation, and translation of nuanced text such as idioms, poems, and riddles. Plus, PaLM 2’s dataset includes scientific papers and web pages that contain mathematical expressions, intended to improve logic, common sense reasoning, and mathematics. For coding, PaLM 2 was pre-trained on a large quantity of publicly available source code datasets, so it can work in popular programming languages such as JavaScript and Python as well as generate specialized code in languages like Prolog and Fortran.

From my view, Google needed to unveil PaLM 2 enhancements to counter Microsoft-backed OpenAI’s GPT-4 language model offering as Microsoft continues to ride the sales and marketing momentum gained from its AI-powered Bing and Edge debut in February. This includes Google enhancing its Language Model for Dialogue Applications (LaMDA) so that Google Bard, which uses AI to generate more conversational, contextual, and informative web search results, can draw on information across the Internet to provide deeper, more contextual query results for users.

Specifically, Google heralded over 25 new products and features powered by PaLM 2, including expanding Bard to support new languages. Users can now use Workspace to write in Gmail and Google Docs, as well as organize Google Sheets. I find it encouraging that Sec-PaLM 2, a specialized version of PaLM 2, is trained on security use cases and can provide potentially invaluable breakthroughs in cybersecurity analysis.

Refreshingly, Google coupled the PaLM 2 launch with a research paper that revealed and underlined some of the notable limitations of the model. Even so, the paper did not specify the data sources used to train PaLM 2 beyond broad categories like web documents, mathematics, conversational data, and books.

However, Google does stress that the PaLM 2 dataset draws on a larger percentage of non-English data and a broader dataset than the PaLM 1 dataset. I anticipate that Google and Microsoft will continue to tightly guard and mask their respective data sources as competition intensifies throughout the generative AI segment as well as the general AI realm.

Moreover, when fed overtly toxic prompts, such as violent and pornographic content, PaLM 2 generated toxic responses over 30% of the time. It proved even more toxic when fed implicitly harmful prompts, with a 60% toxic response rate.

While PaLM 2 showed improvement over PaLM 1 in areas such as joke explanation, support for a wider range of language and dialect conversion, and mathematical aptitude, I find that large language models (LLMs) still need adult supervision and a good deal of augmentation before becoming consistently trusted sources of knowledge.

I also believe Google needs to directly address the economic and environmental dimensions of LLM and generative AI technology. For example, the daily cost of running ChatGPT is apparently a staggering $700K (according to The Information’s findings). While access to a fully itemized breakdown of the $700K daily bill is not readily available, it’s not difficult to deduce that a substantial portion is spent on energy due to the high-powered servers, GPUs, and massive storage capacities used in AI applications.

Key Takeaways: Google PaLM 2 Shows Language Model Training is a Long and Winding Road

Overall, I expect that Google DeepMind research, buttressed by Google’s vast computational resources, can frequently deliver new capabilities that improve the experience of using Google products. I also believe that Google PaLM 2, while showing improvements in key areas, remains very much a work in progress due to considerations such as alarmingly high toxicity rates as well as the economic and environmental implications of scaling language models and generative AI.

Google needs to rapidly address the full range of concerns to ensure sustained PaLM 2 progress. Otherwise, it risks economic and environmental considerations acting as major constraints on the longer-term ecosystem impact of PaLM 2 and language models in general.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually, and to data and other information that might have been provided for validation, and are not those of The Futurum Group as a whole.

Other insights from The Futurum Group:

Alphabet Announces Q1 FY23 Results: Search and Cloud Lift Performance as Google Preps for AI Battles

The Battle for AI Domination Continues after Latest Google Announcement

Google Invests $300mn in Artificial Intelligence Start-Up Anthropic, Taking on ChatGPT

Author Information

Ron is an experienced, customer-focused research expert and analyst, with over 20 years of experience in the digital and IT transformation markets, working with businesses to drive consistent revenue and sales growth.

Ron holds a Master of Arts in Public Policy from University of Nevada — Las Vegas and a Bachelor of Arts in political science/government from William and Mary.
