Lawsuits and Probes, How OpenAI & Microsoft Are Impacting the Trajectory of AI | The AI Moment – Episode 11

On this episode of The AI Moment, we discuss an emerging trend in generative AI: lawsuits and probes, and how OpenAI and Microsoft are impacting the trajectory of AI.

OpenAI and Microsoft seem to be at the center of discussions about AI monopolies and copyright fights. Last week the EU said they are considering opening an investigation into whether OpenAI and Microsoft’s partnership agreement is a potential monopoly. Separately, The New York Times sued the pair claiming copyright infringement when ChatGPT was trained on NYT content. I’ll discuss why I think the EU investigation will go nowhere and conversely, why the NYT suit is critical to better LLM outcomes.

Watch the video below, and be sure to subscribe to our YouTube channel, so you never miss an episode.


Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this webcast.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.


Mark Beccue: Hello, I’m Mark Beccue, Research Director of AI with The Futurum Group. Welcome to The AI Moment, our weekly podcast that explores the latest developments in enterprise AI. And we literally are in a moment, aren’t we? The pace of change and innovation in AI is unprecedented. I’ve been covering AI since 2016, and I’ve never seen anything like what we’ve experienced since ChatGPT launched in late 2022 and kickstarted the generative AI era. With our podcast, The AI Moment, we try to distill the mountain of information, separate the real from the hype, and provide you with sure-handed analysis about where the AI market will go. We’ll dive deep into the latest trends, technologies, and other things that are shaping the AI landscape. And we’ll cover everything from analyzing the latest advancements in the technology and parsing the mutating vendor landscape to things like AI regulations, ethics, risk management, and more. As a matter of fact, today we’re going to talk about something a little different in terms of lawsuits and those kinds of things.

So the “more” includes those sorts of things, and each show typically is made up of two to three different kinds of segments: anything from a guest spotlight, to key trends in generative AI, which we’re going to cover today, to one of our favorites, Adults in the Generative AI Rumpus Room, where we talk about people who are being responsible, or we simply talk about a company whose AI work we like. So those are our rolling regular segments. Like I said, today we’re going to cover an area I’m going to call Lawsuits and Probes: How OpenAI and Microsoft are Impacting the Trajectory of AI. This really keys on two different initiatives, two different things that are going on right now. One, there’s an EU probe into the partnership between Microsoft and OpenAI in terms of a monopoly or anti-competition type of issue. And the second is the lawsuit we all know about from the New York Times, which has sued OpenAI and Microsoft for using its data without permission or compensation.

So let’s talk about that. We’re going to go into the EU probe first. I want to give you where that came from and how that works. The European Commission, the executive arm of the EU, said this about a week ago, around the 9th I think: they are considering opening what they call a merger investigation into OpenAI and Microsoft. They want to understand how competitive the market is for the types of services that OpenAI is offering. Their thinking, as they put it, is that they are “looking into some of the agreements that have been concluded between large digital market players and generative AI developers and providers,” and they singled out the Microsoft-OpenAI tie-up as a particular deal they’ll be studying. Here’s a quote: “The European Commission is checking whether Microsoft’s investment in OpenAI might be reviewable under the merger regulation of the EU.”

So that’s where we stand on this. Let’s talk about it for a bit; here are my thoughts. First, I think this kind of investigation seems really premature. The first reason is that you’d have to ask: how does this partnership hurt competition? We’ve talked on the show a little bit before, and I think I mentioned that we’ve seen price reductions in LLM services. Anthropic announced one back in December, actually dropping the price for Claude 2, their ChatGPT equivalent. And why the price drop? Well, they talked about competition. The thought here is that, as we’ve discussed before, we’re now seeing more of these open source based large language models, everything from Meta’s Llama models to a lot of others; a newly launched one from Mistral AI is getting a lot of attention, and we talked about that here. Even Microsoft has launched what I think is a really fabulous new small model called Phi-2, and it’s open source. It’s only for research at this point, but it is a model out there. There are others, like MosaicML, now part of Databricks, which has one called MPT, a 7-billion-parameter model. So where I’m going with this is that this is just a premature investigation, because to me, it certainly looks like there’s competition in the marketplace for these things. So I’m going to set that aside, saying that’s number one.

The second part about this probe is that I believe there’s a good chance, and I’m not alone in this, many market watchers are thinking the same, that AI models are going to become commoditized. And what that does is point you in a different direction, which I’ll talk about for a minute. Many of the wisest folks out there have been saying for some time that it’s not the model that is going to unlock value for companies and organizations; it’s how they use it and leverage it against their own data. So let’s think about that. When you consider this mutation we’re seeing with these models, this explosion of open source, the value and the revenue that vendors are going to make out of this is really not from selling these models, but rather from helping enterprises manage and leverage their data to do something with AI. The model is not the competitive differentiator; it’s going to be how they leverage it. And there are lots of different players in this space. I’m going off track a little bit, but as for the EU being worried about competition, I don’t see it, because I think this is going to be commoditized. I also think the big opportunity in this space, for lots of different vendors to help companies, is around data management, data orchestration, data federation, whatever you want to call it. Those are companies that build data lakes and lots of other things.

So they do data management. The ones I would say make sense in this space are players like Databricks, AWS, Google, Microsoft of course, and then others like Snowflake, and even Dell and HPE, which are helping their customers manage their data and “bring the AI to the data,” as they put it. So I think that’s an important point to think about. The third part for me on this probe, and why I think it’s frankly a waste of time, is that Microsoft was an AI leader way before they got together with OpenAI, and they’re really expanding their value without OpenAI. We’ve talked about that a little bit, and I’ve written about it. A couple of things. One, they’re clearly one of the pioneers of AI. I looked this up: they published more AI research papers than any other entity between 2010 and 2023, more than anyone.

What does that tell you? Well, it means that they’re invested, right? They’ve been building these models and producing what I think are some of the most intriguing new models out there in the market, including Phi-2, which we mentioned earlier, and one called Orca 2. I think they are, and must continue, leveraging their own AI expertise. And there are other things that point in that direction. They have Azure AI Studio for AI developers, which offers a wide range of models, including several open source ones. So all of these things point, to me, to this probe being something the market doesn’t need. No one has cornered the market on AI models. It’s just way too early for this. And I’d even say, when you think about pricing, how things are used for open source, and what we’ve talked about here before, all this movement towards much smaller models, it seems like a waste of time. So that’s it on the EU probe.

Let’s talk about the New York Times for a little bit, very briefly. I was on a show yesterday; I’ve been talking about this on and off through the media for a bit. To review real quick: the New York Times has sued OpenAI and Microsoft, seeking an end to the practice of using the Times’ stories to train their models. The Times says that the two companies are threatening its livelihood by effectively stealing billions of dollars worth of work by its journalists, in some cases spitting out Times material verbatim to people who seek answers from gen AI like ChatGPT. The suit is filed in federal court. And there are a lot of questions here about fair use and whether that’s going to enter into this or not.

And here’s where I have just a couple of thoughts. One: we talked last week about digital watermarking and what media companies are doing to protect their data, and this has a lot to do with that. The first question is, how do media companies protect their content going forward? We’re sitting here at a point in time where the Times is looking backwards at OpenAI and Microsoft saying, “You used this in the past.” So there’s the question of “How do I get relief from what happened?” and “How do I stop it moving forward?” Those are two different pieces. So how do they protect it? I think it’s really important that the New York Times has a significant revenue stream from paid content. I’m not a lawyer, and I don’t fully understand fair use, but the idea is that paid content behind a paywall is more protected against being reused. I heard from a lawyer yesterday that if that content gets republished, let’s say through tweets or quotes or things like that, that’s how companies get around the copyright protection of paid content behind a paywall.

But it’s interesting to me that, like we talked about last week, there are going to be watermarking and other methods to restrict the use of data for training. I think we’re at the point where media companies are either going to get paid for allowing training or they’re going to have the ability, through technical means, to stop AI model vendors from training on their data. A big thing here, I would say, is that I believe the New York Times has an awful lot to lose. I don’t think they’re going to back down without some sort of resolution. I said this yesterday on the show: I don’t think this is going to go all the way to court. I think it’s going to be in the best interest of OpenAI and Microsoft to settle. But my point is there’s a lot for these media companies to lose, and I don’t think they’re going to back down without feeling that they’re made whole. So that gives you a couple of different ideas to think about. One would be: how important will these massive data sets be going forward? If we’re going to fight over this training, which is used on these massive models that ingest tons and tons of web-scraped data, how important will that be? And there are so many issues with those massive data sets; we’ve talked before about bias, misinformation, disinformation, and hallucination.

And the trend, like I said a little bit earlier, is towards smaller, more specialized models. There are some issues there as well, but where I’m going is that one of the things that is going to be an issue in the market is how big these models are. I think on both sides we’re going to see that maybe they’re not as big. There are certainly reasons why you want them to be more accurate, to move towards getting rid of the junky data in these sets and getting better data. So that might just mean they’re smaller. And one last thing I’ll talk about real quick is responsible use issues. I think there’s a possibility, and we talked about this a little bit last week with deepfakes, that the market might eventually sour on LLMs that don’t reveal the source of their training data. The reason is that when they make mistakes, when there’s misinformation and disinformation, and that’s traceable to source data, then you have some resolution, right? “Well, why did it say this?” Well, it got it from here. That’s the way that looks.

There’s been some movement towards that, where you tag metadata so you’re able to see what the model was trained on; that’s called traceability or accountability. Transparency in AI is a movement. This idea of LLMs being transparent about their source data also, in the end, gives data owners more leverage to say yes or no, or to sue. So really, that’s what we’ve got today, what I wanted to talk about. I appreciate it. It’s a little shorter today. Maybe you guys appreciate that. I don’t know. I think so. Thank you for joining me here on The AI Moment. Be sure to subscribe, rate, and review the podcast on your preferred platform. We’re on all of them, including YouTube. So thanks again and we’ll see you next time.

Other Insights from The Futurum Group:

Watermarking & Other Strategies for Licensing AI Training Data & Combating Malicious AI Generated Content | The AI Moment – Episode 10

2023 AI Product of the Year, AI Company of the Year | The AI Moment, Episode 9

Adults in the Generative AI Rumpus Room: The Best of 2023 | The AI Moment, Episode 8

Author Information

Mark comes to The Futurum Group from Omdia’s Artificial Intelligence practice, where his focus was on natural language and AI use cases.

Previously, Mark worked as a consultant and analyst providing custom and syndicated qualitative market analysis with an emphasis on mobile technology and identifying trends and opportunities for companies like Syniverse and ABI Research. He has been cited by international media outlets including CNBC, The Wall Street Journal, Bloomberg Businessweek, and CNET. Based in Tampa, Florida, Mark is a veteran market research analyst with 25 years of experience interpreting technology business and holds a Bachelor of Science from the University of Florida.

