We used to say that technology was moving quickly. Now, it’s racing with a mind of its own.
The truth is, over the past few years, we’ve been talking about the same types of technology at the enterprise level fairly consistently: automation, data, cloud, virtual reality, AI. Most companies have been dabbling in some, if not all, of these larger tech “buckets” for a while now—some more than others. Yes, there’s been a lot of advancement. But for the greater part of a decade, we’ve been talking about the exact same things. Then suddenly, with the introduction of generative AI, all of that changed.
The AI Wave Is Here: Are You Ready?
Last year, at Thanksgiving dinner, no one was even talking about generative AI. Now, just nine months later, it’s almost the only thing we as a society want to talk about. You can make art. You can write letters. You can take the bar exam. Kids are using it. Businesses are using it. In fact, the 2023 Cohesity Data Security & Management survey found that among the 3,409 IT and security leader respondents, automated AI/ML capability was the leading method of identifying sensitive information – such as personally identifiable information (PII) – within their data landscapes, with 44% leveraging AI and ML to manage sensitive data.
Now, generative AI is being layered over almost all of the bigger tech buckets I just mentioned above—the cloud, analytics, and of course data. It isn’t a standalone thing. It’s a super-force steroid that can be added to almost everything using the power of data and large language models (LLMs). When it comes to generative AI, the cat is out of the bag, so to speak. And it has been let out before any single one of us has had a chance to determine how generative AI stands to impact our corporate security, our ability to govern our sensitive data, and the protection of our trade secrets.
Yes, AI has caught on like wildfire. But we as a society don’t have a fire department to deal with it yet. Recently, I had the opportunity to participate in an episode of Spotlight on Security with Cohesity. I was able to speak with Dimitri Sirota, CEO and co-founder of BigID, a leader in data security, about the AI wave we are currently experiencing and the risks that come with it.
Generative AI: Risks and Rewards
The question we need to at least pause and think about as we jump into the AI tide is this: what are the risks and rewards of using AI? No, it’s not a novel question. But it’s one that few of us have taken the time to truly ponder, let alone build a safe and productive AI foundation upon. Dimitri put it this way: data represents value, but it also represents risk. And right now, we need to consider not just the risk to society—plenty of folks are already discussing that. We need to discuss the risk to our own security from sharing our data so liberally.
The way generative AI and large language models work is that you give them tons of data—all the data you possibly can. The more data the LLMs have, the better the generative AI is at making sense of the world around it. That’s why, within an enterprise, for instance, it’s very tempting to give these LLMs all the data we own. The more we feed them, the smarter they will be, and the smarter the output they will provide.
But how do we know that the information we’re feeding into our LLMs—especially those things like trade secrets and other sensitive information—will be safe? How do we know what’s valuable enough to share, and what’s possibly too valuable to risk sharing?
Dangers of LLM Democratization: The Data Swamp Rides Again
Many of us think we know what’s in our data, but we really don’t. That’s especially dangerous as access to generative AI becomes more and more democratized: if we don’t really know our data—inside out, front and back—we don’t know what we’re creating with it.
For instance, one of the best things about LLMs is that you can feed them unstructured data—things like meeting notes, transaction data, and security data. But oftentimes, we as business leaders don’t truly realize how much confidential information lives in those unstructured data pools. Because it’s unstructured, it’s difficult to search, secure, or redact. In turn, it’s not just possible but likely that many businesses will share sensitive information with their LLMs without realizing it.
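To make the redaction problem concrete, here is a minimal, hypothetical sketch in Python of a pre-filtering pass that scans unstructured text for obvious PII patterns before it is ever handed to an LLM. The patterns, function name, and sample note are illustrative assumptions; real classification tools, including the AI/ML-driven discovery cited in the survey above, go far beyond a handful of regexes.

```python
import re

# Hypothetical, minimal patterns for illustration only -- a real PII
# discovery tool would use trained classifiers, not just regexes.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> tuple[str, list[str]]:
    """Replace anything that looks like PII with a placeholder and report what was found."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            findings.append(label)
            text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text, findings

# Example: a meeting note that might otherwise flow straight into an LLM prompt.
note = "Follow up with Jane Doe (jane.doe@example.com, 555-867-5309) re: contract renewal."
clean_note, found = redact(note)
print(found)       # ['email', 'phone']
print(clean_note)  # PII replaced before the text is shared with a model
```

The point of the sketch is simply that unstructured text needs an inspection step of some kind before it reaches a model; what that step looks like in practice will depend on the tooling and the data involved.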
Another thing: strong LLMs require sparkling clean data. Any incorrect data will throw off the learning the AI is trying to do. This can cause large problems, especially with unstructured data, which is incredibly difficult to scrub. Suddenly, you’re relying on a tool to create insights, but those insights are built on data that isn’t as accurate as you think it is.
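As a small illustration of what even a first-pass hygiene step might look like, the sketch below drops empty, trivially short, and duplicate records before they reach a training or retrieval corpus. The names and thresholds are assumptions, not a prescribed pipeline; real data-quality work goes much further, validating facts, reconciling conflicting versions, and tracking provenance.

```python
import hashlib

def basic_hygiene(records: list[str]) -> list[str]:
    """Drop obviously bad records before they pollute a training or retrieval corpus.

    This is only a sketch of the idea, not a complete cleaning pipeline.
    """
    seen: set[str] = set()
    cleaned = []
    for text in records:
        text = text.strip()
        if not text:                 # empty record
            continue
        if len(text) < 20:           # too short to carry real meaning (arbitrary threshold)
            continue
        digest = hashlib.sha256(text.lower().encode()).hexdigest()
        if digest in seen:           # exact duplicate, possibly a stale copy
            continue
        seen.add(digest)
        cleaned.append(text)
    return cleaned

docs = [
    "Q3 revenue review notes: pipeline grew 12% quarter over quarter.",
    "Q3 revenue review notes: pipeline grew 12% quarter over quarter.",  # duplicate
    "   ",                                                               # empty
    "tbd",                                                               # too short
]
print(len(basic_hygiene(docs)))  # 1
```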
Generative AI: Slowing the Risk, Enhancing the Reward
Clearly, generative AI isn’t going anywhere. The question now is not how to stop it, but how to slow the risk. First, we need to know our data better. We need to create a more controlled data environment. We need to fully audit the information we have—index it, look at it from all directions, and find out where the holes—and the value—are. And we need to make sure the data we make available to LLMs is only the data we want them trained on and shown to our audience, whoever that may be. Again, having a clean data pool isn’t a new idea. But having a clean data pool that is secured and accessible only to those who need it is more imperative now than ever.
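One way to picture that "only the data we want" gate is a sensitivity check between an audited data catalog and whatever corpus feeds the LLM. The sketch below assumes hypothetical sensitivity tiers and document fields produced by an earlier audit and indexing step; the labels and policy are illustrative, not an industry standard.

```python
from dataclasses import dataclass

# Hypothetical policy: only these tiers may be exposed to an LLM.
ALLOWED_FOR_LLM = {"public", "internal"}   # "confidential" and "restricted" stay out

@dataclass
class Document:
    doc_id: str
    sensitivity: str   # assigned during the audit/indexing step described above
    text: str

def llm_corpus(docs: list[Document]) -> list[Document]:
    """Return only the documents cleared for LLM training or retrieval."""
    return [d for d in docs if d.sensitivity in ALLOWED_FOR_LLM]

catalog = [
    Document("kb-001", "public", "Product FAQ"),
    Document("fin-042", "confidential", "Unreleased quarterly results"),
    Document("hr-107", "restricted", "Employee compensation data"),
    Document("eng-310", "internal", "Architecture overview"),
]
print([d.doc_id for d in llm_corpus(catalog)])  # ['kb-001', 'eng-310']
```

The filter itself is trivial; the hard, and still largely unsolved, work is the auditing and classification that produces trustworthy sensitivity labels in the first place.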
We have no industry-standard process yet for slowing the risks associated with generative AI. We’re in the midst of asking the question, but not yet perfecting the answer. For anyone experimenting or introducing LLMs to their enterprise, that means there is a tremendous need to approach implementation with eyes wide open—to ask the tough questions, to create safe data parameters, and to understand the complete value/risk relationship associated with any data you may be dumping into your LLM.
AI is growing fast. ChatGPT reached 100 million monthly active users in January—just two months after its launch. Consumers and employees are hungry for the kind of support generative AI provides. With that kind of demand, AI is not going to slow itself down. That’s something only we as humans are capable of doing.
Watch the recent episode of Spotlight on Security with Cohesity below:
Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.
Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.
Other insights from The Futurum Group:
Cohesity Introduces Turing, Highlighting Data Protection’s Role in AI
IBM Augments its Cyber Resiliency Stack with IBM Storage Defender, in Collaboration with Cohesity
Author Information
Daniel is the CEO of The Futurum Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.
From the leading edge of AI to global technology policy, Daniel makes the connections between business, people, and tech that are required for companies to benefit most from their technology investments. Daniel is a top-five globally ranked industry analyst, and his ideas are regularly cited or shared in television appearances and coverage by CNBC, Bloomberg, the Wall Street Journal, and hundreds of other outlets around the world.
A seven-time best-selling author, Daniel’s most recent book is “Human/Machine.” He is also a Forbes and MarketWatch (Dow Jones) contributor.
An MBA and former graduate adjunct faculty member, Daniel is an Austin, Texas, transplant after 40 years in Chicago. His speaking takes him around the world each year as he shares his vision of the role technology will play in our future.