Cloudera Apache Iceberg

The Six Five team discusses Cloudera Apache Iceberg.


Disclaimer: The Six Five Webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors and we do not ask that you treat us as such.


Patrick Moorhead: So Daniel, you wrote a nice article, or one of your analysts did, Ron did, on exactly what’s going on here. So let me first talk about what Iceberg is. So Iceberg is open source. Basically it’s an open table structure that essentially allows you, whether you’re using Spark, Trino, Flink, Presto, Hive, or Impala, to work on these same tables. So think of this as an industry standard table methodology that you can plug all those different tools on top of.

While there have been bigger announcements this year by Cloudera, I like this one because it’s pure to their strategy, which is to take open source technologies, I would say, put enterprise grade quality and stability behind them, and bring them to the biggest enterprises with the most data out there. So it’s true to the strategy and this is all about data, which is what Cloudera is all about. And whether you want to manage that data end to end on prem, or in the public cloud, or even manage data that comes through a SaaS application, Cloudera’s doing a good job pulling it all together. Kind of a one stop shop for data management.

Daniel Newman: Yeah, I think that’s exactly right, Pat. I mean, this is the hot new era of Apache Iceberg, and we’re seeing it talked about quite a bit if you’re in the data space. And Cloudera’s got this ecosystem and openness approach that it’s focused on. And right now, as competition comes at Cloudera and all the legacy and traditional big data warehouse and data lake players, it’s important for Cloudera to continue to innovate, Pat. And to innovate and deliver both openness and interoperability, and I think that’s what they’re doing. I mean, they’re working across the Apache portfolio, and they’re focused on, like I said, openness and the evolving requirements that their customers are seeing. The integrations with the data warehouses are significant. They’re working with Oracle, they’re working with IBM and Netezza, and with Teradata, and they basically are functional in the multi-tenant environment that most companies are running in.

So the Cloudera challenge is that, after it went private, it’s been a little quieter, a bit out of the center of the news. But again, going back, Pat, to when they made the decision to go private, I believe it was fundamentally decided so the company could reorganize, recalibrate, work on innovation, work at a pace that fit the company’s long term strategy. And I think that they’re doing that. Is that going to happen overnight? No. I mean, it’s going to take some time. It’s going to take some work. But I do think that what CDP is building is going to basically get them over a hump if they continue to push forward with that hybrid mentality, and with that more SaaS-based approach that they’re trying. They’re trying to make Cloudera more digestible.

They’ve already won that top of the market data world. But what they’re trying to do is say, hey, how do we compete with the hyperscale cloud data offerings? How do we compete with some of the born-on-cloud data warehouse and data lake solutions? And I think that’s what Cloudera is doing. Again, we aren’t going to have as much evidence as we used to have because we’re not going to get the same reporting metrics as we once did, but I like what they’re doing. I like that they’re aligning with the most important technologies for data tech, data warehouses, data lakes, and I like that they’re open and integrate with all the big data warehouses on a global basis. So good move forward. A lot to watch here. Again, we’re going to have to read between the lines with Cloudera going forward, but progress seems to be moving in the right direction. And we’ll keep talking about that here, Pat, as that becomes a bit more evident with, hopefully, the customer wins and other data that you get when you don’t get the earnings data.

Patrick Moorhead: Yeah. And by the way, one thing I mistakenly glossed over was that this is the first open data lakehouse that’s available, if you define an open data lakehouse as having an open table set. Firsts are important in the industry, if nothing else to reinforce your leadership. So Cloudera Apache Iceberg goes GA.

Author Information

Daniel is the CEO of The Futurum Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.

From the leading edge of AI to global technology policy, Daniel makes the connections between business, people and tech that are required for companies to benefit most from their technology investments. Daniel is a top 5 globally ranked industry analyst and his ideas are regularly cited or shared in television appearances by CNBC, Bloomberg, Wall Street Journal and hundreds of other sites around the world.

A 7x best-selling author, including his most recent book, “Human/Machine,” Daniel is also a Forbes and MarketWatch (Dow Jones) contributor.

An MBA and former graduate adjunct faculty member, Daniel is an Austin, Texas transplant after 40 years in Chicago. His speaking takes him around the world each year as he shares his vision of the role technology will play in our future.

