The News: Databricks is making moves to bolster its Data Lakehouse Platform by announcing its intention to acquire Arcion, a Databricks Ventures portfolio company, for more than $100 million. Read the full press release on the Databricks website.
Databricks Acquires Arcion to Bolster AI Ambitions
Analyst Take: As the AI landscape firms up and moves to full-scale deployments, one issue is emerging: ingesting, curating, and structuring data before it is presented to a large language model (LLM) is becoming a key battleground.
In a strategic move that reflects the increasingly complex landscape of data management and AI, Databricks, a frontrunner in the Data Lakehouse Platform sphere, has announced its intention to acquire Arcion for more than $100 million. The acquisition is not merely an addition to Databricks’ portfolio but a calculated step to solve one of the most pervasive problems in enterprise data handling: ingestion.
Data Lakehouse Platforms have become the industry standard for orchestrating data and AI workflows, but their utility depends directly on the quality and volume of data they can access. Data ingestion has often been a bottleneck, plagued by complexity, fragility, and excessive costs. The current enterprise ecosystem is a labyrinth of data silos, with many companies juggling more than 10 disparate systems. According to an MIT Technology Review Insights and Databricks survey, over 80% of the largest companies are managing multiple systems, making data accessibility a formidable challenge.
Databricks points out an issue that has not been widely discussed: enterprise data is scattered and siloed, and a big chunk of it sits on-premises, not in the cloud. Most data platform vendors have said little about tying on-premises data to the data enterprises have willingly placed in the cloud. From the release: “Troves of important data sit not only in transactional databases such as Oracle, MySQL, and Postgres, but also in SaaS applications such as Salesforce, SAP, and Workday.” All of this means silos, disconnects, and a real need for normalization.
Arcion is positioned to alleviate these pain points. Specializing in change data capture (CDC) technology, it offers a scalable and highly reliable way of ingesting data from over 20 types of enterprise databases and warehouses. The acquisition will enable Databricks to natively offer a more robust, easy-to-use, and cost-effective data ingestion solution fully integrated with its own platform’s enterprise-grade security and compliance mechanisms.
Data platforms typically use connectors to access and transfer data from various sources for use in analytics and AI/machine learning (ML). Access to data is crucial for building the models powering the expanding AI industry. Arcion has more than 20 connectors for just such data, and they use CDC, a technique that transfers only the data that has changed, minimizing traffic and time.
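To make the idea of transferring only changed data concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Arcion's or Databricks' implementation: it detects changes by comparing two in-memory snapshots of a table, whereas production CDC tools commonly read a database's transaction log instead. The function names (capture_changes, apply_changes) and the sample rows are hypothetical.

```python
from typing import Any

Row = dict[str, Any]
Snapshot = dict[int, Row]  # primary key -> row contents


def capture_changes(previous: Snapshot, current: Snapshot) -> list[dict[str, Any]]:
    """Return the change events that turn `previous` into `current`.

    Simplified stand-in for CDC: real tools usually tail the transaction
    log rather than diffing full snapshots.
    """
    changes: list[dict[str, Any]] = []
    for key, row in current.items():
        if key not in previous:
            changes.append({"op": "insert", "key": key, "row": row})
        elif previous[key] != row:
            changes.append({"op": "update", "key": key, "row": row})
    for key in previous:
        if key not in current:
            changes.append({"op": "delete", "key": key, "row": None})
    return changes


def apply_changes(target: Snapshot, changes: list[dict[str, Any]]) -> None:
    """Replay change events on a downstream copy of the table."""
    for change in changes:
        if change["op"] == "delete":
            target.pop(change["key"], None)
        else:
            target[change["key"]] = dict(change["row"])


# Hypothetical source table before and after a batch of transactions.
source_before: Snapshot = {
    1: {"name": "Ada", "plan": "free"},
    2: {"name": "Lin", "plan": "pro"},
    3: {"name": "Sam", "plan": "free"},
}
source_after: Snapshot = {
    1: {"name": "Ada", "plan": "pro"},   # updated
    2: {"name": "Lin", "plan": "pro"},   # unchanged, so never transferred
    4: {"name": "Kai", "plan": "free"},  # inserted; row 3 was deleted
}

# The downstream copy starts in sync with the old source state.
target: Snapshot = {key: dict(row) for key, row in source_before.items()}

changes = capture_changes(source_before, source_after)
apply_changes(target, changes)
```

Only the changed rows travel downstream, while the unchanged row is never sent. That is the property the article credits with minimizing traffic and time.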
This move by Databricks is emblematic of a larger trend in the industry: the consolidation of data and AI capabilities under unified platforms. Ali Ghodsi, cofounder and CEO at Databricks, encapsulated the strategic import of this acquisition by stating that it will allow instantaneous data availability for improved decision-making. Gary Hagmueller, CEO of Arcion, echoed the sentiment, emphasizing that the real-time, large-scale CDC data pipeline technology will extend Databricks’ extract, transform, and load (ETL) capabilities. While we expect Databricks to be bullish, this acquisition aligns with the trends we are seeing emerge in the industry and have observed as focus areas from other vendors we are speaking to.
Looking Ahead
Databricks is leaning into its vision of a unified data platform that eliminates the pain points of disparate data systems. First investing in Arcion and now acquiring it is a savvy move that strengthens that strategic vision. It remains to be seen whether on-premises data goes unused because of data normalization and federation problems or because enterprises are reluctant to expose highly valuable, strategic on-premises data to AI systems. That said, by making data ingestion simpler and more effective, Databricks and Arcion are jointly laying the groundwork for accelerated data analytics and AI applications, a critical differentiator in today’s fast-paced digital economy.
The big takeaway is that, with the acquisition of Arcion, Databricks continues to build on its successes by adding more technologies to access and transform data for its Data Lakehouse Platform. Completeness and simplicity will drive the selection of data platforms, and Databricks continues to advance on both fronts.
Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.
Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.
Author Information
Randy draws from over 35 years of experience in helping storage companies design and develop products. As a partner at Evaluator Group and now The Futurum Group, he spends much of his time advising IT end-user clients on architectures and acquisitions.
Previously, Randy was Vice President of Storage and Planning at Sun Microsystems. He also developed disk and tape systems for mainframe attachment at IBM, StorageTek, and two startup companies. Randy also designed disk systems at Fujitsu and Tandem Computers.
Prior to joining The Futurum Group, Randy served as the CTO for ProStor, where he brought products to market addressing long-term archiving for the Information Technology, Healthcare, and Media/Entertainment markets.
He has also written numerous industry articles and papers as an educator and presenter, and he is the author of two books: Planning a Storage Strategy and Information Archiving – Economics and Compliance. The latter is the first book of its kind to explore information archiving in depth. Randy regularly teaches classes on Information Management technologies in the U.S. and Europe.
Regarded as a luminary at the intersection of technology and business transformation, Steven Dickens is the Vice President and Practice Leader for Hybrid Cloud, Infrastructure, and Operations at The Futurum Group. With a distinguished track record as a Forbes contributor and a ranking among the Top 10 Analysts by ARInsights, Steven's unique vantage point enables him to chart the nexus between emergent technologies and disruptive innovation, offering unparalleled insights for global enterprises.
Steven's expertise spans a broad spectrum of technologies that drive modern enterprises. Notable among these are open source, hybrid cloud, mission-critical infrastructure, cryptocurrencies, blockchain, and FinTech innovation. His work is foundational in aligning the strategic imperatives of C-suite executives with the practical needs of end users and technology practitioners, serving as a catalyst for optimizing the return on technology investments.
Over the years, Steven has been an integral part of industry behemoths including Broadcom, Hewlett Packard Enterprise (HPE), and IBM. His exceptional ability to pioneer multi-hundred-million-dollar products and to lead global sales teams with revenues in the same echelon has consistently demonstrated his capability for high-impact leadership.
Steven serves as a thought leader in various technology consortiums. He was a founding board member and former Chairperson of the Open Mainframe Project, under the aegis of the Linux Foundation. His role as a Board Advisor continues to shape the advocacy for open source implementations of mainframe technologies.
Mark comes to The Futurum Group from Omdia’s Artificial Intelligence practice, where his focus was on natural language and AI use cases.
Previously, Mark worked as a consultant and analyst providing custom and syndicated qualitative market analysis with an emphasis on mobile technology and identifying trends and opportunities for companies like Syniverse and ABI Research. He has been cited by international media outlets including CNBC, The Wall Street Journal, Bloomberg Businessweek, and CNET. Based in Tampa, Florida, Mark is a veteran market research analyst with 25 years of experience interpreting technology business and holds a Bachelor of Science from the University of Florida.