The Top Data Trends for 2026
Industry experts share their predictions for the year ahead
The AI Correction: From Hype to Reality
AI is overshadowing everything right now, and agentic models and their impact are dominating the conversation. There’s definitely a bubble there, and while it’s hard to say if it’s going to pop in 2026, there’s sure to be a correction. People are starting to realize that expectations were set too high. Think about Gartner’s Hype Cycle for Artificial Intelligence report—this will be the “fall from hype.”
What I’m really seeing is the rapid commoditization of AI. There are all these AI companies growing rapidly in the application layer. As soon as one of them catches fire, it has twelve new competitors within months, all doing the exact same thing. It’s basically a market on hyperspeed: one player emerges, quickly becomes the number one player, and then tons of other players emerge based on VC funding. Now they’re all running in the same direction at the same time, which sounds a lot like what we first saw with the Modern Data Stack. I think there’s a parallel there with how the AI market will unfold.
The question is whether there’s a new stack emerging for AI. Everybody wants to believe that, but I think it will be made up of existing companies rather than new ones. I don’t see any new data companies emerging that are leveraging AI in unique ways that the incumbents currently don’t.
I would not be surprised if, given how ‘hot and frothy’ the AI data space has become, there’s a bit of a shakeout in AI company valuations in 2026. While bubbles can be fantastic and exciting, a shakeout isn’t necessarily a bad sign. Obviously, if you’re concentrated in that space, it will hurt for a little bit. But it’s the natural cycle of the market: things get hyper-inflated, then they ease up a bit. That means we’re moving from one phase to the next.
Even though we’re seeing our clients truly gain efficiency and productivity thanks to AI, we may still go through that public market cycle with AI software and foundational model companies—because at some point, the tea kettle has to blow off some steam.
As for the actual implementation of AI on the client side, we’re moving into a new phase I think is really exciting: budgets and spending will increase, our clients will have better earnings calls, and their stock prices will go up. It feels like 2026 could be the year when we start to see this play out, and we shouldn’t be afraid to ride through that. If you’re willing to be part of the “bubble,” you also have to be ready for the “post-bubble.”
New AI projects in 2026 will require clear business cases and ROI before moving forward. Proof-of-concept failures will lead to skepticism, so it will be critical to show value at a low cost before committing to larger initiatives. Quality data will also move to the forefront of successful AI projects, a direct response to this year’s AI failures and hype. People are starting to see the lack of results despite all the money being thrown at AI projects.
I think this will change in 2026 as companies tighten their budgets. Personally, I’d like to see less AI hype in the year ahead. Yes, AI is the future, but it isn’t a solution for all business problems.
I hope 2026 will be the year of effective, intentional AI. I’m seeing a dangerous trend where users abdicate their personal responsibility to think critically in favor of the convenience of a prompt. We’ve become incredibly eager to delegate deep thinking to large language models, and it’s starting to show. People can tell when a generative AI chatbot’s output has been shared without any real effort to contribute to it. As the hype fades and we settle into our new normal, I hope we defer less critical thinking to AI and start leveraging it more as a tool to enhance the creativity and brilliance that we have in abundance in our industry.
AI-Ready Data: Quality and Governance Become Critical
The most significant data trend of 2026 will be the shift to AI readiness as a core architectural mandate. This isn’t just about launching a few AI pilots; it’s the realization that data programs must pivot entirely to support the executive focus on AI. We’ll see organizations aggressively re-architecting to ensure data is immediately governed, standardized, and available in real time—because, simply put, AI models fail without it. The relevance of every data initiative hinges on building and demonstrating AI readiness, which has become the ultimate measure of data quality and speed.
In 2026, we expect to see an acceleration in AI-ready data and platforms. Throughout 2025, many AI projects were hindered by the lack of available, usable data and unready platforms. Organizations have realized that AI’s ROI has been limited by the need to ensure their data is both available and well organized. To fully maximize AI’s impact within their operations, companies must have data curated to a usable state, along with platforms that can operate at enterprise scale.
The message is clear: the foundation of successful AI is having quality data and a strong enterprise platform. If you’re not prioritizing those right now, you’re at risk of falling significantly behind and being disrupted—but by focusing on these foundational pieces, you can position yourself for success and safeguard against disruption.
One big data trend we’ll see in 2026 is organizations finally starting to realize that getting data “AI-ready” is much harder than they imagined. Folks think that AI and agentic AI are going to automatically clean up their data and immediately transform it into useful information and insights. But as organizations get more serious about rolling out AI-assisted programs and processes into production, they are finding out just how bad the data they want to use really is.
Back in the Hadoop era, many believed data modeling was no longer needed (e.g., schema-on-read) and data warehousing was dead. Now in 2026, organizations need to wake up to the reality that “garbage in, garbage out” is still true, that much of their data is not ready for use, and that it presents a major risk to the success of their AI initiatives. That means organizations will have to invest heavily in revamping their data ecosystem to include better data modeling, QA, and governance, and follow the basic data management practices that have been around for decades.
While some started to realize this in 2025, it will accelerate in 2026—there is no free lunch.
The Rise of Agentic AI and Autonomous Systems
Manual data migrations and clunky ETL pipelines are the stuff of nightmares, slowing down modernization and racking up costs. Agentic AI is here to save the day, automating everything from schema mapping to pipeline orchestration. By 2026, over 60% of enterprises will use these smart, collaborative AI agents to modernize legacy systems, slashing errors by 70% and halving migration timelines. With data sources growing more complex, this trend is critical for achieving zero-downtime transitions to cloud-native lakehouses. Ignore it, and you’re stuck in the slow lane.
The winning approach is to deploy AI-orchestrated pipelines using AI-powered DataOps to automate migrations with real-time monitoring.
AI agents are getting better; they can now perform autonomously for up to an hour without human intervention. If you give one a task like “go debug my program,” it will spend an hour figuring things out. That’s a drastic improvement from where we started, but the accuracy is still only 50%. Is that something a business can live with? I don’t think so. You can’t spend an hour on an important task when the answer is only 50% likely to be correct.
In 2026, we’ll start to see success with simple agentic AI use cases where the stakes are low, such as telling it, “Go set up an appointment with my barber.” If it picks the wrong time, I can easily reschedule. These are the types of use cases where you don’t lose anything. But most enterprise decisions require a high level of accuracy, and we’re not there yet.
Technology adoption always follows this pattern: it happens in the application space first with simple, narrow tasks where the context is small enough to control and test. The more complex analytics use cases will come later, once AIs can handle the massive amount of context those require.
I foresee developments in the next phase of “Automation of Automation” in 2026, where data sets will become increasingly self-discoverable and self-analyzing. That said, I think 2026 will mark an even greater data divide for many companies. While some companies are achieving amazing results with their AI initiatives, others are still not seeing the results they expected. This is often related to data preparation, data quality, and overall data strategy. But let’s not forget that, even in the era of generative AI, true change takes time—it requires companies and individuals to think differently and not follow the standard practices that people have been trained to follow for years.
To succeed in the future, people will need to learn how to think agentically, a huge mindset shift that hasn’t been addressed enough, especially when it comes to data preparation, cataloging, and near real-time quality assessment.
That near real-time quality assessment is also fundamental to building an AI-ready data foundation. Using near real-time data for AI analysis means finding any major quality issues immediately, before the data is fed into AI agents and operations. “Good enough” data is no longer acceptable as input to AI automations.
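To make that concrete, here is a minimal sketch of what such a pre-agent quality gate might look like, assuming incoming batches arrive as a pandas DataFrame with an event_time column. The function name, thresholds, and checks are illustrative, not any particular vendor’s implementation.

```python
import pandas as pd

# Illustrative thresholds; in practice these would come from your data contracts.
MAX_NULL_RATE = 0.02
MAX_STALENESS_MINUTES = 15

def quality_gate(df: pd.DataFrame, required_cols: list[str]) -> list[str]:
    """Return a list of issues; an empty list means the batch may flow on to AI agents."""
    issues: list[str] = []

    # Structural check: the columns downstream agents expect must actually exist.
    missing = [c for c in required_cols if c not in df.columns]
    if missing:
        issues.append(f"missing columns: {missing}")
        return issues

    # Completeness check: too many nulls in a required column blocks the batch.
    for col in required_cols:
        null_rate = df[col].isna().mean()
        if null_rate > MAX_NULL_RATE:
            issues.append(f"{col}: null rate {null_rate:.1%} exceeds {MAX_NULL_RATE:.0%}")

    # Freshness check: "near real time" means the newest record must be recent.
    if "event_time" in df.columns:
        staleness = pd.Timestamp.now(tz="UTC") - pd.to_datetime(df["event_time"], utc=True).max()
        if staleness > pd.Timedelta(minutes=MAX_STALENESS_MINUTES):
            issues.append(f"newest record is {staleness} old, past the freshness window")

    # Duplicate check: "good enough" data often hides double-counted records.
    dupes = int(df.duplicated(subset=required_cols).sum())
    if dupes:
        issues.append(f"{dupes} duplicate rows on {required_cols}")

    return issues
```

A pipeline would call something like this on every micro-batch and route anything with a non-empty issue list to quarantine instead of handing it to an agent.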
As agentic AI becomes more common, personal agents won’t just be for actors or athletes anymore—everyone will use them. We’re already comfortable with AI assistants, and over time many of these assistants will become more agentic, able to function autonomously and work collaboratively on complex tasks that we assign to them.
A recent Capgemini study on harnessing the value of gen AI showed a steady planned increase in the use of agents by large enterprises: 10% currently use them in some form, while more than 50% plan to use them in the next year, jumping to 82% in the next three years. There are high expectations for this emerging technology.
The Semantic Layer as Essential AI Infrastructure
The AI trend that’s going to have the biggest impact on business isn’t the next big model—it’s the rise of the semantic layer as the new AI layer.
For years, the semantic layer has been treated like a reporting convenience, something to keep KPIs consistent across dashboards. But now it’s becoming the thing that gives data real meaning. As companies move deeper into AI, that shared business language is what lets analytics tools, search interfaces, and AI agents all “speak the same language.” That’s how you get answers that aren’t just accurate, but explainable.
We’re already seeing it happen. SAP’s Business Data Cloud can share business context directly into Snowflake, and platforms like Coalesce can refine and extend that meaning before it’s used by BI tools or AI systems. Suddenly, the same definition of “revenue” or “customer” carries through every layer—from ERP to the semantic layer to the AI agent that answers your question.
That’s when the medallion framework stops being about data movement and starts being about understanding. The hybrid architecture evolves into a hierarchy of understanding.
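As a rough illustration of that idea (not SAP’s or Coalesce’s actual interfaces), the sketch below shows how a semantic layer might encode a single governed definition of “revenue” once and compile it into SQL for any consumer, whether that’s a BI dashboard or an AI agent. The metric fields, table, and column names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Metric:
    """One governed business definition, written once and reused by every consumer."""
    name: str
    description: str
    sql_expression: str                       # how the metric is computed
    dimensions: list[str] = field(default_factory=list)

# The single definition of "revenue" that dashboards, analysts, and AI agents all resolve to.
REVENUE = Metric(
    name="revenue",
    description="Net invoiced amount, excluding tax, refunds, and intercompany transfers.",
    sql_expression="SUM(invoice_amount - refund_amount)",
    dimensions=["order_date", "customer_id", "region"],
)

def compile_query(metric: Metric, group_by: str, table: str = "fct_orders") -> str:
    """Expand the governed definition into SQL so no consumer re-invents the formula."""
    if group_by not in metric.dimensions:
        raise ValueError(f"{group_by!r} is not a governed dimension of {metric.name}")
    return (
        f"SELECT {group_by}, {metric.sql_expression} AS {metric.name} "
        f"FROM {table} GROUP BY {group_by}"
    )

print(compile_query(REVENUE, group_by="region"))
```

The point is less the code than the contract: every layer asks the same object what “revenue” means instead of re-deriving it.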
It’s already begun, but over the next year, AI will push the semantic layer into the mainstream. For years, the data community debated whether defining consistent metrics and structure was worth the effort. Now, AI has made the answer obvious. You can’t get reliable results by just pointing an LLM at your data. AI needs to understand your business context for it to be useful. That means your data needs to reflect how your organization actually talks and operates.
With traditional BI, messy or inconsistent models caused confusion, but there were buffers. Analysts could intervene. Dashboards limited scope. A bad result might be frustrating, but rarely damaging. In an AI workflow, you need to find new ways to build in guardrails. It’s not just about building dashboards anymore; AI is querying your data in natural language and its results impact decisions. That only works when it’s grounded in a consistent, shared understanding of your business.
The semantic layer is no longer a nice-to-have. It’s the foundation that makes AI work.
AI Will Elevate, Not Replace Data Professionals
I don’t think AI is replacing data engineers—I think the job is changing in the same way it changed when we moved from writing MapReduce jobs to using SQL, or from managing on-prem servers to using cloud warehouses. Data engineers will move to higher levels of abstraction. The tedious parts get automated, which frees you up to focus on the parts that actually require human judgment and collaboration.
Automation of automation doesn’t reduce the need for data engineers—it actually increases the need for their expertise. When an agent can handle the mechanical work of a migration in weeks instead of months, you need your engineers making higher-level decisions more frequently: reviewing the agent’s approach, validating business logic, setting quality standards, and working with stakeholders to understand what success actually looks like.
For many, the immediate skill shift is learning to work effectively with AI—and frankly, most still need to get better at it. But once you master that, the shift is getting comfortable spending time with the people who use your data. When you’re not buried in tactical work, you can actually sit with the product team and understand what they’re trying to measure. You can talk to executives about what questions they’re asking of their dashboards and why the current data model makes those questions hard to answer. You can partner with analysts to understand which data quality issues actually impact decisions versus which ones are just noise.
Honestly, the data engineers who are most excited about what we’re building at Datafold are the ones who never wanted to spend six months manually rewriting stored procedures anyway. They want to solve interesting problems and work on things that move the business forward. Automation just gets the grunt work out of the way faster.
I think 2026 will be the year when we finally see widespread implementation of AI in the workplace. Both 2023 and 2024 were hype years focused on AI’s potential, but few organizations had actually adopted it. By 2025, most mainstream SaaS platforms, including Coalesce, started making AI “copilots” available. Nowadays, if there’s not an AI feature built into your product, it immediately feels old and outdated. But in the coming year, I believe the next iteration of AI will arrive in the workplace. Organizations will be required to adopt AI governance policies as more employees turn to AI for everyday tasks, ranging from coding and process automation to data analysis and other mundane work. That famous line from Star Trek now applies to AI: “We are the Borg. You will be assimilated. Resistance is futile!”
For companies striving to build an AI-ready data foundation, the first step is acceptance of AI. If organizations continue to resist AI due to fear, they will fall behind their peers. I’m not saying organizations should blindly allow AI everywhere, but they need to accept that their employees are eager to use AI to become more productive, and pivot to a strategy that’s open to it. Collaborate with data and AI leaders and security experts to develop a comprehensive AI governance and implementation plan. Establish clear guidelines for permissible use of AI across the organization.
The biggest misconception I want to see corrected in 2026: that AI will replace data engineers. Everyone’s talking about it, but I think the opposite is true—AI is going to make data engineering more essential, not less.
Here’s why: AI will absolutely automate tasks. But automation doesn’t eliminate the need for humans. It elevates what humans do. When our Troubleshooting Agent explores thousands of hypotheses to identify why an issue occurred, you’d think that would remove the need for humans in the loop. On the contrary, data + AI folks are still crucial to actually investigate, triage, and execute on what the agent presents.
As AI takes over the tedious, time-intensive workflows, humans will spend more time on higher-order problems: reliability, governance, strategy, and designing systems that both humans and AI can trust. The skills that matter (abstract thinking, business understanding, contextual creation) become more valuable, not less. My advice to anyone in the field: learn how to work with AI, not compete with it. Understand how to validate outputs and design trustworthy systems. That’s the future.
Natural Language Will Become the New Data Language
Imagine a world where anyone in your company can ask data questions and get instant, spot-on answers—no data scientist required. That’s where conversational AI is taking us. By 2026, 70% of businesses will use NLP-driven platforms to let employees query lakehouses and vector databases with plain English, cutting analytics time by up to 60%. This shift is driven by the hunger for faster decisions in cutthroat markets—every second counts, and conversational data delivers. It’s not just cool tech; it’s a revolution in how enterprises unlock insights.
Success here requires rolling out semantic layers with AI models that turn natural language into real-time, context-rich insights, bridging SaaS and on-prem data seamlessly.
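One way to picture that bridge, purely as a sketch, is to inject the semantic layer’s governed definitions into the prompt before a model drafts any SQL. The call_llm function below is a placeholder for whichever model API a platform actually uses, and the definitions are invented for illustration.

```python
# Hypothetical governed definitions pulled from a semantic layer.
SEMANTIC_LAYER = {
    "revenue": "SUM(invoice_amount - refund_amount) computed over fct_orders",
    "active_customer": "a customer with at least one order in the trailing 90 days",
}

def call_llm(prompt: str) -> str:
    """Placeholder: wire this to whichever model provider your platform uses."""
    raise NotImplementedError

def answer_question(question: str) -> str:
    """Draft SQL for a plain-English question, grounded in governed definitions."""
    definitions = "\n".join(f"- {name}: {meaning}" for name, meaning in SEMANTIC_LAYER.items())
    prompt = (
        "You translate business questions into SQL.\n"
        "Use only these governed definitions; do not invent columns or formulas:\n"
        f"{definitions}\n\n"
        f"Question: {question}\n"
        "Return a single SQL statement."
    )
    # Whatever comes back should be validated and sandboxed before it touches production data.
    return call_llm(prompt)
```

The grounding step is what separates context-rich answers from a model guessing at column names.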
The most exciting thing I see coming around the corner is the ability to talk to data tools and build pipelines in a conversational manner—just describe what I want it to build without having to drag and drop anything. I talk to ChatGPT every day, and it’s amazing how many things it can do. I want to build pipelines the same way.
This is where everything is headed. We already know that at least 50% of this is possible. Now the question is, can we get to 100%? I’m excited to reach the next level and figure out how we get to 80% or higher.
In analytics, this approach is different from building an app. Data problems are unique: you’re managing vast volumes of data, everyone wants something different from it, and you have to move fast. You don’t always need 100% accuracy; you just need to know enough to make a decision. If I’m a business leader, getting an answer that’s 80–90% accurate right away, versus waiting, is incredibly valuable because I can act on it immediately.
For example, I could say, “Go to my last two years of sales data, use these two sources, join whatever you need to join, and forecast this for the next 36 months with these assumptions.” If it gives me a forecasting number that’s 80% accurate, that’s valuable. I’m building something and using it right away, without needing to be a data engineer. And if I want to dig deeper, I can look under the hood, see the lineage, and understand exactly what it did.
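Under the hood, that “good enough to act on” answer could be as simple as a trend extrapolation over monthly revenue. The sketch below is a minimal version of the idea, assuming a pandas DataFrame with month and revenue columns; a production pipeline would layer in seasonality, the stated assumptions, and confidence intervals.

```python
import numpy as np
import pandas as pd

def naive_forecast(sales: pd.DataFrame, horizon_months: int = 36) -> pd.Series:
    """Fit a straight-line trend to monthly revenue and extend it forward.

    This is the rough first pass a business leader can act on immediately,
    not a replacement for a proper forecasting model.
    """
    # Roll daily or transactional rows up to calendar months.
    monthly = sales.set_index("month")["revenue"].resample("MS").sum()

    # Fit a simple linear trend to the historical months.
    x = np.arange(len(monthly))
    slope, intercept = np.polyfit(x, monthly.to_numpy(), deg=1)

    # Extend the trend over the requested horizon.
    future_x = np.arange(len(monthly), len(monthly) + horizon_months)
    future_index = pd.date_range(
        monthly.index[-1] + pd.offsets.MonthBegin(1),
        periods=horizon_months,
        freq="MS",
    )
    return pd.Series(slope * future_x + intercept, index=future_index, name="forecast")
```

The lineage point above still matters: whoever consumes the number should be able to look under the hood and see exactly this kind of logic.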
Looking ahead, what excites me most is the rise of conversational analytics. It has a real opportunity to change how we’ve thought about reports, visualizations, and dashboards over the past decade or more. We’re finding opportunities where analysts no longer have to anticipate the five or ten right questions, or spend countless hours in requirements meetings to ensure business data is adequately translated before an engineer spends hours building a dashboard with a few filters.
As we learn more about these semantic views and how to weave them into unstructured data, we’re entering a moment when more comprehensive insights can be found by simply talking to the data directly from the platform itself.