This isn’t a predictions list.
Because humans are really bad at predictions. Unless you are Arthur C. Clarke, I’ve found it’s best to leave them alone!
Instead, in this blog, I want to highlight some of the top data trends happening right now. These are all movements that are already well underway and having a profound impact on how enterprises think about and use data in 2022.
Let’s jump in!
This is a broad trend, but an important one.
How many people do you think have spent time looking at Covid-19 data over the past 18 months? Or perhaps discussed the ‘R’ number, the reproduction rate of the virus, around the dinner table?
Probably most!
This is part of a growing trend: understanding data is becoming increasingly important to living in the modern world.
Data is no longer just for nerds; it’s critical for everyone to understand topics that are important to them: from Covid-19 to climate change to the number of Green Triangles you can expect in a box of Quality Street.
And the enterprise is not immune to this proliferation of data. As the understanding of the importance of data grows, the horizon of possibility broadens and resistance to data-centric ways of doing business shrinks.
This changes how enterprises need to think about data. It’s no longer a distant niggle for your CDO to deal with; it’s now the joint responsibility (and opportunity!) of everyone across the business to understand data.
Data is a central part of life and, by extension, must be a central part of enterprise strategy!
Numerous machine learning mini-trends are starting to converge. The effect is that machine learning is starting to become a viable mainstream technology option for many organisations.
We already interact with far more machine learning models than we suspect: think of Netflix’s and Amazon’s sophisticated recommendation engines, or the way mortgages and loans are increasingly approved using machine learning algorithms.
Here are a few of the key mini-trends that are gaining massive popularity in the enterprise data scene:
- MLOps: this is about productionising ML—standardising and streamlining the continuous delivery of high-performing ML models into production
- AutoML: this is the process of automating the time-consuming, iterative tasks in ML development, such as feature engineering, model selection and hyperparameter tuning (see the sketch after this list)
- TinyML / Small Data: this is the capability of running ML directly on embedded devices, such as Alexa devices, without needing to contact a server, increasing performance and privacy
- Low-code ML: user-friendly interfaces that allow non-experts to utilise machine learning technology
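To make the AutoML idea a little more concrete, here is a minimal sketch using scikit-learn’s GridSearchCV to automate hyperparameter search, one of the iterative tasks that AutoML platforms take off your hands. This illustrates the principle rather than any particular AutoML product, and the dataset and parameter grid are just placeholders:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Placeholder dataset; in practice this would be your own training data
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Automate the tedious part: try every combination and keep the best model
param_grid = {"n_estimators": [50, 100], "max_depth": [4, 8, None]}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_)           # best hyperparameters found
print(search.score(X_test, y_test))  # accuracy on held-out data
```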
Together, these mini-trends are making it much easier for enterprises to create high-performing, highly-automated ML models. In turn, this pushes machine learning from a niche concern into a technology that enterprises can deploy across their entire business.
An important caveat is that these are all new trends, with each addressing different aspects of the machine learning lifecycle. They have not all been proven to work at enterprise scale. We will have to wait and see which ones can deliver real-life value at scale and which ones prove to be problematic.
Over the last year or two I have seen a deepening of understanding among our enterprise clients around the benefits of a federated data architecture.
And we are seeing more and more success stories emerge, driving ever more confidence in the data mesh approach.
In particular, I see many companies adopting data mesh principles without necessarily calling it a ‘data mesh’. This is great validation that the principles are solid and that people are not just jumping on a marketing term.
As this process unfolds, the data mesh is moving from theory to practice and is growing in maturity as people adapt it to their needs.
For example, the design and implementation of data products is becoming increasingly sophisticated as more and more people start implementing them in real life. And supporting capabilities, such as data discovery, are being broadly adopted and improved by many organisations.
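To give a flavour of what a data product can look like in practice, here’s a minimal, hypothetical sketch of a self-describing data product descriptor: discoverable, addressable and owned by a domain team. The field names and endpoint are invented purely for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    """A minimal data product descriptor: discoverable, addressable
    and owned by a domain team, in the spirit of data mesh principles."""
    name: str
    owner: str                        # the domain team accountable for it
    schema: dict = field(default_factory=dict)
    endpoint: str = ""                # where consumers fetch the data

orders = DataProduct(
    name="orders",
    owner="sales-domain-team",
    schema={"order_id": "string", "amount": "decimal", "placed_at": "timestamp"},
    endpoint="https://data.example.com/products/orders",  # hypothetical URL
)
print(f"{orders.name} is owned by {orders.owner} and served at {orders.endpoint}")
```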
We will be seeing a lot more federated data architectures—whether they are called data meshes or not—over the next 12 months!
I’ve noticed that there are some very powerful, but under-utilised use cases for analytics out there: streaming and graph analytics, in particular.
Streaming analytics is the continuous processing and analysis of data records in “real-time”, rather than in periodic batches.
This allows us to generate complex, up-to-date insights on the fly, which can then be acted on immediately and automatically.
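Here’s a minimal sketch of that idea in plain Python, standing in for a real streaming engine such as Kafka or Flink: each record is processed as it arrives, a rolling window keeps the insight current, and action is taken immediately. The sensor stream and anomaly rule are made up for illustration:

```python
import random
import time
from collections import deque

def sensor_stream():
    """Simulate an unbounded stream of sensor readings (made-up data)."""
    while True:
        yield random.gauss(20.0, 2.0)

window = deque(maxlen=60)  # rolling window over the last 60 readings

# Runs until interrupted: records are processed one by one as they arrive
for reading in sensor_stream():
    window.append(reading)
    rolling_avg = sum(window) / len(window)
    if reading > rolling_avg + 5:  # act on the insight immediately
        print(f"Anomaly: {reading:.1f} vs rolling average {rolling_avg:.1f}")
    time.sleep(0.1)  # stand-in for waiting on the next record
```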
Graph analytics refers to the process of analysing data in a graph format, using data points as nodes in a network of entities, such as customers, products, devices etc.
This is the best tool we have for representing and querying knowledge that emerges from connected data sources, generating not just a limited set of insights but a rich and interactive web of endless potential insights.
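As a small, self-contained taste of graph analytics, here’s a sketch using the open-source networkx library: customers and products are nodes, purchases are edges, and a simple recommendation falls out of a two-hop traversal. The data is invented for illustration:

```python
import networkx as nx

G = nx.Graph()
# Customers and the products they bought, as edges in one connected graph
purchases = [("alice", "laptop"), ("alice", "mouse"),
             ("bob", "laptop"), ("bob", "keyboard"),
             ("carol", "keyboard"), ("carol", "monitor")]
G.add_edges_from(purchases)

# "Customers who bought something alice bought": a two-hop traversal
similar = {c for p in G["alice"] for c in G[p] if c != "alice"}
print(similar)  # {'bob'}

# Products those customers bought that alice hasn't: a simple recommendation
recs = {p for c in similar for p in G[c]} - set(G["alice"])
print(recs)  # {'keyboard'}
```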
Despite their powerful use cases, these techniques have generally been considered too complex or niche to adopt at scale.
That’s all changing, however. The technology continues to get more intuitive and accessible, helping to lower the barrier to entry. And as the barrier to entry lowers, more and more people are starting to see the substantial business value that results, leading to more uptake and more investment in technological maturation.
Expect to see a lot of graph and streaming fun in 2022!
Traditionally, businesses have created two silos around their data work: data science (analytical data work) and data engineering (operational data work).
As the complexity of the data problems that businesses are trying to solve increases, this distinction is becoming increasingly blurred.
Complex use cases, such as fraud detection, are a mix of both: it requires analysing large datasets offline in order to detect suspicious patterns (analytical work), but also real-time detection of simpler fraud techniques that must be countered immediately (operational work).
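As a toy illustration of how the two halves fit together, the sketch below learns a fraud threshold offline (the analytical side) and then applies it to each live transaction (the operational side). The numbers and the three-sigma rule are purely illustrative:

```python
import statistics

# Analytical (offline): learn what "normal" looks like from historical data
historical_amounts = [12.5, 40.0, 33.2, 18.9, 55.0, 29.4, 61.3, 22.1]
mean = statistics.mean(historical_amounts)
stdev = statistics.stdev(historical_amounts)
threshold = mean + 3 * stdev  # flag anything three std devs above normal

# Operational (online): apply the learned rule to each live transaction
def check_transaction(amount: float) -> bool:
    """Return True if the transaction should be blocked immediately."""
    return amount > threshold

print(check_transaction(45.0))   # False: within the normal range
print(check_transaction(250.0))  # True: suspicious, block in real time
```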
The trend is towards looking at data problems in a holistic, end-to-end fashion, rather than applying siloed thinking to them.
This also has the advantage of generating results that are more outcome-oriented. After all, end users don’t care about the artificial distinction between “analytical” and “operational” approaches; they care about getting their problem fixed!
Although most companies’ data is naturally decentralised, the traditional approach has been to move it all into one location. That can work at a certain scale, but it is difficult and comes with real drawbacks.
But advances in cloud platforms are now allowing for viable hybrid systems—connecting cloud infrastructure with on-premises data and compute resources—that can be accessed via a central control plane.
This means you can have data sources everywhere: on-prem, in the cloud, even with third parties. And you can still use all of that data as one cohesive system.
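The sketch below is a deliberately simplified, hypothetical illustration of that control-plane idea: every source, wherever it lives, is wrapped in the same interface and queried through a single entry point. The class names and stub connectors are invented for illustration:

```python
from typing import Protocol

class DataSource(Protocol):
    """Any source, on-prem, cloud or third party, exposes the same interface."""
    def query(self, sql: str) -> list[dict]: ...

class OnPremWarehouse:
    def query(self, sql: str) -> list[dict]:
        return [{"source": "on-prem warehouse", "rows": []}]  # stub connector

class CloudLake:
    def query(self, sql: str) -> list[dict]:
        return [{"source": "cloud lake", "rows": []}]  # stub connector

class ControlPlane:
    """A single entry point that fans a query out across registered sources."""
    def __init__(self) -> None:
        self.sources: dict[str, DataSource] = {}

    def register(self, name: str, source: DataSource) -> None:
        self.sources[name] = source

    def query_all(self, sql: str) -> list[dict]:
        return [row for s in self.sources.values() for row in s.query(sql)]

plane = ControlPlane()
plane.register("warehouse", OnPremWarehouse())
plane.register("lake", CloudLake())
print(plane.query_all("SELECT count(*) FROM orders"))
```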
In this context, distributed architectures like the data mesh can be more easily applied as the data is no longer viewed through an on-prem/off-prem siloed mindset.
This trend is going to be huge, in particular, for massive data-centric enterprises such as financial or healthcare organisations. They have mountains of incredibly valuable decentralised data but have not, due to the limitations of their data platforms, had an effective way of getting a handle on it.
High-quality data sharing is becoming a necessity in many industries—energy, healthcare, life sciences, genomics—not only within the organisation, but between them.
In the energy industry, for example, the physical energy grid (and all the data required to run it) is being linked up across the world. So, to remain competitive, energy companies are forced to expand their data capabilities accordingly.
At the same time, this kind of high-quality data sharing is precisely the foundation, the lifeblood, of modern data architectures.
The decentralised but federated data mesh, for example, is designed to address the problems of data sharing at massive scale—and does so very well!
Enterprises in these industries are finding that, in order to remain competitive in the market, they need all the things that modern data architectures give them: interoperability, federated data governance, efficient data change management, data discovery, and so on.
This trend is forcing the hand of enterprises everywhere, driving them towards the adoption of modern data architectures as a prerequisite for competitiveness.
Whether they like it or not!
These trends are well underway in the field, with enterprises having to respond in order to remain competitive.
Some are subtle, but will nevertheless shape how enterprises understand and work with data for the foreseeable future.
We have helped many of the world’s largest enterprise organisations to revamp their data approach in response to competitive threats.
For more insights, check out our ebook - The Data Mesh: How We Will Finally Unleash the Promised Value of Data in the Enterprise.
Interested in seeing our latest blogs as soon as they get released? Sign up for our newsletter using the form below, and also follow us on LinkedIn.