18 Nov

Back to the Future: Data Mesh Beyond Analytics

Tareq Abedrabbo

The arrival of the data mesh paradigm has brought to light the challenges faced by traditional approaches to analytical data use cases.

For example, many organisations struggle with the poor scalability of monolithic data platforms, both technical and organisational. These platforms are often implemented as technology-first exercises that end up creating bottlenecks at multiple levels, compromising delivery and data accessibility.

Ultimately, they fail to deliver the anticipated value for many of today’s data-driven organisations.

By contrast, data mesh introduces a decentralised, data-centric approach that addresses these challenges at the core.

Great.

BUT.

The canonical concept of data mesh focuses purely on analytical use cases.

And by focusing only on analytics, we miss the opportunity to tackle some deeper issues that are connected to analytics.

Because here’s the thing: analytical use cases do not exist in isolation.

If we take these use cases out of their broader context (data and software) we could be missing out on solving the real problems that hamstring scalable data utilisation in many other areas of the business.

In this blog, I want to make an argument for broadening the use case horizon of data mesh.

Why do we need to broaden the definition of data mesh?

There are a few key reasons:

  • The problems are deeply rooted: the challenges that hamper analytics in large organisations go deep. They often arise far upstream of the areas they impact. For example, unreliable data sources can degrade the quality of the predictions of an ML model many layers further along the chain.

  • The processes are interconnected: the outputs of analytics processes have a broad spectrum of applications downstream. Some of these use cases are themselves analytical, but some are not. For example, a mobile application could use predictions generated by an ML model to update the user interface in real time. From an end-user perspective, the quality of the answer (how fast and how accurate it is) depends on the whole chain.

  • The use cases are not easily siloed in practice: many major use cases do not align neatly with the analytical/operational dichotomy. For example, fraud detection can have an analytical component (pattern detection on a large data set) and a real-time one (pattern detection on incoming events as they arrive). Other use cases, such as financial reporting, can rely on “operational” and “analytical” data sources at the same time.
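To make the fraud detection example a little more concrete, here is a minimal Python sketch in which the same detection logic serves both an "analytical" scan over historical data and an "operational" check on incoming events. Everything in it is an assumption made for illustration: the `Payment` type, the function names and the simple amount threshold, which merely stands in for real pattern detection.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Payment:
    """Hypothetical payment record, invented for this sketch."""
    account: str
    amount: float


def is_suspicious(payment: Payment, threshold: float = 1000.0) -> bool:
    # Stand-in rule: a real system would use far richer pattern detection.
    return payment.amount > threshold


def scan_history(payments: Iterable[Payment]) -> List[Payment]:
    """Analytical side: evaluate a whole historical data set in one pass."""
    return [p for p in payments if is_suspicious(p)]


def watch_stream(events: Iterable[Payment]) -> Iterator[Payment]:
    """Operational side: flag each event as it arrives."""
    for event in events:
        if is_suspicious(event):
            yield event


history = [Payment("acc-1", 50.0), Payment("acc-2", 2500.0)]
flagged_in_batch = scan_history(history)
flagged_in_stream = list(watch_stream(iter(history)))
```

Both paths depend on the same rule and the same kind of data, which is why treating them as two separate worlds tends to duplicate effort.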

We absolutely need to avoid turning analytics into its own silo (even if it is a nicer one!).

Instead, we have to consider the broader data and software picture to be able to design and implement far-reaching and durable solutions that will cover both ends of the data spectrum: operational and analytical, and everything in between.

The secret sauce here is the data itself: fundamentally, what all data-centric use cases have in common is the need to access reliable, trustworthy and timely data adapted to their specific requirements. And that is what needs to be brought within the purview of data mesh.

Data Mesh as a Set of Principles to Scale Data Utilisation

Through this lens, data mesh can be seen as a cohesive umbrella and a set of guiding principles for a data-centric transformation:

  • Embrace data decentralisation: data is naturally decentralised, and decentralisation is a potent enabler for autonomy, scale and therefore innovation. It’s a feature, not a bug!

  • Align data ownership with business domains: the ownership of the data, which includes the responsibility to make it available for others, needs to align with meaningful business domains, rather than with historical silos.

  • Federate governance: this brings the various strands together to ensure compliance, security and cohesion are achieved across the whole organisation.

  • Shared norms, standards and processes: these enable interoperability between data producers and consumers, and more broadly enable teams to collaborate on large-scale data problems.

  • Help humans access data: make data a first-class construct, or a product if you wish. Human-centric data discovery and accessibility are the key ingredients of a scalable and consumer-centric data strategy. Those who need the data must be able to find it and access it with minimal overhead and technicalities.

  • Self-serve infrastructure: everything needs to be underpinned by a solid, self-serve approach to infrastructure. You will not get far if infrastructure is a bottleneck.

These principles can be applied to any data source or type of use case. None of them is exclusive to analytical data. For example, you should be able to capture a “real time” data stream in a data catalogue, just as you would a data set sitting in a relational database or a lakehouse.
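As a rough illustration of that last point, the Python sketch below registers a stream, a relational table and a lakehouse table in the same toy catalogue, each with an owning domain. The class names, fields, entry names and locations are all hypothetical, chosen only for this example; a real data catalogue would carry much more metadata.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class SourceKind(Enum):
    STREAM = "stream"
    RELATIONAL_TABLE = "relational_table"
    LAKEHOUSE_TABLE = "lakehouse_table"


@dataclass(frozen=True)
class CatalogueEntry:
    """Hypothetical catalogue record: the same shape for every kind of source."""
    name: str
    owner_domain: str
    kind: SourceKind
    location: str  # illustrative URIs only
    description: str = ""


class DataCatalogue:
    def __init__(self) -> None:
        self._entries: Dict[str, CatalogueEntry] = {}

    def register(self, entry: CatalogueEntry) -> None:
        if entry.name in self._entries:
            raise ValueError(f"{entry.name} is already registered")
        self._entries[entry.name] = entry

    def search(self, text: str) -> List[str]:
        """Return the names of entries whose name or description mentions the text."""
        needle = text.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if needle in e.name.lower() or needle in e.description.lower()
        )


catalogue = DataCatalogue()
catalogue.register(CatalogueEntry(
    "payments.card_events", "payments", SourceKind.STREAM,
    "kafka://example/payments.card_events",
    "Card payment events as they happen"))
catalogue.register(CatalogueEntry(
    "payments.settled_transactions", "payments", SourceKind.RELATIONAL_TABLE,
    "postgres://example/payments/settled_transactions",
    "Settled card payment transactions"))
catalogue.register(CatalogueEntry(
    "finance.monthly_ledger", "finance", SourceKind.LAKEHOUSE_TABLE,
    "s3://example/finance/monthly_ledger",
    "Monthly ledger for financial reporting"))

matches = catalogue.search("card payment")
```

A consumer searching for card payment data finds the stream and the table side by side, without needing to know in advance which side of the analytical/operational divide each one lives on.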

These principles are not even unique to data. We can see data mesh in its historical context: as a natural step forward that brings to data the principles and ways of working that have proven successful in other areas of IT.

To (over)simplify a bit, we could think of data mesh as the culmination of a number of successive and overlapping movements that share its ethos:

  • Agile: emphasises delivering meaningful outcomes (or products) iteratively and empowering teams to operate more autonomously.

  • DevOps: advocates collaboration between two traditionally separate parts of the business: software development and operations. It also enabled key practices such as CI/CD and automated infrastructure to thrive. This further enhanced the ability of teams to operate effectively at scale.

  • Microservices: these naturally build on the previous advances and focus on aligning the boundaries of teams and systems with the business domains. Technically, microservices decompose applications into distributed components that can be developed and operated autonomously but collaboratively.

Data is the really hard bit and has been the missing piece of the jigsaw puzzle.

It is great to see that technical and conceptual maturity is catching up, allowing us to put data where it belongs: at the centre of any meaningful evolution or transformation.

Conclusion 

As we saw, data use cases fall across a broad spectrum between the analytical and the operational, but they all share the need to access usable data. The issues affecting these use cases are common and far-reaching.

Data mesh is based on a set of powerful principles that will allow data-centric organisations to come to grips with the real problems that are impacting their ability to innovate at scale. Dealing with these problems will yield more meaningful benefits to the full spectrum of data use cases.

These principles have not emerged in a vacuum. Rather, they are the natural extension of previous movements that dealt with similar challenges in other areas of IT: infrastructure, applications, teams and products. The data itself has taken something of a back seat until now. We now have the tools and the understanding to bring data back to the centre and to move into the future by revisiting the past!



