What makes the data mesh such a powerful concept is the principle of federated data governance.
The big shift that the data mesh enables is being able to decentralise data, organising it instead along domain-driven lines, with each domain owning its own data that it treats as a product that is consumed by the rest of the organisation.
The process of decentralising, democratising and productising data is a quantum leap in enterprise data architecture that opens the door to massive experimentation and innovation.
But you can’t just decentralise everything and wait for innovation to occur; without some co-ordination, there would be chaos.
The secret sauce is using a federated approach to strike a balance between decentralised data sources (which enable innovation at scale) and centralised data governance (which provides the basis for consistency and collaboration across the organisation).
Federated data governance in a data mesh describes a situation in which data governance standards are defined centrally, but local domain teams have the autonomy and resources to execute these standards in whatever way is most appropriate for their particular environment.
In this model, autonomous data domain teams and centralised data governance functions collaborate in order to best meet the data needs of the whole organisation.
In this way, teams can “shift left” the implementation of data governance policies and requirements in order to embed them into their data products early in the development lifecycle.
What might this look like in a data mesh?
The data is decentralised, with each domain taking ownership of its own data from end-to-end. This means that each team can scale their own processes without impacting other teams and domains.
Consumers, however, are likely to require data from multiple domains, so domain data needs a very high degree of interoperability, allowing consumers to easily incorporate a variety of datasets from across the business.
So each domain, in order to be part of the mesh, must follow a set of centrally-managed guidelines and standards that determine how their domain data will be categorised, managed, discovered and accessed. This covers things like data contracts, schemas and so on.
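As a rough illustration of what such a centrally-managed standard might look like, here is a hypothetical sketch of a data contract in Python. The `DataContract` class, its fields, and the `validate_record` helper are all invented names for illustration; a real mesh would likely use a dedicated contract format or tooling rather than hand-rolled code.

```python
from dataclasses import dataclass

# Hypothetical sketch: a centrally-defined data contract that a domain's
# data product must satisfy before it can join the mesh.
@dataclass(frozen=True)
class DataContract:
    domain: str                # owning domain, e.g. "orders"
    product: str               # name of the data product
    schema: dict[str, str]     # column name -> expected type name
    freshness_hours: int       # maximum allowed staleness
    classification: str        # e.g. "public", "internal", "pii"

def validate_record(contract: DataContract, record: dict) -> list[str]:
    """Return a list of contract violations for a single record."""
    errors = []
    for column, type_name in contract.schema.items():
        if column not in record:
            errors.append(f"missing column: {column}")
        elif type(record[column]).__name__ != type_name:
            errors.append(f"wrong type for {column}: expected {type_name}")
    return errors

contract = DataContract(
    domain="orders",
    product="daily_orders",
    schema={"order_id": "str", "amount": "float"},
    freshness_hours=24,
    classification="internal",
)
# Flags the string-valued amount as a type violation.
print(validate_record(contract, {"order_id": "A1", "amount": "12.5"}))
```

The point of the contract is that it is defined once, centrally, and every domain validates against it; how each domain produces conforming records is entirely up to that domain.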
This also includes a shared data infrastructure layer that domains can draw on to build their own pipelines from pre-approved templates that ensure security and compliance (and avoid the duplication of each building their own infrastructure from scratch).
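The shared infrastructure layer could be sketched as follows. This is a minimal, hypothetical example, assuming a platform team that publishes pre-approved pipeline templates; `PipelineTemplate` and its fields are invented for illustration, and a real platform would wire in actual encryption and audit logging rather than just recording the settings.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch: the central platform team publishes pre-approved
# pipeline templates; domains instantiate them with their own transform
# logic and inherit the compliance defaults.
@dataclass
class PipelineTemplate:
    name: str
    encrypt_at_rest: bool = True   # compliance default baked into the template
    audit_logging: bool = True

    def instantiate(self, domain: str,
                    transform: Callable[[list[dict]], list[dict]]):
        def pipeline(records: list[dict]) -> list[dict]:
            # A real platform would apply encryption and audit logging here;
            # this sketch only applies the domain's own transformation.
            return transform(records)
        pipeline.metadata = {
            "template": self.name,
            "domain": domain,
            "encrypt_at_rest": self.encrypt_at_rest,
            "audit_logging": self.audit_logging,
        }
        return pipeline

batch_template = PipelineTemplate(name="batch-v1")
orders_pipeline = batch_template.instantiate(
    "orders", lambda rows: [r for r in rows if r["amount"] > 0]
)
print(orders_pipeline.metadata["encrypt_at_rest"])  # inherited from the template
```

The design choice here is that the compliance settings live in the template, not in the domain's code, so every pipeline built from it is secure and auditable by default.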
This is where the centralised governance comes in, establishing data management practices and processes that ensure that the data provided by each domain is of the highest quality, from a consumer perspective.
There are a few key reasons why data federation is so impactful.
The main benefit is that domains can operate with a high degree of autonomy.
They know their own domain far better than anyone else and are best placed to decide exactly how they should manage their data and how they can best scale.
This level of independence also ensures a high degree of accountability because a single team follows a given data product from production to consumption.
The result is high-quality data products that can be produced in a scalable and resilient fashion by teams that know their own domain intimately and are responsible for end-to-end delivery.
However, the data products that domains produce still need to be usable by the consumer.
There must be a minimum degree of interdependence between domains, which is why having centrally-governed standards is so critical.
Issues that affect all domains need to be subject to a wider authority—perhaps even a team of domain product owners—to ensure that domains are consistent in how they handle and process data.
In a data mesh, data is viewed as a product, so we can draw inspiration from how product development is done in large organisations: ideally, there are certain centrally-governed development guardrails that are baked into architecture and how people work, within which developers are free to innovate as they wish.
The data mesh can be set up similarly, with a team of experts responsible for curating and providing the interoperability ‘guardrails’ within which domains can operate however they see fit.
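One way to picture these guardrails is as a set of centrally-curated checks that run when a domain publishes a product to the mesh. The sketch below is purely illustrative and assumes invented names (`GUARDRAILS`, `publish_to_mesh`); the domain is free to build its product however it likes, but registration fails unless every check passes.

```python
# Hypothetical sketch: interoperability guardrails enforced at the point
# where a domain registers its data product on the mesh.
GUARDRAILS = [
    ("has_owner", lambda p: bool(p.get("owner"))),
    ("has_schema", lambda p: bool(p.get("schema"))),
    ("classified", lambda p: p.get("classification") in {"public", "internal", "pii"}),
]

def publish_to_mesh(product: dict) -> list[str]:
    """Return the names of any guardrails the product fails."""
    return [name for name, check in GUARDRAILS if not check(product)]

product = {"owner": "orders-team", "schema": {"order_id": "string"}}
print(publish_to_mesh(product))  # fails only the classification guardrail
```

The team of experts curates the list of checks; the domains never need to know how other domains satisfy them.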
When domains are functioning in ways that are both independent and interoperable, it is possible to govern data with great effectiveness, wherever it is in an organisation.
Domains take care of the local processes and concerns, with a central team ensuring minimum standards for consistency and accessibility.
Data that is effectively governed in this way is a delight for consumers. They can get on with their work knowing that high-quality, highly-discoverable data is on tap and can be plugged into their projects when needed.
Running around different teams trying to find out whether a particular dataset exists, or whether it can be transformed to meet your needs, becomes a thing of the past.
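That discoverability is usually delivered through a central catalogue. The following is a minimal sketch under assumed names (`register_product`, `find_products`): domains register their products with metadata, and consumers search across every domain in one place, instead of asking around.

```python
# Hypothetical sketch of a central data catalogue: domains register their
# products; consumers discover them by tag, regardless of which domain
# produced them.
catalog: dict[str, dict] = {}

def register_product(domain: str, product: str, tags: set[str]) -> None:
    catalog[f"{domain}.{product}"] = {"domain": domain, "tags": tags}

def find_products(tag: str) -> list[str]:
    return sorted(key for key, meta in catalog.items() if tag in meta["tags"])

register_product("orders", "daily_orders", {"sales", "daily"})
register_product("marketing", "campaign_spend", {"daily", "finance"})
print(find_products("daily"))  # both daily products, from two different domains
```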
When you have a mesh of independent but interoperable nodes that can be effectively governed and are easy to consume, you have a foundational pattern that can then be scaled massively across the organisation. Not only this, but each node can scale at its own pace, depending on its level of maturity.
The federated data mesh, once set up properly, is highly scalable, which is a massive advantage of this approach.
A federated data mesh model requires a high degree of data maturity in an organisation, as it represents a far more free-flowing way for domains to interact with each other and with the data itself than more top-down, centralised approaches allow.
But the main challenges around federation of data are not technical. The real challenge lies in federating a data mesh culture and mindset: the ways of working and thinking that must underpin this shift in how we handle data.
Your organisation will have to be comfortable with federating not only its technology but its trust.
A mindset shift is required to ensure that each domain has the skills, infrastructure and controls in place to allow it to act autonomously, within the guardrails of inter-domain interoperability.
There are too many domains, however, to manage them all individually (and this would also defeat the purpose of decentralisation!). These domains, then, need to be trusted to get on with the job however they see fit, which some organisations that are used to more centralised control may find unsettling initially.
When each domain is entrusted with its particular piece of the data puzzle, that domain also takes on a huge amount of responsibility.
Organisations must make clear that the new ways of working are in place to make life easier for everybody and for the common good of the organisation.
For the data mesh to succeed, people—whether they are data producers or consumers—need to contribute actively to their corner of the data mesh.
Imagine that every domain had complete autonomy to manage their own data as they wish, with absolutely no consideration for cross-domain consistency or co-ordination. There would be carnage.
Similarly, if domains were completely reliant on a centralised data function to manage and make data available then that would become a major bottleneck and innovation would grind to a halt.
The challenge is to find the right balance for your particular organisation between allowing domains to evolve and scale their own data at their own pace while ensuring the data products that result are consistent with other domains.
Critically, this balance will change over time as the organisation matures and so must be constantly adjusted.
Data governance federation is the secret sauce that makes the data mesh possible: it enables highly autonomous, local work within interoperability guardrails that allow a high degree of collaboration between all the local teams.
This combination of local excellence and inter-domain collaboration creates a massive web of high-quality data products that all corners of the business can draw on to enhance existing services or foster innovation.