8 Mar

Data Mesh 101: Enabling Security and Compliance at Speed and Scale with Federated Data Governance

Tareq Abedrabbo

Highly-regulated enterprises need to ensure that their data is as secure and compliant as a guinea pig in a straitjacket.

If you don't get this right, your CDO will be fined and your products will either get hacked or never make it to production.

But getting this done is a migraine-inducing task in most organisations.

You see, making your data secure and compliant is the remit of data governance.

Traditional data governance is monolithic and consists of two parts.

Firstly, defining the security and compliance policies, which is the remit of the business.

Secondly, deriving a concrete implementation for each policy and executing that on existing datasets, which is the remit of IT.

You don’t have to be a data scientist to see that there is a split between the remit and motivations of the people who define the policies (business boffins who need to keep everything stable and secure) and those who implement the policies (IT people trying to enforce them using big systems and manual processes).

And spare a thought for the poor product developers caught in between, who still need to deliver to deadlines!

The business folk define what they need to happen, but say nothing about how it should happen (because, in truth, they don’t know how). The IT folks put together some high-level blueprints for how to implement, but completely overlook the local context of each team.

And then the product devs are left with a big PDF of data security and compliance requirements and no way of determining which ones are relevant to them.

Even though it’s not ideal, many businesses have been structured this way for a long time: with a strong separation between IT, the business and developers. It’s the default modus operandi. And then new product teams are just grafted on top, rather than integrated fully.

But it’s business-critical to overcome this separation: to create innovative new products or make data-driven decisions, you need secure, compliant, high-quality and trustworthy data. It’s the cornerstone of business competitiveness in the modern era!

And the problem only gets worse when you start to adopt the latest innovation practices like agile, continuous delivery, the cloud and so on, which only increase the speed and complexity of development.

But what if your goal is to create data-centric digital products at great scale?

Then what do you do?

This is where the data mesh comes in: it can bridge the gap between security and compliance policies and their implementation.

Using the Data Mesh to Make Data Secure and Compliant at Speed and Scale

The data mesh is a federated approach to data (and data governance).

The big shift that the data mesh enables is being able to decentralise data, organising it instead along domain-driven lines, with each domain owning its own data that it treats as a product that is consumed by the rest of the organisation.

The ‘federated’ aspect of the data mesh means that governance policies are still defined centrally, but they are implemented by each domain team, which is responsible for making ALL the data that is produced in its domain discoverable to other domains.

Best of all, there are clear boundaries for who is responsible for what when it comes to data governance.

But how does this help security and compliance to be adequately implemented?

The data mesh enables three things that strengthen and reinforce each other, like the corners of a triangle.

1) Data as a product mindset:

Making data serve the consumer as a priority.

2) Computational governance:

Scalable, effective and efficient self-serve infrastructure allows domain teams to codify policies into their infrastructure and data.

3) Shifting left security and compliance governance:

With data governance boundaries clearly drawn and baked into processes, product teams can address security and compliance concerns earlier on.

Now let's dive into each of these separately.

Firstly, data-as-a-product.

The core shift in the data mesh is that data is treated as a product and becomes consumer-centric, rather than producer-centric as in the traditional model.

Data is prepared in a way that means that it's as easy as possible for the consumer to get what they need from the data. So when the consumers come to use the data for their apps, they can see the classifications, tagging, labelling, metadata etc. and discover their security and compliance implications and obligations upfront (e.g. does this dataset contain PII?).

The first priority for domain teams is to make the data that they are responsible for consumable and discoverable for the rest of the business. The central data governance function defines the attributes and labelling of data, which all domains must implement as appropriate for their own data.
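
To make this a little more concrete, here is a minimal sketch in Python of what a data product's published metadata might look like. Everything in it is an assumption made for illustration: the `Classification` levels, the `DataProduct` and `Column` types and the `orders` example are hypothetical and do not come from any particular data mesh platform or catalogue. The point is only that a consumer can read off their obligations before writing any code.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import FrozenSet, List, Tuple


class Classification(Enum):
    """Hypothetical classification levels defined by central governance."""
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"


@dataclass(frozen=True)
class Column:
    name: str
    contains_pii: bool = False


@dataclass(frozen=True)
class DataProduct:
    """Metadata a domain team publishes alongside its data (illustrative only)."""
    name: str
    owner_domain: str
    classification: Classification
    columns: Tuple[Column, ...]
    tags: FrozenSet[str] = field(default_factory=frozenset)

    def pii_columns(self) -> List[str]:
        """Names of the columns the producing domain has labelled as PII."""
        return [c.name for c in self.columns if c.contains_pii]

    def obligations(self) -> List[str]:
        """Obligations a consumer can read before writing any code."""
        notes = []
        if self.pii_columns():
            notes.append("contains PII: " + ", ".join(self.pii_columns()))
        if self.classification is Classification.CONFIDENTIAL:
            notes.append("confidential: access must be approved by " + self.owner_domain)
        return notes


# A made-up data product owned by a made-up "sales" domain.
orders = DataProduct(
    name="orders",
    owner_domain="sales",
    classification=Classification.CONFIDENTIAL,
    columns=(Column("order_id"), Column("customer_email", contains_pii=True)),
    tags=frozenset({"gdpr"}),
)

for note in orders.obligations():
    print(note)
```

In a setup like this, the central governance function would own the vocabulary (the classification levels and what counts as PII), while each domain fills in the values for its own data.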

Secondly, the data mesh enables computational governance at scale.

Computational governance means turning policies into code using infrastructure-as-code and automation.

Once the policies have been centrally defined, each business domain is then tasked with turning those into concrete implementations using code.

Data management becomes about product teams codifying the various security and compliance policies—such as enforcing encryption and applying the correct labelling—to manage data at the level of their platforms, infrastructure and even the data products themselves directly.
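
As a rough sketch of what codifying a policy can mean, the Python below expresses two centrally defined rules as small check functions and runs them against a domain team's resource declarations. The policy names, the `Policy` type and the resource dictionaries are all hypothetical. In a real setup these checks would more likely live in an infrastructure-as-code or policy tool than in a standalone script, so treat this as an illustration of the idea rather than a recommended implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Policy:
    """A centrally defined rule expressed as a check over a resource's config."""
    name: str
    check: Callable[[Dict[str, object]], bool]


# Hypothetical central policies: encryption at rest and a mandatory classification label.
CENTRAL_POLICIES = [
    Policy("encryption-at-rest", lambda r: r.get("encrypted") is True),
    Policy("classification-label", lambda r: bool(r.get("classification"))),
]


def violations(resource: Dict[str, object], policies: List[Policy]) -> List[str]:
    """Return the names of the policies this resource fails, in policy order."""
    return [p.name for p in policies if not p.check(resource)]


# A domain team's (made-up) declarations of two storage resources.
compliant = {"name": "orders-store", "encrypted": True, "classification": "confidential"}
non_compliant = {"name": "scratch-store", "encrypted": False}

print(violations(compliant, CENTRAL_POLICIES))      # []
print(violations(non_compliant, CENTRAL_POLICIES))  # ['encryption-at-rest', 'classification-label']
```

The split mirrors the federated model: the list of policies is defined once, centrally, while each domain runs the checks against its own resources and fixes what fails.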

In my experience, many companies might not have enough skilled people to do this in every domain, but they do have a core data team that they can seed into domain teams as an enablement function. This helps to spread the necessary skills to the rest of the organisation, similar to the way cloud is adopted.

This is theoretically possible without a data mesh, but the problem is that lines of responsibility are very unclear so computational governance cannot be effectively scaled. Under a data mesh paradigm, each domain is responsible for its data from end-to-end, which helps to scale governance efforts.

Thirdly, shifting left security and compliance governance in the software development lifecycle.

We have seen that in a data mesh, firstly, the data is made discoverable with all relevant data governance policies and, secondly, these policies can then be automated within the team’s domain.

This means that when a digital product team is trying to build a new product they can ‘shift left’ the security and compliance governance: instead of using non-secure data and trying to shoehorn in the compliance requirements once the product is done, they can include this aspect before they start serious coding.
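
One way to picture shifting left is as a check that runs from the very first commit. The Python sketch below assumes a hypothetical catalogue that lists the controls each data product requires of its consumers, and compares them with the controls a new product has declared. The catalogue contents, the control names and the `missing_controls` helper are invented for illustration.

```python
from typing import Dict, List, Set

# Hypothetical catalogue: controls each data product demands of its consumers.
CATALOGUE: Dict[str, Set[str]] = {
    "orders": {"pii-masking", "access-logging"},
    "product-views": set(),
}


def missing_controls(consumed: List[str], implemented: Set[str]) -> Dict[str, Set[str]]:
    """Map each consumed data product to the required controls the new product lacks."""
    gaps: Dict[str, Set[str]] = {}
    for name in consumed:
        missing = CATALOGUE[name] - implemented
        if missing:
            gaps[name] = missing
    return gaps


def test_recommendations_app_meets_obligations() -> None:
    """A pytest-style check intended to run in CI before any serious coding starts."""
    gaps = missing_controls(["orders", "product-views"], {"pii-masking", "access-logging"})
    assert gaps == {}


# A team that has only planned for PII masking finds the gap on day one.
print(missing_controls(["orders"], {"pii-masking"}))  # {'orders': {'access-logging'}}
```

Because the obligations travel with the data product, the team finds the gap at design time instead of in a review at the end.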

With the time and effort saved, there may even be space to make threat modelling sessions a normal team activity, rather than something done by distant experts at the end of the development process.

These things become part of the development lifecycle, rather than an afterthought.

This is how things would work with one product. The fact that the data mesh is federated is what allows this process to scale up across the business to integrate more products and domains. I’ve written more about why this is so here: Why Federated Data Governance Is the Secret Sauce of Data Innovation.

In summary: data as a product means that data is consumable with all the relevant security and compliance information. These controls can then be baked into data management and infrastructure using automation. And then developers can shift security and compliance requirements left in the development lifecycle so they are a core consideration and not an afterthought.


Final Thoughts

The process of decentralising, democratising and productising data is a quantum leap in enterprise data architecture that opens the door to massive experimentation and innovation.

It means that enterprises can build AI-enabled apps that are secure by design, which is an absolute must in the highly-regulated context they operate in.

But you need all three corners to make a sturdy data governance triangle! And to keep your data guinea pig safe.

If you want to be competitive, you need to sort your data constraints, and that's where Mesh-AI can help. Identify the areas in your organisation that require the most attention and solve your most crucial data bottlenecks. Download our Data Maturity Assessment & Strategy Accelerator eBook.
