So, what is it that makes Data Fabric different?
A good strategy for a data professional, it could be said, is to pursue the activities that people find the most boring and try to make them less so, or to take them off their plate altogether. When a young entrepreneur sets out to build, let’s say, a restaurant business, I’m pretty certain they’re not thinking ‘wow, imagine all of that data I’ll get to manage, I can’t wait until we define a data governance framework’. Of course they aren’t, and nor should they be, but the best leaders recognise the importance of such things and understand their potential to deliver competitive advantage.
So… when businesses start thinking about the potential of Artificial Intelligence and Machine Learning, we can all be fairly sure that the first thought that jumps into their heads is not ‘wow, imagine what exciting things I could achieve if I applied machine learning techniques to my metadata.’
But this is exactly how Data Fabric promises to deliver value: by learning from the way people interact with data, we create an ecosystem that is adaptive and self-healing, allowing us to focus on the data products and pipelines that will deliver the most value to the business rather than constantly attending to the ‘plumbing’.
So what do we mean when we say Data Fabric? Well, there are multiple definitions out there but, for me, there are three key elements:
- It is a logical architecture, not a software product
- It is an integrated data layer from which data can be discovered and utilised by people and applications alike
- It uses advanced analytics and AI/ML techniques to continuously analyse the available metadata and make recommendations on how the data landscape should evolve
Let’s explore this last element in a bit more depth... Data Fabric places at its core a number of concepts that even the most sceptical of data professionals would struggle to contest:
- The data needs of an organisation change more frequently than traditional approaches to data solution design can keep up with.
- People articulate their data requirements far more effectively through the things they do than through the things they say.
- Organisations constantly waste time and effort integrating and preparing data that is barely used, often at the expense of the data that is actually needed.
- However hard we try to centralise data into warehouses, lakes and the like, we always end up with data in multiple places, sometimes spread across multiple cloud and on-premises solutions.
By performing continuous analysis of the actual interactions we and our business systems have with our data, the Fabric is, at least in theory, able to predict and recommend, or even automate, modifications to our data pipelines that deliver what we need, rather than what we say we want.
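To make that idea concrete, here is a minimal, entirely illustrative sketch of the kind of metadata analysis a fabric might perform. Everything in it is invented for the example: the dataset names, the access-log format and the simple usage-versus-refresh heuristic. A real fabric would mine query logs, catalogue lineage and pipeline run history at scale, but the principle is the same.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical access-log records: (dataset, user, timestamp). In a real
# fabric these would be harvested from query logs and catalogue lineage,
# not hard-coded.
ACCESS_LOG = [
    ("sales_daily", "ana", datetime(2023, 5, 1)),
    ("sales_daily", "ben", datetime(2023, 5, 2)),
    ("sales_daily", "ana", datetime(2023, 5, 3)),
    ("sales_daily", "cal", datetime(2023, 5, 3)),
    ("legacy_extract", "ops", datetime(2023, 1, 10)),
]

# Hours between scheduled refreshes of each dataset's pipeline.
REFRESH_HOURS = {"sales_daily": 720, "legacy_extract": 4}

def recommend(log, refresh_hours, now, window_days=90):
    """Compare how often each dataset is actually read with how often its
    pipeline refreshes it, and flag where the effort looks misplaced."""
    cutoff = now - timedelta(days=window_days)
    reads = Counter(ds for ds, _, ts in log if ts >= cutoff)
    for ds, hours in refresh_hours.items():
        refreshes = (window_days * 24) / hours
        if reads[ds] == 0:
            yield f"{ds}: no reads in {window_days} days; candidate for retirement"
        elif reads[ds] > refreshes:
            yield (f"{ds}: {reads[ds]} reads but only ~{refreshes:.0f} refreshes; "
                   "consider refreshing more frequently")

for suggestion in recommend(ACCESS_LOG, REFRESH_HOURS, now=datetime(2023, 5, 4)):
    print(suggestion)
```

Even this toy version captures the shift in mindset: the recommendations are driven by what people actually did with the data, not by what anyone said they needed.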
It all sounds a bit too good to be true, right? Well, in some ways, it probably is. At least right now. Even analyst organisations such as Gartner highlight that currently available technology solutions aren’t yet at a level of maturity to deliver against the full promise of the Data Fabric. But that doesn’t mean we shouldn’t take the first steps on our journey.