In terms that most people are familiar with, it refers to a world-wide web of data products, where data products and specific datasets can be reached via web addresses in much the same way as you would retrieve a web page.

Rather than a central data governance & architecture team providing definitions and data quality rules for data that they have little experience with, the idea is that the governance team closest to the business data performs this role, e.g. a team of HR experts should govern HR data, finance experts should govern finance data, and so on. Note: there is still a need for a central architecture team to define standards, policies & guidelines, at the very least to ensure interoperability. This principle is just saying that, within that framework, solution architecture teams close to the business should be free to create their own solutions.
It needs to be packaged up with associated documentation, metadata and controls, in a similar way to how an application is treated.
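As a rough illustration only, here is a minimal sketch of what such a package descriptor might look like in Python; the field names, roles and example values are assumptions rather than any prescribed standard:

```python
from dataclasses import dataclass, field

# A minimal sketch of a data product "package" descriptor.
# All field names and example values are illustrative assumptions,
# not a prescribed data mesh standard.
@dataclass
class DataProductDescriptor:
    name: str                  # business-facing product name
    domain: str                # owning business domain, e.g. "HR" or "Finance"
    owner: str                 # accountable data product owner
    version: str               # versioned and released like an application
    documentation_url: str     # link to the associated documentation
    datasets: list[str] = field(default_factory=list)         # datasets packaged in the product
    access_controls: list[str] = field(default_factory=list)  # e.g. roles allowed to consume it

example = DataProductDescriptor(
    name="customer_orders",
    domain="Sales",
    owner="sales-data-team@example.com",
    version="1.2.0",
    documentation_url="https://wiki.example.com/data-products/customer-orders",
    datasets=["orders", "order_lines"],
    access_controls=["sales-analysts", "finance-analysts"],
)
```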
The data product can be broken down into 3 main architectural components:-
Code is expected to be containerised so that it can be easily deployed and scaled on cloud platforms. Data and metadata should be persisted. Both should use enterprise infrastructure.
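As another hedged sketch (the environment variables, paths and product name are illustrative assumptions, not enterprise standards), a containerised data product's entrypoint might persist its data and metadata to storage locations supplied by the platform:

```python
import json
import os
from datetime import datetime, timezone

# Hypothetical storage locations injected by the enterprise platform at deployment
# time (e.g. mounted volumes or object-store paths) -- assumptions for illustration.
DATA_PATH = os.environ.get("DATA_PRODUCT_DATA_PATH", "data/customer_orders")
METADATA_PATH = os.environ.get("DATA_PRODUCT_METADATA_PATH", "metadata/customer_orders")

def publish(records: list[dict]) -> None:
    """Persist a batch of records and refresh the product's metadata."""
    os.makedirs(DATA_PATH, exist_ok=True)
    os.makedirs(METADATA_PATH, exist_ok=True)

    run_id = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    with open(os.path.join(DATA_PATH, f"{run_id}.json"), "w") as f:
        json.dump(records, f)

    metadata = {
        "name": "customer_orders",   # hypothetical data product name
        "last_updated": run_id,
        "record_count": len(records),
    }
    with open(os.path.join(METADATA_PATH, "latest.json"), "w") as f:
        json.dump(metadata, f, indent=2)

if __name__ == "__main__":
    publish([{"order_id": 1, "amount": 42.0}])
```

In a real deployment, the platform would inject the actual enterprise storage locations when the container is started, so the product's code stays the same across environments.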
The data product should have defined roles & responsibilities associated with it.
The data product should provide sufficient metadata and controls to allow it to be self-served. The capabilities include:-
This is the data mesh concept. Data products should be deployable on any cloud or on-premises platform that is network accessible. Containerisation of data products using technologies such as Docker allows for consistent data product code builds and for the code to be deployed to both on-premises and cloud platforms. Kubernetes allows for cloud-agnostic infrastructure deployment. By defining standard interfaces for the ports/endpoints, data products can be controlled and discovered, and their documentation, metadata and data obtained.
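Purely as one possible illustration, here is a minimal sketch of such standard ports/endpoints using only Python's standard library; the endpoint paths and payloads are assumptions, not a defined standard:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Illustrative payloads only -- a real data product would serve these
# from its persisted metadata and data stores.
METADATA = {"name": "customer_orders", "owner": "sales-data-team@example.com", "version": "1.2.0"}
SAMPLE_DATA = [{"order_id": 1, "amount": 42.0}]

class DataProductHandler(BaseHTTPRequestHandler):
    """Exposes hypothetical standard ports: /metadata, /docs and /data."""

    ROUTES = {
        "/metadata": METADATA,   # discovery & governance information
        "/docs": {"documentation_url": "https://wiki.example.com/data-products/customer-orders"},
        "/data": SAMPLE_DATA,    # the output port itself
    }

    def do_GET(self):
        body = self.ROUTES.get(self.path)
        if body is None:
            self.send_error(404, "Unknown port/endpoint")
            return
        payload = json.dumps(body).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), DataProductHandler).serve_forever()
```

Packaged into a container image, an interface like this could be deployed unchanged to a Kubernetes cluster on-premises or in the cloud.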
In layman’s terms, it’s a method of grouping and packaging up 1 or more datasets for a business purpose. It’s similar to a consumer product concept: the milk is an analogy for a dataset, and the carton containing the milk, which carries the metadata (name, barcode, contents info), is analogous to the data product. It needs to:-
Although Zhamak comes from an API/microservices background, there is no requirement that this is the way a data product should be implemented.
A dataset just refers to a set of data records and can be:-
For more detail around these concepts, please refer to:-