From a data consumer’s perspective, data products:-
When naming and versioning data products, we need to distinguish between the data product, the data product type (e.g. REST API) and the dataset.
Standardised data product names need to be determined. These should relate to a business domain or business subdomain rather than to the datasets that the data product provides or the systems that supply the data.
Data products need to be grouped appropriately by business domain or subdomain.
For example, a data product that belongs to the UK sales department could be grouped (using metadata) under:-
organisation->sales or organisation->uk
You should resist the temptation to name it:-
<organisation>/uk/sales or <organisation>/sales/uk
Instead, just call it uk or sales and provide the grouping as part of the metadata.
The general principle is that once a data product is put into production, strong change control should be enforced so that it is only changed when strictly necessary.
Versioning should follow the concept of breaking and non-breaking changes: a breaking change impacts existing data consumers, whereas a non-breaking change does not.
Breaking changes should result in a major version number change, e.g. 1.x to 2.x.
Non-breaking changes should result in a minor version number change, e.g. 1.1 to 1.2.
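The major/minor rule above can be sketched in a few lines of Python. This is only an illustration: the ProductVersion class and its bump method are hypothetical names, and resetting the minor number to 0 on a breaking change is an assumption that the text does not prescribe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductVersion:
    """A major.minor data product version (hypothetical helper)."""

    major: int
    minor: int

    @classmethod
    def parse(cls, text: str) -> "ProductVersion":
        # Expects exactly two dot-separated integers, e.g. "1.1".
        major, minor = text.split(".")
        return cls(int(major), int(minor))

    def bump(self, breaking: bool) -> "ProductVersion":
        if breaking:
            # Breaking change: increment major.
            # Assumption: minor restarts at 0.
            return ProductVersion(self.major + 1, 0)
        # Non-breaking change: increment minor only.
        return ProductVersion(self.major, self.minor + 1)

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}"


current = ProductVersion.parse("1.1")
print(current.bump(breaking=False))  # non-breaking change
print(current.bump(breaking=True))   # breaking change
```

Making the breaking/non-breaking decision an explicit argument keeps the judgement with the person raising the change, which fits the change-control principle stated earlier.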
Data products should be assigned uniform resource identifiers within the metadata with the following path format:-
https://data.<organisation url>/<business area>/<data product>/<data product version>
So in our PoC example we have a single data product with just one version:-
https://data.acme.com/data_architecture/data_product_poc/0.1
<Business area> can be hierarchical so you could have:-
https://data.acme.com/IT/data_architecture/data_product/0.1
https://data.acme.com/Europe/UK/sales/data_product/0.1
https://data.acme.com/North_America/USA/sales/data_product/0.1
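As a rough sketch of how such identifiers could be assembled from metadata, the Python function below builds a URI in the path format given above. The function name data_product_uri and its parameters are hypothetical. Passing the business area as a list of levels and percent-encoding every path segment are assumptions, made so that a hierarchical area or a name containing a space still yields a valid URI.

```python
from typing import Sequence
from urllib.parse import quote


def data_product_uri(
    organisation_url: str,
    business_area: Sequence[str],
    data_product: str,
    version: str,
) -> str:
    """Build https://data.<organisation url>/<business area>/<data product>/<version>.

    business_area is a sequence so that a hierarchy such as
    ["Europe", "UK", "sales"] maps to one path segment per level.
    """
    segments = [*business_area, data_product, version]
    # Assumption: percent-encode each segment (spaces become %20).
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"https://data.{organisation_url}/{path}"


print(data_product_uri("acme.com", ["data_architecture"], "data_product_poc", "0.1"))
print(data_product_uri("acme.com", ["Europe", "UK", "sales"], "data_product", "0.1"))
```

Generating the URI from the metadata fields, rather than typing it by hand, helps keep the host and path layout consistent across data products.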