A brief history of data products

There’s some hype at the moment about data products. No fewer than 19 of the seminars at the recent Big Data London event covered the topic. It got me thinking: is this a new buzzword, or a different interpretation of something we have been doing for years?

I couldn’t find much online about the origin of the term “data products”. The earliest reference is probably the work of Zhamak Dehghani, who introduced Data Mesh in 2019 and later expanded on it in her book Data Mesh: Delivering Data-Driven Value at Scale. She describes treating data with a product mindset as the second principle of Data Mesh. Essentially, if data is a product, it must have customers, and we should make sure those customers are happy with the products we create for them.

Happy data customers

So how do we make our customers happy? Well, there are 8 commonly agreed characteristics that a data product should meet to achieve that goal. Data must be:

  1. Discoverable – you have to be able to find it, preferably in a catalogue.
  2. Addressable – can you find it in the same place each time you need it? It needs an address (e.g. via an API).
  3. Understandable – you have to be able to understand its meaning and context.
  4. Trustworthy – you need to have confidence that it’s correct – i.e. you don’t have to validate it before use.
  5. Natively Accessible – you need to be able to access it with your tool of choice (e.g. Excel and/or Power BI).
  6. Interoperable – data is so much more valuable when viewed in the context of other data.
  7. Valuable – a data product should also have some standalone value too.
  8. Secure – you should always secure data, using appropriate encryption and access control rules.
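
To make the list a little more concrete, here is a minimal sketch of how the 8 characteristics might be recorded and checked for a single data product. Everything in it is an assumption for illustration: the `DataProduct` class, its field names, the `missing_characteristics` helper and the example values are hypothetical and are not taken from Data Mesh or from any particular catalogue tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataProduct:
    """Hypothetical descriptor: one field per characteristic in the list above."""
    name: str
    catalogue_entry: str = ""            # Discoverable: where it is listed
    address: str = ""                    # Addressable: a stable location, e.g. an API URL
    description: str = ""                # Understandable: meaning and context
    quality_checks_passed: bool = False  # Trustworthy: validated before publishing
    access_formats: List[str] = field(default_factory=list)  # Natively Accessible
    join_keys: List[str] = field(default_factory=list)       # Interoperable
    use_cases: List[str] = field(default_factory=list)       # Valuable
    access_policy: str = ""              # Secure: who may read it, and how


def missing_characteristics(product: DataProduct) -> List[str]:
    """Return the characteristics this product has no evidence for yet."""
    checks = {
        "Discoverable": bool(product.catalogue_entry),
        "Addressable": bool(product.address),
        "Understandable": bool(product.description),
        "Trustworthy": product.quality_checks_passed,
        "Natively Accessible": bool(product.access_formats),
        "Interoperable": bool(product.join_keys),
        "Valuable": bool(product.use_cases),
        "Secure": bool(product.access_policy),
    }
    return [name for name, ok in checks.items() if not ok]


# Example: a product that is catalogued and described, but not much else yet.
orders = DataProduct(
    name="sales_orders",
    catalogue_entry="catalogue/sales/orders",
    address="https://example.invalid/api/sales-orders",
    description="One row per confirmed customer order.",
)
print(missing_characteristics(orders))
```

A checklist like this only records whether evidence exists for each characteristic; whether customers are actually happy still has to come from talking to them.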

Data ownership

So is this concept groundbreaking? Surely data products with these characteristics are just evidence of a well-designed data solution? Well, here’s the point – a product should always have an owner. More often than not, when I speak to customers, it’s the lack of data ownership that’s one of the main problems to solve. So if there is one benefit that thinking of data as a product brings, it’s that you will establish clear data ownership.

Data is not “just” an asset

The product owner is central to the success of data, answering semantic questions and making decisions that will impact its use. Dehghani also makes the point that data is not merely an asset. If you think of data as a product, the focus shifts to using data and measuring its success, as opposed to collecting and storing it.

It’s here that I think the product analogy is really helpful. It addresses the people factors rather than the 8 usability characteristics above. People are always the hardest factor to solve for when working with data.

The future of data products

So where does this leave us? Data is increasingly being thought of as a product, and for some good reasons. Some of the #BIGDATALDN seminars included:

  • “Building scalable data products to fuel data science outcomes”
  • “Accelerating data product delivery without making a mesh of it”
  • “Applying product management principles to data”

I think the biggest gains will be made by continuing to address the people factors. A great innovation in this space would be to continue the trend in user research and user-centred design. If we make sure to focus on data product usability, we’ll be more likely to build the right solutions right first time.
