Data Preparation – The New Way
For two and a half decades or more, organisations have prepared data by following one of two data warehousing approaches: Inmon's (CIF, GIF and DW2.0) or Kimball's (Star Schema Fact models). Both methods have intrinsic benefits, but both leave organisations with challenges in accessing warehoused data. More recently, Amazon S3 has provided data lake capability, which similarly requires deep IT knowledge and a somewhat prescriptive understanding of the data to be of downstream value.
By taking the best of both data warehousing practices, a new paradigm is possible. However, one fundamental mindshift is required to bring the value of data closer to the surface for business self-service, analytics and reporting. This mindshift is the breaking of the human paradigm of clustering data in a format that represents the data source (transactions remain transactions, readings remain readings and functional records remain functional records), because this clustering is what restricts data from being used in a more abstracted way.