
4 posts tagged with "datapreparation"


CryspIQ - my first thoughts

· 3 min read
Phlippie Smit
CryspIQ Technical Owner

"We're building something amazing"

When I spoke to Vaughan a couple of years ago, he told me that he and Dan were building something amazing. He spoke about a data model that could take in any type of data and report on it.

Having worked in large corporate and government data warehouse environments, I was naturally sceptical, as you may be. I had also fiddled with universal data models while studying at university, so the project and product had my attention.

I really wanted to see it in action, and after spending some time on the East Coast in another BI role and trying my hand as an IT Manager, I joined the CryspIQ team.

Big Data

· 7 min read
Vaughan Nothnagel
CryspIQ Inventor

A pragmatic business problem resolution perspective

The hype cycle of big data has brought a number of single-stream suggestions to bear on satisfying the world’s insatiable appetite for information. Three primary solution focus areas show intent in solving the big data problem, each with its own specific benefit:

  • Technology – Database fragmentation, Multi-thread parallelism, Logical and Physical Partitioning;
  • Infrastructure – Processor Componentisation, Tiered Storage Provisioning, Hardwired Distributed Storage; and
  • Application – Data De-Normalisation & Aggregation, Read-ahead logic, Pre-process aggregation, Performance re-engineering.

Truth be told, each solution in isolation provides an improvement opportunity, and each will bring varying degrees of success to an organisation’s ability to handle the big data problem if, and only if, the right questions are asked. Only through this rigorous analysis and functional decomposition can one select the right method(s) to provide long-term resolution, as opposed to short-term ‘disguising’ of the issue.

Data Preparation – The New Way

· 6 min read
Vaughan Nothnagel
CryspIQ Inventor

For two and a half decades or more, organisations have followed one of two data warehousing approaches for the preparation of data: Inmon (CIF, GIF and DW2.0) or Kimball (Star Schema Fact models). Both methods have intrinsic benefits but leave organisations with some challenges in accessing warehoused data. More recently, Amazon S3 has provided data lake capability, which similarly requires deep IT knowledge and a somewhat prescriptive understanding of the data to be of downstream value.

By taking the best of both data warehousing practices, a new paradigm is possible. However, one fundamental mindshift is required to bring the value of data closer to the surface for business self-service, analytics and reporting. This mindshift is the breaking of the human paradigm of clustering data in a format that represents the data source (transactions remain transactions, readings remain readings and functional records remain functional records), as this is what restricts data from being used in a more abstracted way.

Data Warehousing – Time for a New Paradigm

· 7 min read
Vaughan Nothnagel
CryspIQ Inventor

For more than 25 years now, businesses have followed a pattern of mapping business transactions, in their own context, into a data warehouse. Realistically, this practice, driven by the ideals and methodologies of either Bill Inmon or Ralph Kimball, has to date provided a great ability to reformat transactional data into a model that enables rapid retrieval and consistent enterprise reporting platforms. I believe that through this analytics evolution cycle we have become victims of our own design, in that the value delivered from a data warehouse built in this manner is, by and large, not conducive to encouraging new questions to be asked of your business data. This stifled pattern of use typically leads us to query complex data warehouses for answers that we could just as well have obtained from a static report built on the operational system.

What’s the point, you might ask? Well, simplistically, if the business transaction from one area in your organisation looks like an apple and another looks like an orange, the reality is that it is difficult to compare or analyse the two together. A number of anomalies restrict us in doing this effectively, of which timeousness of information, granularity and the absence of common associative information are but a few. So, if we insist on making each warehouse entry look like the apple or orange from which it was sourced, the warehouse is, by design, failing to deliver much, if any, value other than migration of the reporting platform to a different system.
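To make the apple and orange point concrete, here is a minimal Python sketch. It is purely illustrative and is not the CryspIQ model: the `Observation` shape, the field names and the sample records are all assumptions made up for this example. It shows a sales transaction and a sensor reading being mapped out of their source formats into one common shape, which supplies the shared associative information (site and day) needed to look at them together.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """Hypothetical source-agnostic record (an assumption, not the CryspIQ model)."""
    day: date
    site: str
    measure: str
    value: float


def from_sale(sale: dict) -> Observation:
    # The "apple": a sales transaction using its source system's field names.
    return Observation(
        day=date.fromisoformat(sale["sold_on"]),
        site=sale["store"],
        measure="sales_amount",
        value=float(sale["amount"]),
    )


def from_reading(reading: dict) -> Observation:
    # The "orange": a sensor reading with a full timestamp and different field names.
    # Only the date part of the timestamp is kept, so both sources share one granularity.
    return Observation(
        day=date.fromisoformat(reading["timestamp"][:10]),
        site=reading["location"],
        measure=reading["sensor"],
        value=float(reading["value"]),
    )


def by_site_and_day(observations):
    """Group measures by (site, day); if a measure repeats, the last value wins."""
    grouped = {}
    for obs in observations:
        grouped.setdefault((obs.site, obs.day), {})[obs.measure] = obs.value
    return grouped


# Made-up sample records, one per source system.
sales = [{"sold_on": "2024-01-05", "store": "Perth", "amount": "1200.50"}]
readings = [{
    "timestamp": "2024-01-05T14:30:00",
    "location": "Perth",
    "sensor": "temperature_c",
    "value": 38.2,
}]

combined = by_site_and_day(
    [from_sale(s) for s in sales] + [from_reading(r) for r in readings]
)
```

Once both records share a shape, a question such as "how did sales move with temperature at each site?" becomes a lookup on `combined` rather than a join across two differently modelled tables.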