Every business has data. Some organisations can spin gold from their data, but others merely collect it in the hope of getting value someday. While there is no easy path to maximising the value of your data, there is at least a clear journey. In this four-part series, we'll look at different types of analytics, starting in this post with the business's past.
The first stage of the data journey is descriptive analytics. This begins when a business starts to ask questions about what has happened before. These are often questions with summary answers, such as "How many widgets did we sell last month?" and "What were the total sales in London the last time the month started on a Tuesday?"
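To make those two questions concrete, here is a minimal sketch in plain Python. The `SALES` records, the city names and every function name are invented for illustration; a real business would ask the same questions of a warehouse rather than an in-memory list.

```python
from datetime import date

# Hypothetical sales records: (sale date, city, units sold, sale value)
SALES = [
    (date(2024, 9, 3), "London", 4, 40.0),
    (date(2024, 9, 17), "Leeds", 2, 20.0),
    (date(2024, 10, 1), "London", 5, 50.0),
    (date(2024, 10, 20), "London", 3, 30.0),
    (date(2024, 10, 21), "Leeds", 7, 70.0),
    (date(2024, 11, 5), "London", 6, 60.0),
]


def previous_month(year, month):
    """Return the (year, month) pair immediately before the given one."""
    return (year - 1, 12) if month == 1 else (year, month - 1)


def units_sold_last_month(sales, today):
    """How many widgets did we sell last month?"""
    target = previous_month(today.year, today.month)
    return sum(units for day, _, units, _ in sales
               if (day.year, day.month) == target)


def last_month_starting_on(weekday, today):
    """Most recent month before today's whose 1st falls on `weekday` (Monday = 0)."""
    year, month = previous_month(today.year, today.month)
    while date(year, month, 1).weekday() != weekday:
        year, month = previous_month(year, month)
    return year, month


def total_sales_in(sales, city, year, month):
    """Total sale value for one city in one calendar month."""
    return sum(value for day, place, _, value in sales
               if place == city and (day.year, day.month) == (year, month))


today = date(2024, 12, 10)  # fixed so the example is repeatable
TUESDAY = 1
year, month = last_month_starting_on(TUESDAY, today)
print(units_sold_last_month(SALES, today))
print(total_sales_in(SALES, "London", year, month))
```

Both answers are simple aggregates over historical records, which is what makes them descriptive rather than predictive.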
Asking the questions is only the beginning; getting the answers can be surprisingly challenging. Unless you happen to be that unicorn of a business that manages to operate solely from a single data source, you will have several different sources of data that need to be combined. Even if you are that unicorn, your existing systems are likely transactional, geared towards working with exact records one by one rather than swathes of records at a time.
We solve this problem by running batch-based Extract-Transform-Load (ETL) jobs. These extract the data from the various sources, transform it to match a pre-existing model, and then load it into a data warehouse. There is now a plethora of tools to help you build these jobs, either through visual editors or with code.
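As a rough illustration of the three ETL steps, the sketch below combines two made-up sources (a CSV export and a JSON feed) into a single table. Everything here is an assumption for the sake of the example: the source formats, the field names, the `sales` model, and the use of an in-memory SQLite database as a stand-in for a real data warehouse.

```python
import csv
import io
import json
import sqlite3

# Two hypothetical sources that describe sales in different shapes
SHOP_CSV = "order_id,city,amount_gbp\n1,London,10.50\n2,Leeds,4.00\n"
WEB_JSON = '[{"id": "w-1", "location": "london", "total_pence": 1999}]'


def extract():
    """Read the raw rows from each source without changing them."""
    shop_rows = list(csv.DictReader(io.StringIO(SHOP_CSV)))
    web_rows = json.loads(WEB_JSON)
    return shop_rows, web_rows


def transform(shop_rows, web_rows):
    """Map both sources onto one model: (source, order_id, city, amount_pence)."""
    unified = []
    for row in shop_rows:
        pence = round(float(row["amount_gbp"]) * 100)
        unified.append(("shop", row["order_id"], row["city"].title(), pence))
    for row in web_rows:
        unified.append(("web", row["id"], row["location"].title(),
                        int(row["total_pence"])))
    return unified


def load(conn, rows):
    """Write the unified rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(source TEXT, order_id TEXT, city TEXT, amount_pence INTEGER)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_batch(conn):
    """One batch run: extract, transform, then load."""
    load(conn, transform(*extract()))


warehouse = sqlite3.connect(":memory:")  # stand-in for a real warehouse
run_batch(warehouse)
totals = warehouse.execute(
    "SELECT city, SUM(amount_pence) FROM sales GROUP BY city ORDER BY city"
).fetchall()
print(totals)
```

Once both sources share one model, a question such as "total sales per city" becomes a single query, regardless of where each record originally came from.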
Let me give an example of the value of knowing what has happened before. A parcel courier firm I worked at had a descriptive analytics system that allowed us to identify "ugly" parcels. These were parcels that were oddly shaped, too heavy or too long: essentially, parcels that did not fit on a sortation tray. An ugly parcel required someone to process it manually, bypassing all the automation that exists for efficiency.
While it seems like a funny statistic, its value appears when an account comes up for renewal. An account manager can consider this statistic when working out the fees and, if the majority of the account's parcels are ugly, raise them substantially to compensate for the cost of manual sortation.
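The ugly-parcel statistic can be sketched as a share per account. This is not the courier's actual system: the weight and length limits, the `Parcel` fields and the account names below are all hypothetical, chosen only to show the shape of the calculation.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical limits for what fits on a sortation tray
MAX_WEIGHT_KG = 30.0
MAX_LENGTH_CM = 120.0


@dataclass
class Parcel:
    account: str
    weight_kg: float
    length_cm: float
    odd_shape: bool = False


def is_ugly(parcel):
    """A parcel is ugly if it is oddly shaped, too heavy or too long."""
    return (parcel.odd_shape
            or parcel.weight_kg > MAX_WEIGHT_KG
            or parcel.length_cm > MAX_LENGTH_CM)


def ugly_share_by_account(parcels):
    """Fraction of each account's parcels that needed manual sortation."""
    totals = defaultdict(int)
    ugly = defaultdict(int)
    for parcel in parcels:
        totals[parcel.account] += 1
        if is_ugly(parcel):
            ugly[parcel.account] += 1
    return {account: ugly[account] / totals[account] for account in totals}


parcels = [
    Parcel("acme", 2.0, 40.0),
    Parcel("acme", 35.0, 60.0),                 # too heavy
    Parcel("acme", 1.0, 30.0, odd_shape=True),  # oddly shaped
    Parcel("bolt", 5.0, 50.0),
    Parcel("bolt", 8.0, 110.0),
]
shares = ugly_share_by_account(parcels)
majority_ugly = [account for account, share in shares.items() if share > 0.5]
print(majority_ugly)
```

An account manager reviewing a renewal would look at accounts like the ones in `majority_ugly`, where most parcels bypass the automation.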
Now that we have an idea of what has happened before, we can start to look to the future. Before we do, we need to deal with the main drawback of batch-based processing: our analysis is always out of date. To solve this, we'll look at streaming analytics in our next post.