The little-discussed side of Big Data

  • July 31st, 2019

These days almost every other article on analytics seems to be talking about Big Data.

Much of the hype about Big Data is justified, partly because it is the next big thing, but also because we don’t fully understand it yet. When we think about Big Data, we tend to think only about size. We think terabytes and petabytes. We think huge computing power. We think parallel processing. We think big words like Hadoop and Mahout. In reality, these concepts are easier to grasp than the side of Big Data that gets talked about far less.

A lot of the recent increase in data volumes can be attributed to what is being written on social media, what is being captured by in-store video cameras, and what users say about products in comments on product pages and blogs. Making sense of this kind of data doesn’t just need more memory and more computing power. It needs advances in our understanding of unstructured data. Most of this data tends to be subjective and context dependent. A comment like “Unbelievable!!” can have very different meanings depending on the context in which it was made. A video of a bunch of people walking around in a store may mean nothing by itself. Combined with the fact that a promotion was running in the bakery on the northeast side of the store, it may lead to something insightful.
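To make the bakery example a little more concrete, here is a minimal sketch in Python of what "adding context" could look like. Everything in it is an assumption made for illustration: the ZoneCount and Promotion records, the hourly people counts and the promotion window are all made up, and a real video-analytics pipeline would be far messier. The idea is simply that the foot-traffic numbers only start to say something once they are split by whether a promotion was running.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ZoneCount:
    """People counted in one store zone during one hour (hypothetical video-analytics output)."""
    zone: str
    hour: datetime
    people: int


@dataclass
class Promotion:
    """A promotion running in one zone over a time window (hypothetical business record)."""
    zone: str
    start: datetime
    end: datetime


def promo_active(count, promotions):
    """True if any promotion covers this count's zone and hour."""
    return any(
        p.zone == count.zone and p.start <= count.hour < p.end
        for p in promotions
    )


def average_traffic_by_context(counts, promotions, zone):
    """Return (average with promotion, average without) for one zone.

    Either value is None when there are no observations for that case.
    """
    in_zone = [c for c in counts if c.zone == zone]
    with_promo = [c.people for c in in_zone if promo_active(c, promotions)]
    without_promo = [c.people for c in in_zone if not promo_active(c, promotions)]

    def mean(values):
        return sum(values) / len(values) if values else None

    return mean(with_promo), mean(without_promo)


# Made-up sample data: four hourly counts for the bakery zone and one promotion.
day = dict(year=2019, month=7, day=31)
counts = [
    ZoneCount("bakery", datetime(hour=10, **day), 20),
    ZoneCount("bakery", datetime(hour=11, **day), 50),
    ZoneCount("bakery", datetime(hour=12, **day), 60),
    ZoneCount("bakery", datetime(hour=13, **day), 22),
]
promotions = [
    Promotion("bakery", datetime(hour=11, **day), datetime(hour=13, **day)),
]

with_promo, without_promo = average_traffic_by_context(counts, promotions, "bakery")
print(with_promo, without_promo)  # 55.0 21.0
```

On their own, the four counts are just numbers. Split by promotion context, they suggest that traffic in the bakery was noticeably higher while the promotion ran, which is the kind of question a business user actually cares about. A toy comparison like this proves nothing by itself, of course; it only shows where the context has to enter the analysis.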

This kind of contextualization of data requires developing sophisticated algorithms and logic, and this side of Big Data does not seem to be getting enough attention. It requires a combination of business, IT, math and behavioral sciences to define and systematically capture context. If your Big Data strategy only covers the size of your database, the number of nodes in your cluster and their computing power, think again!!