The 5Vs Of Big Data: From Volume To Value, The Details

A simple, concise model for describing the new data generated by the growing number of information sources and, more generally, by the evolution of technology.

The 3Vs Of Big Data: What Are They?

Big Data is defined as data that has at least one of the following characteristics:

  1. Volume: every day, in many activities of our daily life, we generate data. Volume refers to this huge mass of information, which cannot be collected with traditional technologies. It is constantly growing: international analysts estimated that data production in 2020 would be 44 times greater than in 2009. Precisely for this reason, it is difficult to fix a limit above which one can speak of Big Data. For now, a working threshold is more than 50 terabytes of data, with volumes growing by more than 50% annually.
  2. Velocity: data is generated and acquired faster and faster. Just think of the proliferation of devices equipped with sensors that collect data in real time. The challenge for companies is not only to collect this data but also to analyze it in real time, so that business decisions can be made as quickly as possible.
  3. Variety: this refers to the different types of data available today, coming from many heterogeneous sources. In addition to business transactional and management systems, these sources include sensors, social networks, and open data. The data, both structured and unstructured, is increasingly not only internal to the organization but also acquired from outside.
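
The figures quoted for volume can be turned into a quick back-of-the-envelope check. The sketch below is only illustrative: the function names are hypothetical, the growth is assumed to be constant from year to year, and the threshold is read as requiring both conditions (size and growth) at once, which is one possible interpretation of the rule of thumb.

```python
def implied_annual_growth(total_factor: float, years: int) -> float:
    """Constant yearly growth rate that compounds to total_factor over the given years."""
    return total_factor ** (1 / years) - 1


def exceeds_volume_threshold(size_tb: float, annual_growth: float,
                             min_size_tb: float = 50.0,
                             min_growth: float = 0.50) -> bool:
    """Rule-of-thumb check for the volume characteristic.

    Assumption: both conditions must hold (more than 50 TB AND more than
    50% yearly growth). Swap `and` for `or` if the rule is read the other way.
    """
    return size_tb > min_size_tb and annual_growth > min_growth


# 44 times more data in 2020 than in 2009, spread over 11 years.
rate = implied_annual_growth(44, 2020 - 2009)
print(f"Implied yearly growth: {rate:.1%}")

# Hypothetical company: 80 TB of data, growing 60% per year.
print(exceeds_volume_threshold(size_tb=80, annual_growth=0.60))
```

Under the constant-growth assumption, a 44-fold increase over 11 years corresponds to roughly 41% growth per year, which gives a sense of scale for the 50% figure in the threshold.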

Why Are We Talking About 5Vs Today?

In the early 2000s, Big Data was defined with three words: volume, velocity, and variety. Over the years, as the term lost its sci-fi aura and became increasingly concrete and applicable in companies, we wondered whether there were other characteristics to highlight. Two new Vs have enriched the model, aimed at defining how this new data should be used:

  1. Veracity: as some who work in the sector like to say, “Bad data is worse than no data.” The data must be reliable and tell the truth. With Big Data, this challenge is even harder to meet: data management technologies change, the speed at which data is collected changes, and the sources multiply. However, the quality and integrity of the information remain essential for producing useful and reliable analyses.
  2. Variability: there is much more data, in different formats and from different contexts. Its meaning can change with the context, and this mutability must be taken into account when interpreting the data, all the more so when the person interpreting it is a line-of-business user and not just a data scientist.

The Sixth “V”: Value

In recent years, Big Data has been described as the new oil or the new gold, i.e., an invaluable source of value. That’s exactly how it is. But simply collecting data, even with the best technologies available on the market, does not guarantee that you will obtain information or, above all, extract knowledge.

Talking about data, information, and knowledge means talking about related but different things. Data is a codified representation of an entity, a phenomenon, a transaction, or an event. Information is the result of a data analysis process; often, it has meaning only for those who work in the domain where the data was generated. Knowledge is obtained when a person uses information to make decisions and carry out actions, that is, when information is put “into practice.”
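
The distinction between data, information, and knowledge can be illustrated with a small sketch. Everything in it is hypothetical: the sensor readings, the function names, and the 70-degree limit are made up for illustration, and the final step only stands in for the decision a person would make.

```python
from statistics import mean

# Data: codified representations of events (hypothetical temperature readings).
readings = [
    {"sensor": "A", "celsius": 71.0},
    {"sensor": "A", "celsius": 74.0},
    {"sensor": "B", "celsius": 55.0},
    {"sensor": "B", "celsius": 57.0},
]


def to_information(records):
    """Analyze the raw records: average temperature per sensor."""
    by_sensor = {}
    for record in records:
        by_sensor.setdefault(record["sensor"], []).append(record["celsius"])
    return {sensor: mean(values) for sensor, values in by_sensor.items()}


def to_decision(information, limit=70.0):
    """Put information into practice: list the sensors whose average exceeds the limit."""
    return sorted(sensor for sensor, avg in information.items() if avg > limit)


information = to_information(readings)   # information: the result of analysis
to_inspect = to_decision(information)    # knowledge in action: what to do next
print(information)
print(to_inspect)
```

The raw readings mean little on their own; the averages are meaningful to someone who knows the machines; the list of sensors to inspect is what actually drives an action.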

Analytics tools are needed to implement this process and to ensure that Big Data can be transformed into information that is used in business processes, building knowledge that improves performance. In addition to the 5V model, it is therefore necessary to consider a further V: value. This is where Big Data Analytics methodologies become fundamental: by using them, a company can extract value from the vast world of Big Data, that is, make more informed, timely, and aware decisions.

