The term "big data" refers to data that is so large, fast, or complex that
it is difficult or impossible to process them with traditional methods. The current definition of big
data includes, what is known as, the five "Vs":
Volume. Organizations collect data from a variety of sources, including
business transactions, smart devices (IoT), industrial equipment, videos,
social media and more. In the past, storing it would have been a problem, but cheaper storage on platforms such as data lakes and Hadoop has eased the burden.
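To make the storage point concrete, here is a minimal sketch of landing a small batch of collected records in a date-partitioned layout on local disk, the kind of structure a data lake commonly uses. The path, the field names, and the use of pandas with pyarrow are assumptions for illustration, not a description of any particular platform.

```python
# Minimal sketch: landing raw event data in a local "data lake" layout,
# partitioned by date. Assumes pandas and pyarrow are installed; the
# path and field names are hypothetical.
from datetime import date
import pandas as pd

# A small batch of collected records (transactions, IoT readings, etc.).
events = pd.DataFrame(
    {
        "source": ["pos_terminal", "iot_sensor", "social_feed"],
        "payload_bytes": [512, 128, 2048],
        "event_date": [date(2023, 5, 1)] * 3,
    }
)

# Columnar files partitioned by date keep cheap file/object storage tidy
# and make later scans over a single day inexpensive.
events.to_parquet(
    "datalake/raw/events",          # hypothetical lake root
    partition_cols=["event_date"],  # one sub-directory per day
)
```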
Velocity. With the growth of the Internet of Things, data streams into businesses at unprecedented speed and must be handled in a timely manner. RFID tags, sensors, gauges, and smartphones are driving the need to deal with these torrents of data in near real time.
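A rough sketch of the "timely manner" idea: readings from a simulated sensor feed are summarized in short tumbling windows as they arrive, rather than waiting for a batch job. The stream, window size, and field names are invented for illustration.

```python
# Minimal sketch of handling a fast stream in near real time: readings are
# consumed as they arrive and summarized over short tumbling windows.
# The sensor stream is simulated; names and sizes are hypothetical.
import random
import time
from statistics import mean

def sensor_stream(n_readings=50):
    """Simulate an RFID/sensor feed emitting readings every few milliseconds."""
    for i in range(n_readings):
        yield {"sensor_id": i % 5, "value": random.gauss(20.0, 2.0)}
        time.sleep(0.01)  # stand-in for network arrival delay

WINDOW_SIZE = 10  # readings per tumbling window
window = []

for reading in sensor_stream():
    window.append(reading["value"])
    if len(window) == WINDOW_SIZE:
        # Summarize and act on each window without waiting for a nightly batch.
        print(f"window mean={mean(window):.2f} max={max(window):.2f}")
        window.clear()
```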
Variety. Data comes in all kinds of formats, from structured numerical data in traditional databases to unstructured text documents, emails, video, audio, stock ticker data, and financial transactions.
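The sketch below shows one common way of coping with that mix: records arriving as CSV rows, JSON events, and free text are normalized into a single shape before further processing. The formats and field names are hypothetical.

```python
# Minimal sketch: normalizing records that arrive in different formats
# (CSV, JSON, free text) into one common shape. Field names are hypothetical.
import csv
import io
import json

def from_csv(row_text):
    """Structured, table-like input (e.g., an export from a traditional database)."""
    row = next(csv.reader(io.StringIO(row_text)))
    return {"kind": "transaction", "id": row[0], "amount": float(row[1])}

def from_json(event_text):
    """Semi-structured input, such as a stock-quote or API event."""
    event = json.loads(event_text)
    return {"kind": "quote", "id": event["symbol"], "amount": event["price"]}

def from_text(document):
    """Unstructured input (emails, documents); keep it raw for later processing."""
    return {"kind": "document", "id": None, "amount": None, "body": document}

records = [
    from_csv("tx-1001,99.50"),
    from_json('{"symbol": "ACME", "price": 12.3}'),
    from_text("Dear support, my last invoice looks wrong..."),
]
for r in records:
    print(r)
```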
Variability. In addition to increasing data rates and varieties, data flows are
unpredictable: they change often and vary greatly. It's a challenge, but companies need to know
when something is trending on social media and how to manage daily, seasonal, and event-triggered peak data loads.
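As a small illustration of managing peak loads, the sketch below flags days whose upload volume rises well above a trailing baseline. The volumes and the seven-day, 1.5x rule are invented for illustration.

```python
# Minimal sketch: flagging unpredictable peaks in daily upload volume against
# a trailing baseline. The volumes and the 7-day / 1.5x rule are hypothetical.
from statistics import mean

daily_uploads_gb = [40, 42, 39, 41, 44, 120, 43, 45, 47, 200]  # toy data

BASELINE_DAYS = 7
PEAK_FACTOR = 1.5  # flag days more than 50% above the trailing average

for day, volume in enumerate(daily_uploads_gb):
    history = daily_uploads_gb[max(0, day - BASELINE_DAYS):day]
    if history and volume > PEAK_FACTOR * mean(history):
        print(f"day {day}: {volume} GB is a peak vs baseline {mean(history):.1f} GB")
```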
Veracity. Veracity refers to the quality of the data. Because the data comes from so many
different sources, it is difficult to link, join, clean, and transform data across systems. Businesses need to connect and map relationships, hierarchies, and multiple data linkages; otherwise, their data can quickly spiral out of control.
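A minimal sketch of the linking problem: records from two hypothetical systems are joined on a shared key only after the keys are cleaned, and unmatched records are surfaced rather than silently dropped. The data and field names are invented.

```python
# Minimal sketch: linking records from two sources on a shared key after basic
# cleaning, so relationships can be mapped consistently. Data is invented.
crm_customers = [
    {"customer_id": " C-001 ", "name": "Acme Corp"},
    {"customer_id": "C-002", "name": "Globex"},
]
billing_accounts = [
    {"cust": "c-001", "balance": 1200.0},
    {"cust": "C-003", "balance": 87.5},   # no matching CRM record
]

def clean_key(raw):
    """Normalize keys that differ only in case or whitespace across systems."""
    return raw.strip().upper()

crm_by_id = {clean_key(c["customer_id"]): c for c in crm_customers}

for account in billing_accounts:
    key = clean_key(account["cust"])
    customer = crm_by_id.get(key)
    if customer is None:
        print(f"unlinked account {key}: needs investigation before reporting")
    else:
        print(f"{customer['name']} ({key}) balance={account['balance']}")
```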