Big Data & Data Lake

The term "big data" refers to data that is so large, fast-moving, or complex that it is difficult or impossible to process with traditional methods. The current definition of big data includes what are known as the five "Vs":

  • Volume. Organizations collect data from a variety of sources, including business transactions, smart devices (IoT), industrial equipment, videos, social media, and more. In the past, storing it all would have been a problem, but cheaper storage on platforms such as data lakes and Hadoop has eased that burden.
  • Velocity. With the growth of the Internet of Things, data flows into businesses at unprecedented speed and must be handled in a timely manner. RFID tags, sensors, gauges, and smartphones are driving the need to deal with these torrents of data in near real time (a streaming sketch follows this list).
  • Variety. Data comes in all kinds of formats, from structured numerical data in traditional databases to unstructured text documents, emails, video, audio, stock ticker data, and financial transactions.
  • Variability. In addition to increasing data rates and varieties, data flows are unpredictable: they change often and vary greatly. That is a challenge, but companies need to know when something is trending on social media and how to manage daily, seasonal, and event-triggered peaks in data loads.
  • Veracity. Veracity refers to the quality of the data. Because data comes from so many different sources, it is difficult to link, match, clean, and transform it across systems. Companies need to connect and correlate relationships, hierarchies, and multiple data linkages; otherwise, their data can quickly get out of control (see the cleaning sketch after this list).
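
To make the veracity point concrete, the snippet below sketches one common cleaning step: normalizing a shared key, removing duplicates, and linking records from two systems. It is a minimal sketch in Python with pandas; the file names, column names, and join key are illustrative assumptions, not part of any particular platform.

    import pandas as pd

    # Load extracts from two source systems (file and column names are hypothetical).
    crm = pd.read_csv("crm_customers.csv")         # customer_id, name, email
    billing = pd.read_csv("billing_accounts.csv")  # cust_id, plan, monthly_spend

    # Normalize the join key so records from both systems can be linked.
    crm["customer_id"] = crm["customer_id"].astype(str).str.strip().str.upper()
    billing["customer_id"] = billing["cust_id"].astype(str).str.strip().str.upper()

    # Drop duplicates that would otherwise multiply rows on the join.
    crm = crm.drop_duplicates(subset="customer_id")
    billing = billing.drop_duplicates(subset="customer_id")

    # Link the two sources; unmatched rows are kept so data-quality gaps stay visible.
    linked = crm.merge(billing[["customer_id", "plan", "monthly_spend"]],
                       on="customer_id", how="left")
    print(linked.head())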

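The velocity point is ultimately about processing data as it arrives rather than in overnight batches. The sketch below shows one way to do that with Spark Structured Streaming in Python, consuming events from Kafka and maintaining one-minute rolling averages. The broker address, topic name, and payload format are assumptions made for illustration, and the Spark-Kafka connector package must be available at runtime.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, window

    spark = SparkSession.builder.appName("velocity-sketch").getOrCreate()

    # Read the raw event stream from a Kafka topic as messages arrive.
    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "localhost:9092")
              .option("subscribe", "sensor-readings")
              .load())

    # Interpret each message payload as a numeric reading, keeping the event timestamp.
    readings = events.select(
        col("timestamp"),
        col("value").cast("string").cast("double").alias("reading"))

    # Summarize the stream into one-minute windows so consumers see a rolling view.
    per_minute = (readings
                  .groupBy(window(col("timestamp"), "1 minute"))
                  .avg("reading"))

    # Continuously emit updated window averages (console output for demonstration).
    query = (per_minute.writeStream
             .outputMode("update")
             .format("console")
             .start())
    query.awaitTermination()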