Why Big Data Are so Important

In manufacturing or process plants, every phase of production generates a large amount of data. Analysing these big data is very important for companies, because it helps them discover valuable and non-obvious information and opens up new opportunities.

Big data are high-volume, high-velocity and/or high-variety information resources (hence the three ‘Vs’ with which they are often identified) that require cost-effective and innovative forms of information processing in order to enable a better understanding of the situation, more efficient decision-making and the intelligent automation of processes.
From an IT standpoint, the term ‘big data’ refers to a set of data which is too large or too complex to be processed by normal computing devices. As such, it is relative to the computing power available on the market. In 1999, for example, the world held a total of 1.5 exabytes of data and 1 gigabyte was already considered ‘big data’.
According to another definition, big data are data which exceed the processing capacity of conventional database systems. The data are too big, move too fast or do not fit into the structures of database architectures. To obtain value from these data, it is therefore necessary to find an alternative way of processing them.
Big data include large volumes of structured, semi-structured and unstructured data, acquired from a variety of heterogeneous sources.
These data are generally assumed to contain valuable hidden information, which takes substantial effort and resources to uncover.

Thanks to data analysis, the mining industry can assess the quality of the final output.

The benefits for industry

Why is big data analysis so important in the industrial world? Because it helps companies to exploit their data and use them to identify new opportunities. This, in turn, leads to smarter business decisions, more efficient operations, higher profits and happier customers.
Today, therefore, big data are at the forefront of a company’s ability to store, process and access all the data it needs.
Big data analysis is currently used for many industrial applications, including product lifecycle management, process redesign, supply chain management and the analysis of production system data. The latter has already received considerable attention because production systems are large sources of raw data which are often difficult to model manually. Thus, numerous applications have emerged to investigate process data for process monitoring, anomaly detection, analysis of causes underlying these anomalies and knowledge extraction.

In a solar panel factory, every cell is inspected to measure its performance.

Storing thousands of process measurements without any risk of loss

Digital production control systems have been a source of big data since the late 1980s, when they began capturing and storing data. Initially, recording was selective and only the most important process measurements, such as temperatures, flows and pressures, were kept. As memory capacity increased and storage systems and computer networks became more powerful, thousands of pieces of process information began to be stored at increasingly high sampling rates. Today, in a modern refinery or large chemical plant, it is common to collect tens of thousands of process measurements, valve positions and control system states at a sampling rate of once per second. It is also possible to store all of these data without losing information through compression.
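To give a sense of scale, the storage such a system needs can be estimated with simple arithmetic. The sketch below is illustrative only: the figure of 50,000 measurements and the 8 bytes per stored value are assumptions, not values from a specific plant, and timestamps, quality flags and compression are ignored.

```python
def raw_bytes_per_day(n_tags: int, sample_hz: float, bytes_per_sample: int) -> float:
    """Uncompressed storage needed per day for n_tags signals."""
    seconds_per_day = 24 * 60 * 60
    return n_tags * sample_hz * seconds_per_day * bytes_per_sample


# Hypothetical plant: 50,000 tags, one sample per second, 8 bytes per value
daily = raw_bytes_per_day(50_000, 1.0, 8)
print(f"{daily / 1e9:.2f} GB per day")          # 34.56 GB per day
print(f"{daily * 365 / 1e12:.2f} TB per year")  # 12.61 TB per year
```

Under these assumptions the raw volume is modest by today's standards, which is consistent with the point that everything can be kept rather than only a selected subset.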

Specific techniques are now available to identify instrument failures.

In a short time, big data analytics offers targeted improvements

Current sensor technology (from simple flow meters and process analysers to infrared spectrometers and digital cameras) makes process measurements available at a much faster rate and at a much lower cost than just a few years ago. As a result, a huge amount of data from production processes is regularly available to engineers in real time. This has led to the widespread use of data-driven models. Among these, models based on multivariate statistical techniques have shown great potential to exploit both real-time and historical data in order to provide information on process behaviour and product quality. Analysing this massive amount of data to find the most relevant information has been a challenge for several decades. Today, new types of tools and techniques are available (gathered under the umbrella of big data analytics) which have proven successful in finding, prioritising and solving major problems in process plants. For example, specific techniques are now available to identify instrument failures, find and quantify mechanical problems in control valves, measure the effectiveness of control systems, identify control strategies which are no longer working properly, and measure the performance of advanced process control. Most plant control systems operate far below their potential.
By using big data analytics, significant and targeted improvements can be achieved in a very short time.
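As a minimal illustration of what identifying an instrument failure can look like in practice, the sketch below flags a ‘frozen’ sensor, i.e. a signal that repeats the same value for many consecutive samples. This is only one simple heuristic, not the specific techniques referred to above; the function name `find_flatlines`, its thresholds and the sample readings are all hypothetical.

```python
def find_flatlines(values, min_run=10, tolerance=0.0):
    """Return (start, end) index pairs, end exclusive, where the signal stays
    within `tolerance` of the run's first value for at least `min_run` samples."""
    runs = []
    start = 0
    for i in range(1, len(values) + 1):
        # A run ends at the end of the data or when the value moves away
        if i == len(values) or abs(values[i] - values[start]) > tolerance:
            if i - start >= min_run:
                runs.append((start, i))
            start = i
    return runs


# Invented readings: the sensor sticks at 2.5 for twelve samples
readings = [1.0, 1.2, 1.1] + [2.5] * 12 + [2.6, 2.4]
print(find_flatlines(readings))  # [(3, 15)]
```

In a real plant the minimum run length and tolerance would have to be tuned per signal, since some measurements are legitimately constant for long periods.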

Big data analysis leads to smarter business decisions and higher profits.

Some application examples from the chemical sector to the mining industry

In the process industry, predictive analysis tools are used to assess the composition of chemicals, minerals and other raw materials to ensure that they meet production requirements. Biopharmaceutical manufacturers use advanced data analysis to significantly increase the production of biological products such as vaccines, without incurring additional capital expenditure. Chemical manufacturers use it to compare and measure the effect on yield of various production inputs, such as coolant pressure, temperature or carbon dioxide flow, often finding unexpected correlations which affect production. The mining industry can combine fragmented production data from multiple processes to find correlations between specific variables (such as oxidation and milling) and the quality of the final output (ore grade). Pharmaceutical manufacturers use data analysis to verify that processes, especially batch processes, comply with the standards which define the required product characteristics.
In the food industry, each product is often weighed with a checkweigher to ensure that it meets the legally required weight. In a solar panel factory, every cell is inspected to measure its performance. In all these cases, and many others, product or process measurements are received every 2-3 seconds, mainly to prevent bad products from reaching the customer or to improve processes. Other examples of the application of big data include predictive analytics and control loop performance monitoring (CLPM) software, as well as simulation using digital twins. A very useful statistical method is multivariate data analysis (MVDA), which makes it possible to analyse more than one variable at a time. Manufacturers can then run advanced statistical models, perform what-if calculations, identify where processes are deviating and so on. All of these applications allow manufacturers to discover non-obvious information and relationships, which have the potential to deliver significant gains in reliability and performance.
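The value of looking at more than one variable at a time can be sketched with a two-variable example. The code below computes a squared Mahalanobis distance, one common multivariate measure, for a new observation relative to historical data. It is a simplified sketch rather than a full MVDA workflow; the function name and all sample values are invented for illustration.

```python
def mahalanobis_sq_2d(point, xs, ys):
    """Squared Mahalanobis distance of `point` from the (xs, ys) history."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Sample variances and covariance of the two historical variables
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form d^T S^-1 d, written out for the 2x2 case
    return (syy * dx**2 - 2 * sxy * dx * dy + sxx * dy**2) / det


# Invented history: two process variables that normally rise together
temperature = [1.0, 2.0, 3.0, 4.0, 5.0]
pressure = [1.1, 1.9, 3.2, 3.8, 5.0]

consistent = mahalanobis_sq_2d((2.0, 2.0), temperature, pressure)
unusual = mahalanobis_sq_2d((2.0, 4.0), temperature, pressure)
print(consistent < unusual)  # True
```

Both observations lie within the historical range of each variable taken on its own, yet only the second breaks the usual relationship between the two, which is exactly what a one-variable-at-a-time check would miss.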


The adoption of big data, machine learning, robotics, artificial intelligence and the Internet of Things (IoT) is having a strong impact on industry and business models. The amount of data is constantly growing, but its value depends largely on how the data are collected and stored and on the use of appropriate big data analytics procedures.