DATA MINING
The new oil of the global economy

Analog life dominates the streets of Bengaluru in a wild stream of buses, taxis, two-wheelers, and cows. Add to that a lively mix of people: some in modern business attire, others in traditional saris or sober school uniforms. The sidewalks are a service sector in their own right, with everything from shoe shiners to bike repair shops, flower sellers, and food vendors. But even the street vendors use smartphones to take orders and track delivery. After all, this city in southern India, formerly known as Bangalore, is a hub on the data highway that spans the globe.

BIG DATA

This buzzword refers primarily to the processing of large, complex, and rapidly changing sets of data that can no longer be evaluated using conventional computers, but require high-performance computing clusters. The data collected can come from any number of different sources. Around the world, the amount of data doubles roughly every two years. Sri Krishnan V, vice president of Bosch Engineering and Business Solutions in India, compares big data with an avalanche, albeit in a positive sense: “We would like to see data analytics ingrained in our company’s DNA. The more data we have, the better our analysis.” Bosch takes a two-step approach: first, data is collected from things and processes. Second, the data is used to develop services.

DATA METROPOLIS BENGALURU

Bengaluru is the capital of the state of Karnataka. With close to ten million residents, it is the third-largest city in India. In recent years the “Garden City” has become one of the country’s key IT centers and one of the world’s hubs for big data. The reason for this lies in Bengaluru’s openness and its wealth of highly trained local IT experts, a large proportion of them women. India’s ambition is to transform itself from a low-cost production location into a leading supplier of complex, high-quality manufacturing and services. Big data is now being taught at all major educational institutions, including the Indian Institute of Science in Bengaluru. The institute is home to the Robert Bosch Center for Cyber-Physical Systems.

DATA MINING

This term refers to the systematic application of statistical methods when evaluating big data. The powerful algorithms used in this approach are nothing new. The challenge today lies in applying them efficiently across clusters of interconnected servers, whose thousands of processors are now required to process the deluge of data generated daily by the internet of things. “Nowadays, everything is connected with everything else,” the analyst Lavanya Uppala says. When it comes to applying data mining in practice, she and her team use the “V formula”: variety, velocity, volume, vicinity, and visualization.

When, in 2011, the Bosch subsidiary Robert Bosch Engineering and Business Solutions was on the lookout for digital trends to diversify its business portfolio, Sri Krishnan and his team identified data mining as one of the possible new business fields for Bosch in India. For the country that contributed the concept of zero to the world of mathematics, it is only natural that it should be an early adopter of this technology wave. Drawing the right conclusions from huge amounts of data sounded pretty straightforward, but turned out to be a highly complex undertaking. “We didn’t let it remain a theoretical exercise; instead, we got straight to work and turned it into a business model,” Sri Krishnan says. They had the full support of Volkmar Denner, the chairman of the Bosch board of management, since big data is one of the elements that are paving Bosch’s way into the connected world.

After all, data is the oil of the global economy. When the India team took its first steps in this area, the Bosch Research and Technology Center in California was already exploring this new field. Since then, the data scientists in the U.S. and in India – 12,500 kilometers apart – have joined forces in a newly established agile service team. The team members share the findings of their data analytics projects and use them as the basis for deriving best-practice solutions. Currently, more than 50 data scientists share in the digital back and forth between the two locations. Hauke Schmidt – the head of the global data mining organization – and Lavanya Uppala confer with each other nearly every day, the former in Palo Alto and the latter in Bengaluru. They run their teams like a start-up business. 

Bosch embraces big data, and the associated analytics algorithms benefit its core business. In the information society, mass is the basis for quality. The ability to generate new knowledge from big data is a key competence of the future. That’s why, on the new Renningen research campus and in California’s Silicon Valley, the corporate research sector is not only concerned with practical applications, but also has its own expert teams dedicated to developing new methods of evaluating increasing amounts of data.

Lavanya Uppala and her data-mining team meet regularly to exchange information.

Algorithms lead to better products: in India, the members of the data mining team are searching for the right formula for greater customer benefit. Above left: a group lasso formula devised by the Bosch data mining group for big-data applications in manufacturing. It led to optimized inspection processes at the Bosch Rexroth plant in Homburg, Germany, and ultimately to improved valves for agricultural machinery.

Sri Krishnan (second from left) spearheaded Bosch India’s foray into big data at the Bengaluru location from an early stage. Today, 50 data experts work in this promising field.

The first data mining trials at Bosch manufacturing facilities in India resulted in an immediate major improvement to processes. A similar success was scored in the open market by a pilot project with a major railway company; the goal of the project was to get a handle on electronic ticket fraud. “In every internal project, we generate new knowledge for customer projects – and vice versa. This benefits both sides,” says Lavanya Uppala. One of her favorite examples concerns BSH Hausgeräte GmbH. When a customer reports a fault, the patterns derived from big data allow a conclusion as to the most likely cause – and as to the spare part needed – to be drawn very quickly. The benefits of this method of finding solutions faster will soon also be available to car drivers who come to Bosch Car Service for repairs and maintenance.
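
The fault-to-part lookup that Uppala describes can be pictured with a minimal sketch. It assumes a hypothetical repair history of (reported fault, replaced part) pairs. The records, the names, and the simple frequency count are illustrative assumptions, not the method Bosch or BSH actually uses.

```python
from collections import Counter, defaultdict

# Hypothetical repair history: (fault reported by customer, part that fixed it).
# These records are invented for illustration only.
REPAIR_HISTORY = [
    ("no_drain", "drain_pump"),
    ("no_drain", "drain_pump"),
    ("no_drain", "filter"),
    ("no_heat", "heating_element"),
    ("no_heat", "thermostat"),
    ("no_heat", "heating_element"),
]


def build_patterns(history):
    """Count how often each part resolved each reported fault."""
    patterns = defaultdict(Counter)
    for fault, part in history:
        patterns[fault][part] += 1
    return patterns


def most_likely_part(patterns, fault):
    """Return (part, share of past cases) for a fault, or None if unseen."""
    counts = patterns.get(fault)
    if not counts:
        return None
    part, hits = counts.most_common(1)[0]
    return part, hits / sum(counts.values())


patterns = build_patterns(REPAIR_HISTORY)
print(most_likely_part(patterns, "no_drain"))
```

With millions of real service records, the same idea lets a service desk suggest the probable cause and the spare part as soon as the fault is reported, though a production system would presumably weigh far more signals than a single fault label.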

Data is increasingly generated along the entire life cycle of connected “things” – from their development, manufacture, and delivery to their use and maintenance. “Our experience lies in analyzing the data from such industrial processes and sensor streams to predict actions that optimize the use of materials, energy, and resources – and in the process generate huge commercial value. This is in line with our ‘Invented for life’ ethos,” Schmidt says. Compelling evidence for this assertion can be found in the automotive aftermarket. When it applied this approach in partnership with an international automaker, Bosch was able to identify potential warranty cases earlier and improve diagnostics readiness at the automaker’s service centers. Uppala sees this as a perfect example of the interplay between the two teams: “We conducted the analysis, and our colleagues in the Automotive Aftermarket division made the diagnosis. Together, we were able to quickly offer a solution.”  

To get to the office of the data scientist Rama Mohan D in Bengaluru, visitors have to pass an emergency cabinet, with axes and helmets mounted behind glass. There’s hardly a better visual for the subject of data mining. It’s all about digging deep, extracting, and mining in a constantly growing mass of data. A single set of data of the kind used by analysts comprises 1,000 columns and 20 million rows. And the volume of data keeps growing – by three to four terabytes a year in some projects.
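
To get a feel for those dimensions, the short calculation below estimates the size of one such data set. The row and column counts come from the figures above; the eight bytes per value is an assumption (one 64-bit number per cell), since the article does not say how the data is stored.

```python
ROWS = 20_000_000      # rows per data set, as quoted in the article
COLUMNS = 1_000        # columns per data set, as quoted in the article
BYTES_PER_VALUE = 8    # assumption: every cell holds one 64-bit number


def dataset_size(rows, columns, bytes_per_value):
    """Return (number of cells, uncompressed size in bytes)."""
    cells = rows * columns
    return cells, cells * bytes_per_value


cells, size_bytes = dataset_size(ROWS, COLUMNS, BYTES_PER_VALUE)
print(f"{cells:,} cells, about {size_bytes / 1e9:.0f} GB uncompressed")
```

Under that assumption, a single data set holds 20 billion values, or roughly 160 gigabytes before compression, which is already more than an ordinary desktop computer can keep in memory.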

The formula that the Bosch associate appears to be writing so casually on a magnetic board will ultimately be the key with which to wrest a solution and a new business model from ostensibly impenetrable heaps of data taken from various sources. Regardless of where the data comes from – connected industry, social media, or handwritten records – it has to be cleaned up, organized, validated, and prepared. People will always have a key role to play in big data, and not just as programmers. They first have to ask the right questions in order to collect the necessary data, and then compare the analysis with the experience and expectations of practical application. “More than anything else, we have to understand not only the data, but the customers as well,” Mohan D says. “That’s the only way they will be able to gain valuable insights from the data later on.” And it is only then that the big data formula will make sense.