August 22, 2018 (updated 18 Jul 2019 8:26am)

Taking Machine Learning from theory to practice

By Dr Gero Presser and Jean-Michel Franco

In the last few years, Machine Learning has quickly gone from a niche subject to one with significant relevance to many companies and organisations.

Across industries ranging from pharmaceuticals and health care to retail and financial services, Machine Learning is increasingly used to meet new business requirements. But what is Machine Learning, how does it work, and how do you teach a machine to learn?

The “cat” demonstration

Computers are supposed to solve tasks for humans. The traditional approach is to “program” the desired procedure; in other words, we teach the computer a suitable problem-solving algorithm. The algorithm is a detailed description of a procedure, similar to a recipe. Many tasks can be described effectively by an algorithm. For example, in elementary school, we all learned the algorithms used to add numbers. When it comes to carrying out algorithms of this kind quickly and flawlessly, computers are far superior to humans.
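As a rough illustration of such a recipe, the column-by-column addition taught in school can be written down in a few lines of Python. This is only a sketch: the function name add_digits and the choice to pass numbers as strings of digits are our own assumptions, made so that each step of the paper method stays visible.

```python
def add_digits(a: str, b: str) -> str:
    """Add two non-negative whole numbers given as strings of decimal digits."""
    result = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    # Work from the rightmost column to the leftmost, as on paper.
    while i >= 0 or j >= 0 or carry:
        digit_a = int(a[i]) if i >= 0 else 0
        digit_b = int(b[j]) if j >= 0 else 0
        total = digit_a + digit_b + carry
        result.append(str(total % 10))  # digit written down in this column
        carry = total // 10             # digit carried over to the next column
        i -= 1
        j -= 1
    # The digits were collected right to left, so reverse them at the end.
    return "".join(reversed(result)) or "0"


print(add_digits("478", "964"))
```

Every step is spelled out in advance by the programmer; the computer simply follows it.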

However, this procedure has its limitations. How do we recognise a photo of a cat? This apparently easy task is difficult to structure as an algorithm.

Let’s pause for a moment and think about it. Even simple instructions such as “has four legs” or “has two eyes” have their drawbacks, because these features may be hidden, or the photo might only show part of the cat. Then we encounter the next task of recognising a leg or an eye, which is just as difficult as identifying a cat.

This is exactly where the strength of machine learning lies. Rather than having to develop an algorithm to solve the problem, the computer uses examples to learn the algorithm for itself.

We train the computer on the basis of samples. Using our cat example, this could mean that we train the system using a large number of photos, with those depicting a cat labelled accordingly (supervised learning). In this way, an algorithm evolves and matures that is eventually capable of recognising cats in unfamiliar pictures.
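To make the idea of learning from labelled samples concrete, here is a deliberately tiny sketch in Python. It is not how real image recognition works. We assume each photo has already been reduced to two made-up numbers (hypothetical "ear pointiness" and "whisker density" scores), and we use one of the simplest supervised methods, a nearest-centroid classifier. All names and values are invented for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equally long feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def train(samples):
    """Average the feature vectors of each label to get one centroid per label."""
    sums, counts = {}, {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features):
    """Return the label whose centroid lies closest to the given features."""
    return min(centroids, key=lambda label: squared_distance(centroids[label], features))


# Invented training data: (ear pointiness, whisker density), each between 0 and 1.
TRAINING_DATA = [
    ((0.9, 0.8), "cat"),
    ((0.8, 0.9), "cat"),
    ((0.7, 0.7), "cat"),
    ((0.2, 0.1), "not cat"),
    ((0.1, 0.3), "not cat"),
    ((0.3, 0.2), "not cat"),
]

model = train(TRAINING_DATA)
print(predict(model, (0.85, 0.75)))  # a sample the model has never seen
```

Nobody wrote a rule saying what a cat looks like; the decision follows entirely from the labelled examples.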

Machine Learning vs Deep Learning

In this situation, the computer usually learns not a classical program but the parameters of a model, for example the edge weights within a network.

This principle can be compared with the learning process in our brain (at least, as far as we understand it), in which connections between nerve cells (neurons) adapt. Like the brain, and unlike a classical program, this network with its edge weights is virtually impossible for humans to interpret.
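What "learning edge weights" means can be sketched with a single artificial neuron. The Python example below is a toy under our own assumptions: two invented yes/no inputs (say, "has whiskers" and "has pointed ears"), a target of 1 only when both are present, and the classic perceptron update rule. Real deep learning networks have many layers and vastly more weights, trained with more sophisticated methods.

```python
def fire(weights, bias, inputs):
    """Output 1 if the weighted sum of the inputs exceeds zero, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_neuron(samples, epochs=20, learning_rate=1.0):
    """Adjust the weights a little after every wrong answer (perceptron rule)."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - fire(weights, bias, inputs)  # -1, 0 or +1
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Invented samples: (has whiskers, has pointed ears) -> 1 only if both are present.
SAMPLES = [
    ((0, 0), 0),
    ((0, 1), 0),
    ((1, 0), 0),
    ((1, 1), 1),
]

weights, bias = train_neuron(SAMPLES)
print(weights, bias)
print([fire(weights, bias, inputs) for inputs, _ in SAMPLES])
```

The result of training is nothing more than a handful of numbers. Even in this tiny case they do not read like a rule, which hints at why networks with millions of weights are so hard to interpret.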

In this context, a special class of learning methods for artificial neural networks called deep learning has proven to be particularly successful. Deep learning is a specialisation of machine learning, which in turn is a subdiscipline of artificial intelligence, a major branch of research in computer science.

As early as 2012, a Google research team successfully trained a neural network on a cluster of 16,000 processor cores to identify cats (and other object categories) using 10 million images taken from YouTube videos. The procedure employed was deep learning.

A powerful tool for digital transformation

Although the principle of machine learning is not new, it is currently enjoying a surge in popularity. There are three main reasons for this. Firstly, the large quantities of data needed for applications and training (“big data”) are now available. Secondly, we now have the huge computing power required, especially in the cloud. And thirdly, a range of open source projects has made the algorithms accessible to more or less everyone.

Take the example of Bayer Digital Farming GmbH, a unit of the Life Science company Bayer. The company has started a digital transformation project to ensure sustainable food production.

Weeds that damage crops have been a problem for farmers since farming began. One effective approach is to apply a narrow-spectrum herbicide that kills the exact species of weed in the field while having as few undesirable side effects as possible.

But to do that, farmers first need to identify the weeds in their fields accurately. Using Talend Real-time Big Data, Bayer Digital Farming developed WEEDSCOUT, a new application that farmers can download for free.

The app uses machine learning and artificial intelligence to match the weed photos farmers send in against photos of weeds in a Bayer database. It gives growers the opportunity to predict more precisely the impact of their actions, such as choice of seed variety, application rate of crop protection products, or harvest timing.

The technology used for weed-image recognition is based on self-learning algorithms. To help ensure the app’s answers are comprehensive and error-free, WEEDSCOUT’s image database must be fed with further weed images.

The system learns with each photo, thereby continuously improving the recognition process. The app can also be used offline: farmers can save pictures taken on the go and send them once a strong internet connection is available again.

