To Infinity and Beyond: Machine Learning and Data Conversion

Machine Learning (ML) and Artificial Intelligence (AI) are the current buzzwords. Everyone claims to have them, and that they will do wonders and solve the world’s problems. But what are they? Are they like The Terminator’s Skynet, The Matrix, Ex Machina or Isaac Asimov’s I, Robot? Well, right now, no, but how long before movie sci-fi becomes reality? We already have driverless cars, which were the subject of one of Isaac Asimov’s short stories, Sally, back in 1953.

ML is essentially a set of programs that don’t use hard and fast rules to produce an output; instead, they use patterns they have seen before to determine what the answer should or could be. ML programs ‘learn’ by mapping input information to output answers, just like we did when we first started school. Remember the picture cards?

[picture of a cat] = CAT, [picture of a dog] = DOG

After we saw them often enough, we learned what a CAT was and what a DOG was. But think what would have happened if the teacher had got the cards mixed up! We would have learned the wrong thing, and machine learning programs are the same. Give them the wrong answers when teaching them, and they will continue to produce incorrect output. It is imperative that when we train the programs, we match each input with the right output; in this sense, ML learns by rote. In addition, when the program gets it wrong, we need a correction loop that provides the correct answer, just like teachers do with their students in school.
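To make the picture-card idea concrete, here is a minimal sketch in Python. It is not how any production ML engine works; it is a toy nearest-neighbour learner that simply memorises labelled cards. The two features (weight and ear pointiness), the card values and all names are hypothetical, chosen only for illustration.

```python
from math import dist


class PictureCardLearner:
    """Toy 1-nearest-neighbour learner: it memorises labelled examples."""

    def __init__(self):
        self.examples = []  # list of (features, label) pairs

    def teach(self, features, label):
        self.examples.append((tuple(features), label))

    def predict(self, features):
        # Answer with the label of the closest card seen so far.
        _, label = min(self.examples, key=lambda ex: dist(ex[0], features))
        return label

    def correct(self, features, right_label):
        # Correction loop: drop any stored answer for this input,
        # then store the answer supplied by the teacher.
        features = tuple(features)
        self.examples = [ex for ex in self.examples if ex[0] != features]
        self.teach(features, right_label)


# Hypothetical cards: (weight in kg, ear pointiness from 0 to 1) -> label
cards = [((4.0, 0.9), "CAT"), ((5.0, 0.8), "CAT"),
         ((20.0, 0.3), "DOG"), ((30.0, 0.2), "DOG")]
swap = {"CAT": "DOG", "DOG": "CAT"}

good = PictureCardLearner()
mixed_up = PictureCardLearner()
for features, label in cards:
    good.teach(features, label)            # teacher shows the right cards
    mixed_up.teach(features, swap[label])  # teacher got the cards mixed up

new_card = (4.2, 0.9)                      # a cat we have not seen before
good_answer = good.predict(new_card)
before_fix = mixed_up.predict(new_card)    # learned the wrong thing
mixed_up.correct(new_card, "CAT")          # teacher supplies the right answer
after_fix = mixed_up.predict(new_card)
```

The learner taught with the right cards answers CAT, the one taught with mixed-up cards answers DOG, and the correction step fixes its answer for that card.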

People also claim that their ML engines can do anything, and that the same algorithms can be used for every problem we are trying to solve. This is not the case. There are many different flavors of ML algorithms and techniques, for example neural nets, deep neural networks, image recognition, logistic regression, clustering, word embeddings, Elasticsearch, probabilistic graphical models, AI search and so on. Some algorithms are better than others at particular tasks. Just like at school, where the way we learned languages was different from the way we learned Math or Science, the different ML algorithms work and ‘learn’ differently.

But they have something in common: just like us, they do better when they start simple and then move on to the more complicated. Teach words before whole sentences, then move on to novels; teach addition before multiplication, then move on to calculus. Our experience tells us that it is easier to break big, complex problems into nice, easy-to-chew pieces, solve each of the small pieces, and then join those solutions together to solve the big one. ML is the same. When you are trying to solve a problem with ML, first break it into smaller constituent steps, choose the best ML approach for each one (one size does not fit all 😊), and then join the small steps together to get the full solution.

In the end, the ML engine uses its experience to suggest an answer. In the example of the picture cards, we saw many different pictures of CATS and learned the numerous variations. The more we saw, the better our answer would be when we were shown a picture we hadn’t seen before; we could still interpret it and identify it as a CAT. ML is the same: the more variations you can feed it, the better its answers will be. In this case, variety is the spice of life. If the ML engine only sees one set of data, then when the input is something it has not seen before, its interpretation won’t be as good. The results you get from ML engines are still statistical; each answer is the best fit the ML can come up with.

How do we get better confidence in the answer? The best way is the same way you do it with people. Show the picture to three different people, and if they all say CAT, then you can be pretty sure that CAT is right, because it is unlikely that all three will give the wrong answer. But if one of them says DOG, then maybe you get an expert to look at the picture and say what it is. ML can do the same: try to use at least three ‘different’ algorithms (not the same one, as that is like showing the picture to the same person three times; you would expect the same answer 😊). If the different algorithms all give the same answer, the confidence that it is right is high. If they differ, have an expert look at it and feed the correct answer back so the algorithms can learn and get it right next time.
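The ‘ask three different people’ check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three ‘algorithms’ below are simple hand-written stand-in rules, not real ML models, and the card fields, thresholds and function names are all hypothetical.

```python
from collections import Counter


# Three deliberately different stand-in "algorithms" (hypothetical rules),
# each looking at a different feature of the animal on the card.
def by_weight(card):
    return "CAT" if card["weight_kg"] < 8 else "DOG"


def by_ears(card):
    return "CAT" if card["ear_pointiness"] > 0.6 else "DOG"


def by_sound(card):
    return "CAT" if card["sound"] == "meow" else "DOG"


def classify(card, models, review_queue):
    """Accept a unanimous answer; otherwise queue the card for an expert."""
    answers = [model(card) for model in models]
    label, votes = Counter(answers).most_common(1)[0]
    if votes == len(answers):
        return label                      # everyone agrees: high confidence
    review_queue.append((card, answers))  # disagreement: ask an expert
    return None


models = [by_weight, by_ears, by_sound]
review_queue = []

easy_card = {"weight_kg": 4.0, "ear_pointiness": 0.9, "sound": "meow"}
tricky_card = {"weight_kg": 3.0, "ear_pointiness": 0.8, "sound": "woof"}

easy_result = classify(easy_card, models, review_queue)      # all say CAT
tricky_result = classify(tricky_card, models, review_queue)  # one says DOG
```

The easy card gets a unanimous CAT, while the tricky card (a small, pointy-eared animal that barks) splits the vote and lands in the review queue. In a real system, the expert’s answer for each queued card would then be fed back as new training data.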

At Utopia, we have been using ML for about three years. Our main application takes unstructured (or poorly structured) MRO material descriptions and converts them into nicely structured, cleansed, standardized, and sometimes even enriched data. We move from this

[image: example of unstructured MRO material descriptions]

to this

[image: the same data after structuring and cleansing]

The use of ML has enabled us to streamline our approach, accelerate our delivery for clients and improve the quality of the output.

This quick read is part 1 of a 5-part series on leveraging data for operational, organizational and asset excellence. The series is about doing more with less, working smarter rather than harder, and leveraging the advancements available today to seize the opportunities of tomorrow. Published weekly, it will provide you with the tools and tactics to gain a competitive advantage with high-quality, accurate and well-governed data.

To Infinity and Beyond: How Machine Learning is Accelerating Delivery, Maximizing Quality and Streamlining Efficiency in Data Conversion

Contact us today for a 15-minute discussion with one of our Subject-Matter Experts