Part 3: Applications of Artificial Intelligence  

Zooming into five exciting applications  

Juri van de Gevel, Sjors Broersen and Carmen Wolvius - May 2017 - Deloitte

Having reviewed the meaning of AI and the main techniques it involves in the earlier parts of this series, let us take the next step and discuss five exciting applications in which we will see great development in the coming years:

  • Image recognition
  • Speech recognition
  • Translation
  • Q&A / chatbots
  • Games

These developments will make applications cheaper and more accurate, opening the door for businesses to use them.

1. Image recognition

Recognizing images is an easy task for most of us. We have no trouble differentiating a car from a tiger, or recognizing that a car is still a car when we see it from the front instead of from the side. This task has proven considerably more difficult for computers, but recent progress in image recognition accuracy has resulted in interesting applications. Because vendors like Google and IBM offer their pre-built algorithms as open source, and software libraries like TensorFlow make it possible to construct your own algorithms, visual recognition is becoming more accessible to the public.

Well-known applications of image recognition are Google’s shopper app and facial recognition for security cameras. Such applications already use image recognition on a daily basis, but there has also been a lot of development in other areas over the last few years. IBM Watson, which we know from playing Jeopardy!, has developed its image recognition skills in the field of medicine. IBM Research has been working on deep learning techniques for computer vision that could be used to recognize whether skin irregularities are melanoma. The researchers created an ensemble of methods that segment skin lesions and analyze the affected area and surrounding tissue for melanoma, and tested it on a large publicly available dataset, as they describe in a pre-print of their article. IBM’s vision is that at a certain point medical staff can send a picture of a skin irregularity to Watson, the same way that they send blood samples to the lab.

Facial recognition, which we mainly know from security cameras, is also being applied in other areas. A survey of 150 retail executives by Computer Sciences Corporation, held in the UK in 2015, suggested that a quarter of all British shops use facial recognition software. The software is used for security, as one might expect, but also to track customers in order to observe how they respond to product displays, or to follow the traffic flow in the store. This is a familiar concept in web shops, where A/B testing is used to see which website layout yields the best profits. The survey suggests that facial recognition tools can be used to run such tests live in a physical store.

Several software development companies offer facial recognition for retailers. They apply specific algorithms that use facial landmarks to recognize and distinguish between faces. These faces can be saved and later matched to enhance the customer experience and personalize service.
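The save-and-match idea described above can be illustrated with a minimal sketch. It assumes that each face has already been reduced to a short list of normalized landmark measurements. The names `KNOWN_FACES`, `MATCH_THRESHOLD` and `match_face`, as well as all numbers, are hypothetical and chosen for illustration only. Commercial systems use far richer features and tuned thresholds.

```python
import math

# Hypothetical stored faces: each is a short vector of normalized landmark
# measurements (for example eye distance, nose length, mouth width).
KNOWN_FACES = {
    "customer_a": [0.42, 0.31, 0.55],
    "customer_b": [0.38, 0.36, 0.49],
}

# Assumed cut-off; a real system would tune this on labelled data.
MATCH_THRESHOLD = 0.05


def match_face(measurements):
    """Return the closest stored face, or None if nothing is close enough."""
    best_name, best_distance = None, float("inf")
    for name, stored in KNOWN_FACES.items():
        # Euclidean distance between the new and the stored measurements.
        distance = math.dist(measurements, stored)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= MATCH_THRESHOLD else None


print(match_face([0.41, 0.32, 0.55]))
print(match_face([0.90, 0.90, 0.90]))
```

The first call is close to a stored face and returns its name. The second is far from every stored face, so the function reports no match instead of forcing one.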

2. Speech recognition

Speech recognition is an AI application that recognizes speech and turns spoken words into written words. It is rarely used on its own, but it is widely used as an addition to chatbots, virtual agents and mobile applications. Well-known examples are Apple’s Siri, Google Home and Amazon’s Alexa. Speech recognition dates back to 1952 and ‘Audrey’, a system that was able to recognize digits spoken by a single voice, which is quite impressive given the computers of that time. Today we have applications on our phones and in our cars that can respond to our voice.

Not only has the number of applications with voice recognition capability increased, the accuracy of converting voice to words has also improved dramatically over the last few years, according to KPCB.

One business application that has gained quite a lot of ground is the use of speech recognition in health care. Many physicians work with an electronic health record (EHR) to document patient information, but this has been said to slow down consultations and to restrict the patient narrative. With speech recognition, patient documentation can be recorded in a flexible and fast manner, which allows the physician to pay more attention to the patient. This solution is already offered by vendors such as Nuance and M*Modal.

The rapid improvement of speech recognition accuracy offers many opportunities in the near future. Having all our software and hardware controlled by voice might not be as far away as many people think.

3. Translation

A different topic with large business implications is automatic translation, which can be defined as the process of translating text from one language to another by using software. Traditionally, translation was done by substituting each word with its closest counterpart in the other language. While this works reasonably well for single words, word pairs and full sentences are generally harder to process correctly. The relations between words are important for the meaning of a sentence, and such nuances cannot be captured when each word is analyzed separately.
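The weakness of word-by-word substitution is easy to reproduce. The sketch below is a toy example and not how any production system works. The dictionary `WORD_TABLE` and the function `translate_word_by_word` are hypothetical, and the English-to-Dutch word pairs were picked only to illustrate the problem.

```python
# Toy English-to-Dutch dictionary (hypothetical; a real system needs far more entries).
WORD_TABLE = {
    "i": "ik",
    "give": "geef",
    "up": "omhoog",   # "up" in the sense of "upwards"
    "the": "de",
    "plan": "plan",
}


def translate_word_by_word(sentence: str) -> str:
    """Replace each word with its dictionary counterpart; keep unknown words as-is."""
    words = sentence.lower().split()
    return " ".join(WORD_TABLE.get(word, word) for word in words)


print(translate_word_by_word("I give up"))
```

Each word is translated in isolation, so the phrase "give up" is rendered literally as "geef omhoog" (roughly "give upwards"), whereas a fluent Dutch translation would be something like "ik geef het op". The meaning sits in the combination of words, which is exactly what a per-word lookup cannot see.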

Deep learning has significantly improved the quality of machine translation by completely shifting the paradigm. Rather than working in a rule-based way, powered by human decision making, a neural network learns the mapping between languages from large numbers of example translations. On relatively basic texts, translations by the Google Neural Machine Translation (GNMT) system approach the quality of human translators. An experiment even showed that a model trained to translate between English and Korean and between English and Japanese is able to translate Korean to Japanese reasonably well, without any prior training focused on the link between those two languages. One article even asked the question: “Have computers invented their own internal language?”

The impact of quality translations in a global economy is enormous. Business translation was originally dominated by conversions between European languages, but the need for translations to Chinese, Japanese and Korean is increasing. A simple example is one that Uber was investigating, where automatic translation takes place between you and your local Uber driver, who can only communicate in Chinese.

4. Question answering / chatbots

Q&A agents or chatbots are another example of applying AI to language. Conversational agents are usually distinguished by their domain and by the way they generate an answer. A chatbot can be focused on answering questions in an open or a closed domain. When it operates in an open domain, it should be able to answer general questions on any topic (see for example Cleverbot). This is generally harder than a closed domain, which concerns only a limited number of topics. Closed domains, however, have very good business applications, such as answering questions at helpdesks. A couple of years ago, interest in question answering surged when IBM Watson beat human contestants at Jeopardy!, a well-known American quiz show. More recently, Google made another breakthrough by giving chatbots a short-term memory, which lets them mimic real-life conversations more realistically.

The ability of machines to recognize the intent (or purpose) of a question and to answer it in a variety of ways is again something that can be seen in many business applications. After an intent is distilled from the command of a user, it can be linked to a specific follow-up action. This action can range anywhere from asking a return question to retrieving information from the internet. In customer service, chatbots are quickly becoming the norm, one example being IPsoft’s Amelia. Standard queries are already handled automatically, with only the difficult ones being forwarded to human decision makers. Question answering has also been introduced in the field of law, where lawyers can pose questions about legal cases in natural language to an intelligent assistant. The assistant can respond to the query with the relevant passage, drawn from high-quality legal documentation.
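The flow from user message to intent to follow-up action can be sketched in a few lines. This is a deliberately simple illustration: it swaps in keyword overlap where a real chatbot would typically use a trained language model or classifier, and the intents, keywords and replies (`INTENT_KEYWORDS`, `detect_intent`, `respond`) are hypothetical examples for a closed-domain helpdesk.

```python
import re

# Hypothetical intents and trigger keywords for a closed-domain helpdesk bot.
INTENT_KEYWORDS = {
    "reset_password": {"password", "reset", "login"},
    "track_order": {"order", "delivery", "package"},
}


def detect_intent(message: str) -> str:
    """Return the intent whose keywords overlap most with the message."""
    tokens = set(re.findall(r"[a-z]+", message.lower()))
    best_intent, best_score = "unknown", 0
    for intent, keywords in INTENT_KEYWORDS.items():
        score = len(tokens & keywords)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


def respond(message: str) -> str:
    """Link the detected intent to a follow-up action (here: a return question)."""
    intent = detect_intent(message)
    if intent == "reset_password":
        return "I can help with that. What is the email address on your account?"
    if intent == "track_order":
        return "Sure. What is your order number?"
    # Anything the bot does not recognize is handed over to a person.
    return "Let me forward you to a human colleague."


print(respond("I forgot my password"))
print(respond("What is the weather today?"))
```

Standard queries map to an automatic follow-up question, while a message without a recognized intent is forwarded to a human, mirroring the division of work described above.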

5. Games / solvers

One of the most exciting applications of AI lies in playing games. Playing a game well requires you not only to know the rules, but also to calculate the possible next moves within these rules, and finally to make a careful judgement on which move gives you the best chance to win. If computers can play games as well as human players, there is no reason why they cannot learn other difficult tasks that people do in their daily work (although human supervision will probably remain necessary).

Recently there was a big step forward in the field of games when the world champion of Go was beaten by a computer for the first time. Go is a game that cannot be solved by brute-force calculation, since the number of possible move sequences is far higher than the number of stars in the universe. The top Go players of the world rely to a large extent on their intuition to find the best moves. Google’s AlphaGo (a Go engine based on neural networks), however, learned how to play like a top human player by studying millions of human games. It then became even stronger by playing against another version of itself millions of times, which finally enabled it to beat the world champion. If computers can beat human players in one of the most complicated games that currently exist, then where do the possibilities for AI stop?
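A quick back-of-the-envelope calculation shows why brute force is out of reach. The sketch below counts board configurations rather than sequences of moves, which is a simpler quantity that is already astronomically large. The star count `ESTIMATED_STARS` is an assumption: a commonly quoted rough order of magnitude, not a figure from this article.

```python
BOARD_POINTS = 19 * 19       # 361 intersections on a standard Go board
STATES_PER_POINT = 3         # each point is empty, black or white

# Upper bound on board configurations (this count includes illegal positions).
board_configurations = STATES_PER_POINT ** BOARD_POINTS

# Assumption: rough order-of-magnitude estimate of stars in the observable universe.
ESTIMATED_STARS = 10 ** 24

digits = len(str(board_configurations))
print(f"Board configurations: a number with {digits} digits")
print(f"Exceeds the star estimate: {board_configurations > ESTIMATED_STARS}")
```

Even this simple count dwarfs the star estimate by well over a hundred orders of magnitude, and the number of possible games is larger still. This is why a Go engine has to rely on learned judgement instead of enumerating every possibility.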

One big advantage people still have over computers is that we can take our knowledge and training in one area and apply it to a new task or area. For example, good Go players can apply their way of thinking to solve daily problems in their jobs. AlphaGo cannot do this: it is only good at playing Go and nothing else. When you make it learn something else, like chess, it will lose its ability to play Go. Recently, however, a first step was taken in overcoming this problem: neural networks are now able to remember the most important knowledge from one game and at the same time learn a new game. Google DeepMind wrote a new algorithm that allowed a neural network to learn 10 Atari games at the same time and play them at human-level performance.

Once this field is more developed, computers will be able to perform series of difficult tasks that at the moment only people can perform. Google itself uses it to lower the energy bills of its large datacenters. The AI controls over 120 variables in Google’s datacenters, such as the windows, fans and cooling systems, optimizing for energy usage while keeping computing performance up. The optimization could lower Google’s energy bill by hundreds of millions over several years. Another application lies in healthcare, where an app by DeepMind saves nurses over two hours per day by warning about upcoming acute kidney failure. These are two applications, but since this is a newly developed field there is huge potential for more. Think about predicting stock prices or optimizing the layout of distribution centers. Imagination and available data are the limit.
