SONAR: find labelling errors in databases
Case 2 out of 16 projects of applied AI
Just under a year ago, Deloitte was approached by a major retailer. Its range consisted of over 30,000 products, and the commodity codes provided by suppliers had to be checked manually for around 600 new products every month. In addition, information had to be entered relating to the VAT rate and any local levies, such as the battery tax that applies in Belgium for products containing batteries. It was not unusual for something to go wrong when it came to this labelling. The retailer asked Deloitte for assistance in checking the information entered by human staff members.
“Previously, we would have carried out spot checks,” says Gerhard Smit, information architect and data analyst at Deloitte. “But then we thought, can’t we automate the checks?” Within one week, he and his team members created a proof of concept: Similarity Observant Network Analytics Report, or SONAR for short. It is a tool that predicts the likelihood that the entered information relating to VAT, the commodity code and local levies in a product database is correct.
It works like this: a client supplies a data file containing as many details as possible – the commercial product description, the VAT rate, the commodity code and an indication of whether or not each local levy applies. But it also contains, for example, the barcode and other information that can assist with understanding the nature of the product.
SONAR compares this information against a customs database containing all commodity codes, a textual description for each commodity code, and the applicable rate of VAT. The comparison results in a percentage to indicate the likelihood that the label added by the client is correct. If a label is more than 80 per cent likely to be incorrect, for example, the product can be checked by a person.
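The comparison step can be illustrated with a small sketch. The article does not say how SONAR actually computes its percentage, so everything here is an assumption made for illustration: the word-overlap (Jaccard) similarity, the `Product` and `ReferenceEntry` records, the placeholder commodity codes and VAT rates, and the rule that a VAT mismatch counts as a certain error. Only the idea of comparing against a reference database and the 80 per cent review threshold come from the text.

```python
import re
from dataclasses import dataclass


@dataclass
class Product:
    description: str      # commercial product description from the client file
    commodity_code: str   # code entered by the client (placeholder values below)
    vat_rate: float       # VAT rate entered by the client


@dataclass
class ReferenceEntry:
    description: str      # textual description in the reference database
    vat_rate: float       # VAT rate that applies to this commodity code


def tokens(text: str) -> set[str]:
    """Lower-case a text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a: str, b: str) -> float:
    """Jaccard word overlap, an assumed stand-in for SONAR's real measure."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def likelihood_incorrect(product: Product, reference: dict[str, ReferenceEntry]) -> float:
    """Return a score in [0, 1]; higher means the label looks more suspect."""
    entry = reference.get(product.commodity_code)
    if entry is None or product.vat_rate != entry.vat_rate:
        return 1.0  # assumption: unknown code or VAT mismatch is always flagged
    own = similarity(product.description, entry.description)
    best = max(similarity(product.description, e.description) for e in reference.values())
    if best == 0.0:
        return 1.0  # description matches nothing, so leave it to a person
    # Compare the assigned code's match with the best match over all codes.
    return 1.0 - own / best


REVIEW_THRESHOLD = 0.80  # the "more than 80 per cent" example from the text

# Invented reference data; the codes and rates are placeholders, not real ones.
reference = {
    "A100": ReferenceEntry("bicycle lamp and lighting equipment", 21.0),
    "B200": ReferenceEntry("mobile telephone and smartphone", 21.0),
}
products = [
    Product("LED bicycle front lamp", "A100", 21.0),
    Product("LED bicycle front lamp", "B200", 21.0),
]

flagged = [p for p in products if likelihood_incorrect(p, reference) > REVIEW_THRESHOLD]
for p in flagged:
    print(f"Check manually: {p.description!r} labelled {p.commodity_code}")
```

In this toy data only the lamp labelled with the telephone code is passed to a person; the correctly labelled one goes through without review. A production tool would presumably use a much richer similarity measure and more product attributes, such as the barcode mentioned above.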
A great deal of label-related work is simple, but new, innovative products often require additional attention. “Legislation often fails to keep up with reality,” remarks Smit. “Take smartphones. Should we classify them as a phone, or as a navigation system, for example?” Such cases need to be assessed by an expert. SONAR allows the checking of the vast majority of products to be automated, so that additional attention can be paid to the difficult cases.
The SONAR team went to a shop together with the client to test the tool, and carried out a random check on a shelf of bicycle lights. In the case of one bicycle light, SONAR indicated that something was likely to be incorrect regarding the battery tax. “Upon closer inspection, it turned out that there was indeed a small battery included in the packaging, although that wasn’t included in the description,” recalls Smit. “We thought it was highly amusing: something we had built within a week had an immediate impact.”
SONAR was developed for a client, but Smit believes the technology is generic enough to be applied to other problems. It works particularly well with databases containing at least 2,500 products, and a reference database must be available. “SONAR allows you to check the information entered by humans far more quickly and accurately,” asserts Smit. “And the best part about it is, the more often you use the technology and the more product information that becomes available, the more accurate the results will be.”
*) This case is part of a series of 16 Artificial Intelligence projects from Deloitte. The cases in the series, in random order:
- TAX-I: A virtual legal research assistant
- AI Benchmark
- SONAR: Find labelling errors in databases
- Transaction detector with regard to the Dutch work cost regulations
- GRAPA: assistance with risk strategies
- Chatbot as a handy search tool for the online technical library
- Argus: an eye for detail
- PostNL: optimising delivery times
- Virtual assistants: beyond the hype
- HR agent Edgy: the future of Human Resources
- Using machine learning to assess risks for insurance policies
- Predicting payment behaviour
- DocQMiner: contract analysis performed in no time at all
- Combating welfare fraud with machine learning
- Using machine learning and network analytics to search for a needle in a haystack
- Clustering unstructured information in BrainSpace