Karsten Roth recognised for advance in anomaly detection

Karsten Roth, a PhD researcher with the Explainable Machine Learning group at the University of Tübingen, has won the EMVA Young Professional Award for work on a neural network for anomaly detection.

Roth was presented with the award at the EMVA business conference last week in Brussels.

PatchCore, which Roth developed during a research internship at Amazon AWS, is an automated visual anomaly detection method that addresses the cold-start problem, in which the model has access only to non-defective example images during training. The model can therefore detect defects without ever having seen one.

The model offers competitive inference times while outperforming competitor methods for both detection and localisation. On the challenging, widely used MVTec anomaly detection benchmark, PatchCore achieves an image-level anomaly detection AUROC score of 99.6 per cent, more than halving the error compared to the next best competitor. AUROC, the area under the receiver operating characteristic curve, measures how well a model's scores separate anomalous samples from normal ones.
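As a rough illustration of the metric, the area under the receiver operating characteristic curve can be read as the probability that a randomly chosen anomalous image receives a higher anomaly score than a randomly chosen normal one. The sketch below computes it by counting pairs; the function name and the scores are made up for illustration and are not taken from PatchCore or the MVTec benchmark.

```python
def auroc(normal_scores, anomalous_scores):
    """Fraction of (anomalous, normal) pairs ranked correctly; ties count half."""
    correct = 0.0
    for a in anomalous_scores:
        for n in normal_scores:
            if a > n:
                correct += 1.0
            elif a == n:
                correct += 0.5
    return correct / (len(anomalous_scores) * len(normal_scores))


# Hypothetical anomaly scores: higher should mean "more likely defective".
normal_scores = [0.1, 0.2, 0.3, 0.4]
anomalous_scores = [0.35, 0.8, 0.9]

# 11 of the 12 (anomalous, normal) pairs are ranked correctly.
example_auroc = auroc(normal_scores, anomalous_scores)
```

On this reading, a score of 99.6 per cent leaves an error of 0.4 per cent, so more than halving the error implies the next best method's error was above 0.8 per cent.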

It’s scalable and a lot more sample efficient too, according to Roth, and can match the previous state-of-the-art methods with as little as three per cent of the training data.

Roth will present the work at the Computer Vision and Pattern Recognition (CVPR) conference in June.

Memory bank subsampling

One of the keys to PatchCore’s performance is the subsampling method Roth used – coreset subsampling rather than random subsampling – to trim the memory bank. Coreset subsampling, unlike random subsampling, aims to retain overall coverage of the feature space in the memory bank.

PatchCore’s network will generate a feature representation for different locations in an image, which it then dumps in a memory bank. The danger is that the memory bank gets very large very quickly, and so subsampling is used to keep its size manageable.

The problem with random subsampling is that there’s the potential to drop rarely occurring feature sets. This is not the case for coreset subsampling. Speaking to Imaging and Machine Vision Europe, Roth explained: ‘[Using coreset subsampling] we are able to reduce the memory bank significantly with minimal drop in performance. This makes approaches that operate on this memory bank significantly – by orders of magnitude – quicker than ones that operate on the big memory bank, but without a drop in performance.’
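The difference between the two subsampling strategies can be sketched with a greedy farthest-point selection, one common way of building a coreset. This is an assumption for illustration and not necessarily the exact procedure used in PatchCore; the function name, the toy two-dimensional features and the bank sizes are all hypothetical.

```python
import math
import random


def greedy_coreset(features, k, seed=0):
    """Pick k indices by repeatedly taking the point farthest from those already chosen."""
    rng = random.Random(seed)
    first = rng.randrange(len(features))
    selected = [first]
    # Distance from every feature to its nearest selected feature so far.
    min_dist = [math.dist(f, features[first]) for f in features]
    while len(selected) < k:
        nxt = max(range(len(features)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, f in enumerate(features):
            d = math.dist(f, features[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy memory bank: 500 common features near the origin, 3 rare ones far away.
rng = random.Random(1)
common = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
rare = [(50 + rng.gauss(0, 0.1), 50 + rng.gauss(0, 0.1)) for _ in range(3)]
bank = common + rare

coreset_idx = greedy_coreset(bank, k=10)
random_idx = random.Random(2).sample(range(len(bank)), 10)

# Indices 500 and above are the rare features.
coreset_keeps_rare = any(i >= 500 for i in coreset_idx)
random_keeps_rare = any(i >= 500 for i in random_idx)
```

Because each step picks the feature least well covered so far, the isolated rare features are chosen early, whereas a random draw of 10 from 503 will usually miss all three.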

Features from test images are then compared to those in the trimmed memory bank. If a test feature is significantly different from the ones in the memory bank, it is likely to indicate a defect.
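A simplified sketch of this comparison is shown below: each patch feature is scored by its distance to the nearest entry in the memory bank, and the image is scored by its most anomalous patch. The function names, toy features and threshold are hypothetical, and the published method may refine the score further.

```python
import math


def patch_score(patch_feature, memory_bank):
    """Distance from one patch feature to its nearest neighbour in the memory bank."""
    return min(math.dist(patch_feature, m) for m in memory_bank)


def image_score(patch_features, memory_bank):
    """Score an image by its most anomalous patch."""
    return max(patch_score(p, memory_bank) for p in patch_features)


# Toy memory bank built from normal data only.
memory_bank = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

# Patch features from two hypothetical test images.
normal_image = [(0.1, 0.0), (0.9, 0.1)]
defect_image = [(0.1, 0.0), (5.0, 5.0)]

threshold = 1.0  # hypothetical; in practice chosen on validation data
normal_flagged = image_score(normal_image, memory_bank) > threshold
defect_flagged = image_score(defect_image, memory_bank) > threshold
```

The per-patch scores also indicate where in the image the anomaly lies, which is what makes localisation possible as well as detection.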

‘The result is a method that has only ever seen normal data, but when you apply it to test data it is able to very accurately detect defects for all kinds of data and products,’ Roth said.

The MVTec anomaly detection benchmark comprises 15 different anomaly detection tasks, with the final performance given as an average across all the tasks. For each of the 15 tasks PatchCore achieves above 90 per cent AUROC with just five images of normal data; 15 images gives more than 95 per cent AUROC. After that there are diminishing returns, in that the method needs a lot more data to bridge the last few per cent. But even so PatchCore's returns diminish far less than those of comparable methods, Roth said; the comparisons were made against SPADE and PaDiM.

Roth said the method has already been used in practice for anomaly detection on solar cell electroluminescence images. He also noted that the method has been replicated, meaning the concepts hold true beyond a specific implementation.

‘It’s very nice to receive this award for this work because it was research that was built around practical needs... instead of making the method complex and convincing from an academic point of view,’ Roth told Imaging and Machine Vision Europe. ‘We wanted something that works well in practice first and then try to convince the academic community of its merits.

‘There tends to be some disconnect between what industry needs and what academia publishes,’ he continued. ‘We were able to find a niche by going from application needs first to academic publication.’

The fact that the work has been accepted for CVPR in June shows its academic merits too.

Roth said the code was written in a way that it is scalable to the hardware the user has available. But, he added, ‘if you have a GPU you can make use of it really aggressively’.

‘We have extended PatchCore with a lot of computation tricks to run on a GPU, and if we do all of these tricks things are even faster,’ he said.

Roth is still optimising the code base. He said he’s looking to potentially develop the method for 3D anomaly detection.

The Explainable Machine Learning group at the University of Tübingen is part of the International Max Planck Research School for Intelligent Systems (IMPRS-IS) and the European Laboratory for Learning and Intelligent Systems (ELLIS). Roth is co-supervised by Zeynep Akata at the University of Tübingen, and Oriol Vinyals, a research scientist at DeepMind.

Roth completed both his bachelor's and master's studies in physics at Heidelberg University in 2021, and spent time in Canada as a researcher at the Montreal Institute for Learning Algorithms and the Vector Institute in Toronto.

The EMVA Young Professional Award honours outstanding work by a student or a young professional in the field of machine vision or image processing. The award encourages students to focus on machine vision challenges and to apply the latest research in computer vision to the practical needs of the industry.

The 2023 EMVA business conference will be held in Seville, Spain; the EMVA will celebrate its 20th anniversary there.
