More varied tasks required to truly test computer vision


Researchers have shown that current tests of computer vision do not reflect the difficulty of recognising objects in a natural, varying environment. The research suggests that more demanding tasks are required if we are to measure progress in machine vision technology.

Conventional tests require vision systems to recognise objects in sets of photographic images. State-of-the-art machine vision systems correctly recognise objects about 60 per cent of the time, but James DiCarlo and his team from MIT, USA, believed these tests were too easy to give a realistic estimate of how such systems would fare in the real world, because the photographs frequently repeat the same views and contexts, with centrally placed, ‘obvious’ objects.

‘We suspected that the supposedly natural images in current computer vision tests do not really engage the central problem of variability, and that our intuitions about what makes objects hard or easy to recognise are incorrect,’ Nicolas Pinto, one of the researchers, explained.

To test their theory, the team created a very simple ‘toy’ vision system, which was then set to compete with more advanced systems in these tests. The toy was designed to capture low-level information about the position and orientation of line boundaries, while lacking the more sophisticated analysis that happens in later stages of visual processing to extract information about higher-level features of the visual scene such as shapes, surfaces or spaces between objects.
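As a rough illustration of what ‘low-level information about the position and orientation of line boundaries’ can mean in practice, the sketch below pools edge strength into a few orientation bins for each small cell of a grey-level image. It is not the researchers' model: the function name `edge_orientation_features`, the cell size and the number of bins are all assumptions made for this example.

```python
import math


def edge_orientation_features(image, n_bins=4, cell=2):
    """Pool edge strength into orientation bins for each cell of the image.

    `image` is a list of equal-length rows of grey levels. The result maps
    (cell_row, cell_col) to a list of `n_bins` pooled gradient magnitudes.
    Hypothetical helper for illustration only.
    """
    height, width = len(image), len(image[0])
    features = {}
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the local brightness gradient.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region: no boundary here
            # Fold the gradient direction into [0, pi): a boundary has the
            # same orientation whichever side of it is brighter.
            angle = math.atan2(gy, gx) % math.pi
            # Round to the nearest bin centre (bin 0 = horizontal gradient,
            # i.e. a vertical boundary).
            b = int(angle / math.pi * n_bins + 0.5) % n_bins
            key = (y // cell, x // cell)  # coarse position of the boundary
            features.setdefault(key, [0.0] * n_bins)[b] += magnitude
    return features


# A dark-to-bright step running down the image: a single vertical boundary.
vertical_edge = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
features = edge_orientation_features(vertical_edge)
```

Features of this kind record where boundaries are and which way they run, but nothing about shapes or surfaces, which is the sense in which such a system stays at the lowest level of visual processing.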

When tested on conventional images, the simple ‘toy’ vision system performed just as well as the more advanced systems, suggesting that only the most basic layers of visual processing are necessary to perform the task.

The team then devised a more carefully controlled test with just two categories, planes and cars, and introduced variations in position, size and orientation that better reflect the range of variation in the real world. Because the test contained just two types of object, conventional thinking would suggest that the simple system should find them very easy to tell apart. However, the variability meant that the system actually performed very poorly, suggesting that this test is a better measure of visual processing capability.
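The kind of controlled variation described here can be sketched in a few lines. The example below applies a randomly drawn change of size, orientation and position to a 2D outline; the article does not say how the researchers generated their images, so the function names and the ranges used are purely illustrative assumptions.

```python
import math
import random


def transform_outline(points, scale=1.0, angle=0.0, shift=(0.0, 0.0)):
    """Scale, rotate (radians, counter-clockwise) and translate 2D points."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    dx, dy = shift
    return [
        (scale * (x * cos_a - y * sin_a) + dx,
         scale * (x * sin_a + y * cos_a) + dy)
        for x, y in points
    ]


def random_variation(points, rng, max_shift=0.5, scale_range=(0.5, 1.5)):
    """Apply one randomly drawn pose; the ranges are illustrative assumptions."""
    scale = rng.uniform(*scale_range)
    angle = rng.uniform(0.0, 2.0 * math.pi)
    shift = (rng.uniform(-max_shift, max_shift),
             rng.uniform(-max_shift, max_shift))
    return transform_outline(points, scale, angle, shift)


# A crude triangular 'object' shown in five different random poses.
outline = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]
rng = random.Random(0)  # seeded so the variations are reproducible
variations = [random_variation(outline, rng) for _ in range(5)]
```

Each variation shows the same object, yet the positions and orientations of its boundaries differ every time, which is why a system that relies only on those low-level cues struggles.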
