More varied tasks required to truly test computer vision

Researchers have shown that current tests of computer vision do not reflect the difficulty of recognising objects in a natural, varying environment. The research suggests that more varied tasks are required if we are to measure progress in machine vision technology.

Conventional tests require vision systems to recognise objects contained in photographic image sets. State-of-the-art machine vision systems usually recognise objects correctly about 60 per cent of the time. However, James DiCarlo and his team from MIT, USA, believed these tests are too easy to give a realistic estimate of how such systems would fare in the real world, because the photographs frequently show the same views and contexts, with centrally placed, ‘obvious’ objects.

‘We suspected that the supposedly natural images in current computer vision tests do not really engage the central problem of variability, and that our intuitions about what makes objects hard or easy to recognise are incorrect,’ Nicolas Pinto, one of the researchers, explained.

To test their theory, the team created a very simple ‘toy’ vision system, which was then set to compete with more advanced systems in these tests. The toy was designed to capture low-level information about the position and orientation of line boundaries, while lacking the more sophisticated analysis that happens in later stages of visual processing to extract information about higher-level features of the visual scene such as shapes, surfaces or spaces between objects.
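To give a feel for what ‘low-level information about the orientation of line boundaries’ can mean in practice, here is a minimal Python sketch. It is not the MIT team's model: the function name `orientation_histogram`, the four-bin layout and the use of simple intensity differences are all assumptions made for illustration. It also summarises orientation over the whole image, whereas a system that keeps position as well would compute one such summary per image region.

```python
import math

def orientation_histogram(image, bins=4):
    """Sum gradient strength per orientation bin (illustrative only).

    `image` is a list of equal-length rows of grey levels. The bins
    cover gradient directions from 0 to 180 degrees; the boundary
    itself runs at right angles to the gradient direction.
    """
    hist = [0.0] * bins
    height, width = len(image), len(image[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Differences between neighbours approximate the gradient.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region, no boundary here
            # Fold the direction into [0, pi): a boundary has no sign.
            angle = math.atan2(gy, gx) % math.pi
            index = int(angle / math.pi * bins) % bins
            hist[index] += magnitude
    return hist

# A dark-to-light step from left to right (a vertical boundary).
vertical_step = [[0, 0, 1, 1] for _ in range(4)]
print(orientation_histogram(vertical_step))  # [4.0, 0.0, 0.0, 0.0]
```

Nothing in this sketch knows about shapes, surfaces or the spaces between objects; it only records which way intensity changes and how strongly.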

When tested on conventional images, the simple ‘toy’ vision system performed just as well as the more advanced systems, suggesting that only the most basic layers of visual processing are necessary to perform the task.

The team then devised a more carefully controlled task with just two categories, planes and cars, and introduced variations in position, size and orientation that better reflect the range of variation in the real world. Because the task contained just two types of object, conventional thinking would suggest that the simple system should find it very easy to tell them apart. However, the variability meant that the simple system actually performed very poorly, suggesting that the new task is a better measure of visual processing capability.
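The kind of variation described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the researchers' procedure: the helper `vary_pose`, its parameter ranges and the flat 2D outline standing in for an object are all hypothetical.

```python
import math
import random

def vary_pose(points, rng, max_shift=20.0, scale_range=(0.5, 2.0),
              max_angle=math.pi):
    """Return a copy of a 2D outline with random position, size and
    orientation (hypothetical helper; all ranges are assumptions)."""
    angle = rng.uniform(-max_angle, max_angle)   # orientation
    scale = rng.uniform(*scale_range)            # size
    dx = rng.uniform(-max_shift, max_shift)      # horizontal position
    dy = rng.uniform(-max_shift, max_shift)      # vertical position
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Rotate about the origin, then scale, then shift.
    return [
        (scale * (x * cos_a - y * sin_a) + dx,
         scale * (x * sin_a + y * cos_a) + dy)
        for x, y in points
    ]

rng = random.Random(0)  # fixed seed so the variants are repeatable
outline = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
variants = [vary_pose(outline, rng) for _ in range(5)]
```

Each variant is the same shape, yet its corner coordinates differ from one variant to the next, which hints at why a system that relies on where boundaries fall in the image can struggle once objects are allowed to move, grow and turn.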
