Do you speak neural network?

Neural networks will be the common language of the future for computer vision, according to Professor Jitendra Malik from UC Berkeley. Greg Blackman listened to his keynote at the Embedded Vision Summit in Santa Clara.

Neural networks will be the primary language of computer vision in the future, rather as English is the common language of the scientific community. At least, that is the hope of Professor Jitendra Malik of the University of California, Berkeley, who was speaking at the Embedded Vision Summit, a computer vision conference organised by the Embedded Vision Alliance and held in Santa Clara, California, from 1 to 3 May.

Half of the technical insight presentations at the conference focused on deep learning and neural networks, a branch of artificial intelligence in which algorithms are trained on large datasets to recognise objects in a scene, as opposed to the traditional approach of writing an algorithm by hand for each specific task.

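To make that contrast concrete, here is a minimal sketch – in PyTorch, with synthetic data and made-up network sizes, none of it code shown at the summit – of training a small network to discover a rule from labelled examples rather than writing the rule by hand.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-ins for images: 1,000 samples of 64 features, two classes.
X = torch.randn(1000, 64)
y = (X[:, 0] > 0).long()  # the hidden rule the network has to discover

# Instead of hand-coding that rule, train a small network to find it.
model = nn.Sequential(
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 2),  # one output score per class
)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(50):
    optimiser.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimiser.step()

accuracy = (model(X).argmax(dim=1) == y).float().mean().item()
print(f"training accuracy: {accuracy:.2f}")  # learned, not hand-written
```
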
Introducing Malik's keynote address, Jeff Bier, the founder of the Embedded Vision Alliance, said that 70 per cent of vision developers surveyed by the Alliance were using neural networks – a huge shift from the 2014 summit, only three years earlier, when hardly anyone was using them.

Deep learning has also reached the industrial machine vision world to some extent: the latest version of MVTec's Halcon software includes an OCR tool based on deep learning, and Vidi Systems, now owned by Cognex, offers a deep learning software suite for machine vision.

Malik went further than saying neural networks merely have their place in computer vision, suggesting that deep learning could be used to unite the field's different strands. He gave the example of 3D vision, where algorithms such as simultaneous localisation and mapping (SLAM) have traditionally been used to model the world in 3D, and where machine learning hasn't been thought suitable. He said that the world of geometry – into which techniques like SLAM fall – and machine learning need to be brought together.

A human will view a chair, for instance, in 3D, while also drawing on past experience of other chairs he or she has seen. In computer vision terms, geometry and machine learning are two very different languages, and Malik argued that, just as scientists communicate in English, a common language should be found for computer vision. ‘In my opinion, it is easier to make everybody learn English, which in this case is neural networks,’ he said.

There are neural networks that are starting to combine these two worlds of thinking, but Malik noted that expressing geometrical data in the language of neural networks requires a fundamental breakthrough. He added that he believes this marriage of geometrical thinking and machine learning-based methods will be achieved over the next couple of years.

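As a crude illustration of what such a combination could look like at its simplest – a hypothetical sketch, not any published architecture, with every size below invented for the example – a network can consume learned appearance features and geometric features side by side.

```python
import torch
import torch.nn as nn

# Hypothetical encoders: one for appearance (image features), one for
# geometry (say, a depth map or SLAM-derived summary); sizes are made up.
image_encoder = nn.Sequential(nn.Linear(256, 64), nn.ReLU())
geometry_encoder = nn.Sequential(nn.Linear(128, 64), nn.ReLU())

# The "common language": both embeddings concatenated into one head.
head = nn.Linear(64 + 64, 10)  # e.g. scores for 10 object classes

image_features = torch.randn(8, 256)     # stand-in for CNN features
geometry_features = torch.randn(8, 128)  # stand-in for geometric features

fused = torch.cat([image_encoder(image_features),
                   geometry_encoder(geometry_features)], dim=1)
scores = head(fused)
print(scores.shape)  # torch.Size([8, 10])
```
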
Malik noted that another exciting area of computer vision research is training machines to make predictions, particularly predictions about people and social behaviour. This involves teaching machines to recognise actions and to make sense of people's behaviour in light of their possible objectives.

He also suggested that computer vision scientists should take note of research carried out in neuroscience, since deep neural networks were originally inspired by findings about the brain. ‘Neuroscientists found phenomena in the brain which led us down this path,’ he said, adding that researchers should keep looking at the neuroscience literature to see whether there are findings that could be exploited.

One other problem in computer vision that Malik felt needed addressing was that of limited data. Neural networks learn about the world from masses of data, but there will always be instances where there isn't enough information. He gave the example of work at the Berkeley Artificial Intelligence Research laboratory in which a robot taught itself to manipulate objects by poking them repeatedly: it is not trained explicitly, but teaches itself. The work uses two different models – a forward one and an inverse one – and the interplay between them gives the robot an accurate means of decision making.

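As a loose rendering of that interplay – with toy vectors and invented dynamics standing in for real images of a robot poking objects, so an assumption-laden sketch rather than the published BAIR method – a forward model can learn to predict the outcome of an action while an inverse model learns to recover the action from the observed change, both supervised by the same self-collected experience.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
STATE, ACTION = 16, 4  # sizes of the toy state and action vectors

# Forward model: given a state and an action (a "poke"), predict the next state.
forward_model = nn.Sequential(
    nn.Linear(STATE + ACTION, 64), nn.ReLU(), nn.Linear(64, STATE))
# Inverse model: given a state and the next state, recover the action taken.
inverse_model = nn.Sequential(
    nn.Linear(STATE * 2, 64), nn.ReLU(), nn.Linear(64, ACTION))

params = list(forward_model.parameters()) + list(inverse_model.parameters())
optimiser = torch.optim.Adam(params, lr=1e-3)

for step in range(200):
    # Self-collected experience: random pokes, no human-provided labels.
    s = torch.randn(128, STATE)
    a = torch.randn(128, ACTION)
    s_next = s + 0.1 * a.repeat(1, STATE // ACTION)  # toy stand-in dynamics

    pred_next = forward_model(torch.cat([s, a], dim=1))
    pred_action = inverse_model(torch.cat([s, s_next], dim=1))

    # Each (state, action, next state) tuple supervises both models at once.
    loss = F.mse_loss(pred_next, s_next) + F.mse_loss(pred_action, a)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

print(f"final joint loss: {loss.item():.4f}")
```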