What can drones learn from bees?

Dr Andrew Schofield, who leads the Visual Image Interpretation in Humans and Machines network in the UK, asks what computer vision can learn from biological vision, and how the two disciplines can collaborate better

What can drones learn from bees? Almost every month the national news feeds carry a story about the latest development in drone aircraft, self-driving cars, or intelligent robot co-workers. If such systems are to achieve mass usage in a mixed environment with human users, they will need advanced vision and artificial reasoning capabilities, and will need to behave, and fail, in ways that are acceptable to humans.

Setting aside recent high-profile crashes involving self-driving cars, the complexity and unreliability of our road systems mean that driverless cars will need to act very much like human drivers. A car that refuses to edge out into heavy traffic will cause gridlock. Likewise, a drone should not fail to deliver its package because the front door at the target house has been painted since the last Google Street View update.

So what can drones learn from drones – or, to be more precise, worker bees? In surveillance, the two may have similar tasks: explore the environment looking for particular targets while avoiding obstacles, and eventually return home. They also have similar payload and power constraints: neither can afford a heavy, power-hungry brain.

The bee achieves its seek-and-locate task with very little neural hardware and a near-zero energy budget. To do so it uses relatively simple navigation, avoidance and detection strategies that produce apparently intelligent behaviour. Much of the technology for this kind of task is already available in the form of optic flow sensors and simple pattern recognisers such as the ubiquitous face locators on camera phones. Even the vastly more complex human brain has within it separate modules or brain regions specialised for short-range, sub-conscious navigation via optic flow and for rapid face detection. However, the human brain is much more adaptable and reliable than even the best computer vision systems.
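Both capabilities are already routine in open-source tools. The sketch below is purely illustrative, assuming OpenCV and placeholder file names: it estimates dense optic flow between two consecutive frames and runs a pre-trained Haar-cascade face locator of the kind found on camera phones. It is not a drone controller, just a demonstration of how little code these 'simple' strategies require.

```python
# Minimal sketch, assuming OpenCV (cv2); file names are placeholders.
import cv2

prev = cv2.imread("frame_0.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_1.png", cv2.IMREAD_GRAYSCALE)

# Dense optic flow (Farneback): per-pixel motion vectors of the kind a
# bee-like controller could use to judge self-motion and obstacle proximity.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    pyr_scale=0.5, levels=3, winsize=15,
                                    iterations=3, poly_n=5, poly_sigma=1.2,
                                    flags=0)

# Rapid face detection with a pre-trained Haar cascade shipped with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
faces = cascade.detectMultiScale(curr, scaleFactor=1.1, minNeighbors=5)

print(f"{len(faces)} face(s) detected; flow field shape: {flow.shape}")
```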

The Visual Image Interpretation in Humans and Machines (ViiHM) network[1], funded by the Engineering and Physical Sciences Research Council, brings together around 250 researchers to foster the translation of discoveries from biological to machine vision systems. This aim is not new. In the early days of machine vision there was a natural crossover between the two fields. The Canny edge detector[2], for example, computes edges as luminance gradients in a blurred (de-noised) image and then links weaker edge elements to stronger ones. This method has its roots in Marr and Hildreth's[3] model of retinal processing, together with the contour-integration mechanisms found in the visual cortex.
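That pipeline is easy to reproduce today. A minimal sketch using OpenCV is given below (the image path is a placeholder): the image is first blurred to suppress noise, gradients are then computed, and hysteresis thresholding keeps weak edge pixels only where they link to strong ones.

```python
# Minimal Canny sketch, assuming OpenCV (cv2); "scene.png" is a placeholder.
import cv2

img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

# De-noise by blurring, echoing Marr and Hildreth's smoothed image.
blurred = cv2.GaussianBlur(img, (5, 5), sigmaX=1.4)

# Gradient computation plus hysteresis thresholding: weak edges survive
# only where they connect to strong ones.
edges = cv2.Canny(blurred, threshold1=50, threshold2=150)

cv2.imwrite("edges.png", edges)
```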

More recent examples of biologically inspired processing include deep convolutional neural networks (DNNs)[4], which have multiple convolution-based filtering layers separated by non-linear operators and downsampling, building increasingly large-scale and complex filters until, finally, classifications can be made. This structure is loosely modelled on, and very similar to, the multiple feature-detection layers and receptive field properties of biological vision systems. Alternatively, the SpikeNet[5] recognition system has a similar convolutional structure but more directly models the production of neuronal action potentials. The relationship between machine and biological vision is symbiotic: convolution filters developed for machine vision are used to model biological processing, and DNNs have been applied to human behavioural data to characterise the visual system.
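The layered structure is easiest to see in code. The toy network below is a sketch in PyTorch, a framework chosen here purely for illustration (the cited ImageNet model is far larger): convolutional filters separated by non-linearities and downsampling, followed by a final classification layer.

```python
# Toy convolutional network sketch; PyTorch is an assumed choice of framework.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # local filtering layer
    nn.ReLU(),                                   # non-linear operator
    nn.MaxPool2d(2),                             # downsample: larger effective receptive fields
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                   # final classification layer (10 classes)
)

x = torch.randn(1, 3, 32, 32)   # one dummy 32x32 RGB image
print(model(x).shape)           # torch.Size([1, 10]) class scores
```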

However, in recent decades the biological and machine vision communities have diverged. Driven by different success criteria – a desire to understand specific visual systems on the one hand, and to rapidly build working engineering solutions on the other – the two disciplines have developed different priorities and ways of working. The ideal development cycle, in which observed phenomena are explored in biology, the results are modelled computationally, and those models are turned into useful applications, can be protracted and requires multiple skill sets. The chain is often broken as academics on the biological vision side rush to publish their findings and get on with the next experiment, while those working in industrial vision rightly employ any and every tool in the quest for better performance. Progress is further hindered by language and understanding barriers, with different terminology used even for the most basic concepts.

To counter this separation, ViiHM has developed a triad of Grand Challenges[6] for intelligent vision where we think success can best be achieved by working together. The overall aim is to produce a general-purpose, embodied, integrated, and adaptive visual system for intelligent robots, mobile and wearable technologies. Within this scope, the Application Challenge is to augment and enhance vision in both the visually impaired and the normally sighted, and to develop cognitive and personal assistants that can help those with low vision, the elderly, or simply the busy executive to deal with everyday tasks. Such aids might extend from wearable technologies that discreetly prompt their user, to fully autonomous robots acting as caregivers and personal assistants. Here it is important that robots think and act like humans while avoiding the 'uncanny valley' effect[7] – where people are repulsed by robots that appear almost, but not exactly, like real humans.

These applications will be underpinned by the Technical Challenge of making low-power, small-footprint vision systems. To be acceptable, intelligent visual systems need to run all day on a single charge and be realised in discreet wearable devices. Such power and space savings can be achieved by learning how biological systems are implemented at the physical as well as the algorithmic level. Finally, the Theoretical Challenge of general-purpose, integrated and adaptive vision will see visual systems that can operate 'out of the box' and in the wild, yet continuously adapt to and learn from their environment.

Learning the behaviours of their users and co-workers, such systems will be robust and flexible. They will fail gracefully and in ways that are acceptable to the humans they co-operate with. They will, for example, be able to identify people and places despite quite gross changes, to navigate new and altered environments safely, and to learn from experience over very long periods of time with fixed and limited memory capacity. These are tough challenges, but biology has shown them to be solvable.

--

Andrew Schofield has a BEng in Electronics Engineering, a post-graduate diploma in Psychology and a PhD in Neuroscience. He is currently a senior lecturer in psychology at the University of Birmingham and a member of the Centre for Computational Neuroscience and Cognitive Robotics. He also leads the ViiHM network. ViiHM is open to new members and is currently seeking industry partners to take part in a grant-writing event in July 2017.

References

1. http://www.viihm.org.uk (accessed 28/3/2017)
2. Canny, J. (1986) A Computational Approach To Edge Detection, IEEE Trans. Pattern Analysis and Machine Intelligence, 8(6):679–698.
3. Marr, D., Hildreth, E. (1980) Theory of Edge Detection, Proceedings of the Royal Society of London. Series B, Biological Sciences, 207: 187–217, doi:10.1098/rspb.1980.0020
4. Krizhevsky, A., Sutskever, I. and Hinton, G.E. (2012) ImageNet Classification with Deep Convolutional Neural Networks, in Advances in Neural Information Processing Systems 25, MIT Press, Cambridge, MA
5. Masquelier, T., Thorpe, S.J. (2007) Unsupervised learning of visual features through spike timing dependent plasticity, PLoS Comput Biol 3(2): e31. doi:10.1371/journal.pcbi.0030031
6. http://www.viihm.org.uk/grand-challenges/ (accessed 28/3/2017)
7. Mori, M. (2012) The uncanny valley (translated by MacDorman, K.F. and Kageki, N.), IEEE Robotics & Automation Magazine, 19(2): 98–100. doi:10.1109/MRA.2012.2192811
