Rethinking 3D vision

Tolga Birdal at the Technical University of Munich is co-organising a 3D vision workshop at the ICCV computer vision conference in October. Here, he argues that 3D vision's strength - geometry - is also what's holding it back

Depth image, pose estimation, cameras, geometry, point cloud, reconstruction and mesh. These are some of the keywords that computer vision experts come up with in response to the term 3D vision. Yet, for many others, these concepts are little more than a word cloud, which lends the field a certain mystery among non-professionals. Despite this lack of common knowledge about the intricacies of 3D imaging, the impact of 3D vision has reached a remarkable level.

3D technology in factory automation is expected to be valued at $2.13 billion by 2022, according to market research firm MarketsandMarkets. Automotive, pharmacy, food and beverage, and many other sectors expect increased use of 3D components and software. So what is the hype about, which applications benefit from 3D vision, and why is 3D so crucial for healthier progress towards Industry 4.0?

First things first: 3D data is much richer in information than 2D data. A 3D system can generate accurate coordinate, distance or radius measurements, and can output 3D object or camera poses describing their precise orientation in the real world. As 3D coordinates are simple positions, they do not necessarily contain intensity information, meaning that further processing does not rely on illumination in the way that 2D image understanding does.
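To make the measurement claim concrete, here is a minimal sketch of how a distance and a radius can be read directly off 3D coordinates. The function names (`distance`, `circumradius`) and the sample points are illustrative assumptions, not part of any particular vision library, and a real system would fit many noisy points rather than use three exact ones.

```python
import math


def subtract(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def cross(u, v):
    """Cross product of two 3D vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def norm(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(x * x for x in v))


def distance(a, b):
    """Straight-line distance between two 3D points."""
    return norm(subtract(a, b))


def circumradius(a, b, c):
    """Radius of the circle through three non-collinear 3D points.

    Uses R = (|ab| * |bc| * |ca|) / (4 * triangle area), where the
    triangle area is half the length of the cross product of two edges.
    """
    twice_area = norm(cross(subtract(b, a), subtract(c, a)))
    if twice_area == 0.0:
        raise ValueError("points are collinear; no unique circle exists")
    return distance(a, b) * distance(b, c) * distance(c, a) / (2.0 * twice_area)


# Hypothetical example: three points sampled on the rim of a drilled hole.
rim = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0)]
print("hole radius:", circumradius(*rim))
print("point-to-point distance:", distance((0.0, 0.0, 0.0), (3.0, 4.0, 12.0)))
```

Neither function looks at pixel intensity, which is the sense in which such measurements do not depend on illumination once the 3D points have been acquired.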

From an application point of view, 3D perception has always been very beneficial in robotics-related fields like quality inspection, pose estimation, navigation, mapping and grasping. However, 3D algorithms mean much more can be done. This is visible in the AR/VR world, where companies like Microsoft (with its Hololens), Magic Leap, Apple, Google (Daydream) and Meta are developing remarkably competent solutions that could completely shift our view of technology. The fact that systems are able to obtain very accurate 3D reconstructions is also having a tremendous impact on 3D printing, for example for making prosthetics in healthcare. Autonomous driving is also about to enter our lives, and this too is powered by 3D processing. The new focus of NASA and Elon Musk on space exploration is expected to involve 3D vision algorithms in one way or another. All these applications signal significant progress in 3D vision technology.

So, first, let’s take a look at what holds back 3D vision at the moment; I argue that it’s geometry. 3D data comes with geometric properties, whereas within the good old structured 2D domain, geometry has been easy to ignore. In 2D images, algebraic approximations have been shown to be very robust against non-ideal geometric conditions, especially if the task itself is free of geometric requirements, such as recognition or identification. Yet 3D data comes with attributes such as an axis of symmetry or rotation, or natural sparsity, which means geometry cannot simply be neglected. This is good news for our academic colleagues, as it creates room for research.

One aspect that 3D has not yet fully utilised is the power of machine learning. However, there is now a promising subfield called geometric deep learning, which aims to unite the best of both worlds – geometric properties and machine learning – so as to maximise the strength of 3D vision. This is, in my view, the next leap forward, and one for which the industry should be ready.

But how do engineers get the most out of 3D data? I would start with the right analysis of the problem and requirements, which, of course, is made possible by asking the right questions. Is a prior CAD model available? Are the objects symmetric; could the system benefit from partial symmetry? Can the object be approximated by geometric primitives, or is it completely freeform? Does the system need coordinate measurements or distances? Does it need dense reconstruction, or will sparse points suffice? Will the system operate outdoors or indoors? Is the object of interest metallic, shiny or black? Can we accept a cruder but faster method in place of an accurate but slow one? Such questions vary depending on the application. Good questions give rise to useful constraints, which make engineering the solution less tedious.

Contrary to common belief, not every aspect of 3D is more challenging than dealing with 2D. For instance, 3D alleviates the pesky process of light selection and illumination design. It eliminates the necessity to perform triangulation from multiple views and gives a natural interface to distance measurements. Most applications of 3D can benefit from online calibration, made possible by 3D registration. Thus, depending on the problem, a 3D solution can complement or fully replace a 2D solution, offering a more cost-efficient and robust system.
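As an illustration of what 3D registration involves at its core, the sketch below estimates the rigid transform (rotation and translation) that aligns one set of 3D points with another. It assumes that point correspondences are already known and noise is small. It uses Horn's closed-form quaternion method with a simple power iteration, chosen here only because it needs nothing beyond the Python standard library; the article does not prescribe a particular algorithm. The names `estimate_rigid_transform`, `centroid` and `quaternion_to_matrix` are hypothetical.

```python
import math


def centroid(points):
    """Mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def quaternion_to_matrix(q):
    """Rotation matrix of a unit quaternion given as (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def estimate_rigid_transform(source, target, iterations=500):
    """Return (R, t) such that target[i] is approximately R * source[i] + t.

    Assumes at least three non-collinear corresponding points.
    """
    cs, ct = centroid(source), centroid(target)

    # Cross-covariance: S[a][b] = sum of centred source_a * centred target_b.
    S = [[0.0] * 3 for _ in range(3)]
    for p, q in zip(source, target):
        for a in range(3):
            for b in range(3):
                S[a][b] += (p[a] - cs[a]) * (q[b] - ct[b])
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = S

    # Horn's symmetric 4x4 matrix; the eigenvector of its largest
    # eigenvalue is the unit quaternion of the best-fit rotation.
    N = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]

    # Shifting by the largest absolute row sum makes every eigenvalue
    # non-negative, so power iteration converges to the largest one.
    shift = max(sum(abs(v) for v in row) for row in N)
    quat = [1.0, 0.5, 0.25, 0.125]  # arbitrary non-special starting vector
    for _ in range(iterations):
        quat = [
            sum(N[i][j] * quat[j] for j in range(4)) + shift * quat[i]
            for i in range(4)
        ]
        length = math.sqrt(sum(v * v for v in quat))
        quat = [v / length for v in quat]

    R = quaternion_to_matrix(quat)
    # The translation maps the rotated source centroid onto the target centroid.
    t = tuple(ct[i] - sum(R[i][j] * cs[j] for j in range(3)) for i in range(3))
    return R, t


# Hypothetical example: the target is the source rotated 90 degrees about
# the z axis and then shifted by (10, -2, 0.5).
source = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
target = [(-y + 10.0, x - 2.0, z + 0.5) for x, y, z in source]
R, t = estimate_rigid_transform(source, target)
print("rotation:", R)
print("translation:", t)
```

In practice correspondences are rarely given, so production systems typically wrap an estimate like this inside an iterative scheme such as ICP and add outlier handling; this sketch covers only the alignment step.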

I am co-organising a workshop on multiple view relationships in 3D data, which will be held on 29 October in Venice in conjunction with ICCV, one of the best computer vision conferences. The goal is to foster discussion and boost knowledge dissemination in 3D vision with multiple cameras. To this end, we have managed to bring in a lot of great speakers and are looking for a high-quality set of submissions – authors of all accepted papers receive exciting prizes! For further information, I highly encourage you to visit:


Tolga Birdal is a PhD candidate in the Computer Vision Group at the Chair for Computer Aided Medical Procedures, TUM, and a doctoral researcher at Siemens. His research and development is focused on large object detection, pose estimation and reconstruction. Recently, he was awarded the Ernst von Siemens Scholarship and the EMVA Young Professional Award 2016 for part of his PhD work.
