The changing face of vision

Embedded vision, deep learning, and Industry 4.0 could all have a big impact on the machine vision sector in the future. Three experts give their opinions.

Andreas Gerk, chief technology officer at Allied Vision, explores the developments happening in embedded vision, and how embedded processing boards are changing the machine vision sector

To understand the trends, challenges and opportunities of embedded vision, it is first necessary to clarify the concept. In our understanding, embedded vision is the merger of two different worlds. The first is embedded systems: the embedded boards used in compact systems in countless different ways. Typically, embedded systems are small, lightweight and low-cost computing devices that can be embedded into a larger system, for example a car, a robot, a security terminal or a vending machine. They can also be mobile or battery-powered, such as in a video doorbell or body camera.

The second world is computer vision. Computer vision began as an experiment in artificial intelligence. The goal was to reconstruct the human visual system, ultimately applying visual perception for analysing a scene. This takes place with the aid of cameras, but also with algorithms developed for very diverse mathematical operations.

For several years there has been demand from the embedded system world for more powerful algorithms to run on embedded boards – for applications such as face recognition or deep learning. Out of this demand, embedded vision was born. The goal is the interpretation and explanation of images and videos within an embedded system.

When it comes to adding vision to an embedded system, the designer is confronted with several challenges, especially concerning the camera. One main question is: how much image processing can be executed in the camera and how much on the embedded board? Cameras for the embedded field are currently not known for executing much image processing, as they are not as rich in features as those in the machine vision sector. They deliver a passable image to the embedded board. Further processing steps must be carried out on the embedded board, which burdens the CPU and leaves less capacity for other tasks. Choosing a more powerful board would raise the total cost.

Another challenge concerns the standard interfaces on the cameras. In this market, many interfaces are in use, such as USB, LVDS, MIPI CSI-2 and PCI Express. The challenge lies in finding the right interface for an application and implementing it with as little effort as possible. USB is among the most popular interfaces, but it has one big disadvantage: data has to be packed into packets when sent and unpacked when received. This places an additional load on the CPU of the embedded board, taking processing power that could be used for other tasks. For this reason, developers began choosing the MIPI CSI-2 interface, which is now in hundreds of millions of smartphones and tablets. With MIPI CSI-2, the CPU load is reduced by up to 30 per cent compared to USB. Moreover, CSI-2 is a uniform standard that is continually optimised by the MIPI Alliance.

Even though embedded vision is developing rapidly and dynamically at the moment, the traditional machine vision market will not be impacted in every application. Classic PC-based systems have not outlived their usefulness. There are still some basic differences that could make PC-based systems preferable in certain cases. They are more powerful than embedded boards, which makes them the preferred choice for more demanding algorithms or applications. They will still play an important role as all-rounders and take care of the overall performance of the system, whereas embedded systems are designed for a single function or a few functions.

The full embedded system, with its embedded board, is designed for the necessary performance and optimised for cost. As a result, the embedded system cannot be upgraded, or only at a high cost. This is where the flexibility of a PC-based solution or classic machine vision comes into play. On the other hand, the cost of an embedded system is much lower than that of a classic PC-based device. It is only a matter of time until more performance is available on embedded boards. This will accelerate the transition from PC-based systems or classic machine vision to embedded systems or embedded vision.

There are similarities between computer vision, machine vision and embedded vision. However, evolving applications in both consumer and industrial markets make embedded vision an attractive market. Requirements for embedded vision are generating new approaches to vision technology, from cameras to processors and software algorithms. Embedded vision is a buzzword, and not yet well defined or understood, but based on the effort applied by significant players in the market, it has a bright future.

Allied Vision has introduced a camera platform targeted at the embedded vision market. The Allied Vision 1 product line, with an ASIC (application-specific integrated circuit) for image processing on-board, offers advanced digital imaging functionality with a list price starting at €99.


Dr Olaf Munkelt, managing director of MVTec Software, on where deep learning algorithms will play a role in industrial imaging

New technologies such as deep learning and convolutional neural networks (CNNs) have an enormous impact on automated process chains in industrial environments. This affects the machine vision sector in particular. The technologies make it possible to improve product identification and inspection significantly. Which intelligent deep learning functions can help here? And what types of application scenarios are conceivable with regard to machine vision?

Artificial intelligence methods are increasingly finding their way into industrial value creation processes, and are a typical feature of Industry 4.0 scenarios. Functions based on self-learning algorithms contribute greatly to the universal automation of processes. This is also the crucial foundation of many industrial applications in the area of robotics. Methods such as deep learning and CNNs are therefore having an effect on machine vision technology in particular, primarily by making it possible to identify objects reliably along the industrial process chain.

One important application in which deep learning technologies are used is optical character recognition (OCR). The process, which originated in office communication, is also being used much more in industrial settings. A wide range of fonts, as well as number and character combinations, printed or stamped onto workpieces, are recognised and read with the aid of OCR. Electronic image acquisition devices, such as scanners and cameras, generate raster graphics from digital image information, displaying the text with pixel-level accuracy. OCR software processes these graphics, recognises the combinations of numbers or characters they contain, and combines them into words or entire sentences.

Compared to applications in an office environment, industrial OCR functions must meet stricter requirements. Fonts in industrial settings are often difficult to read and contain blurry or distorted characters. Modern machine vision solutions are equipped with OCR functions that achieve high recognition rates even under these conditions. This is precisely where innovative deep learning technologies, such as CNNs, are used, as the algorithms are able to analyse and evaluate large quantities of digital image data.

As a result, models of certain objects can be trained. For this purpose, the image data of each object carries an electronic label that clearly indicates the object’s identity. The trained models are then compared against newly acquired image data, making it possible to assign the image data to a certain class according to content and motif. Because the objects are divided into individual classes, they can be recognised automatically without requiring a sample image in each case. Deep learning algorithms are thus able to learn new things independently – a special feature of this technology. By assigning properties to a particular class, the technology is able to determine the exact class of the object – on its own and with a high hit rate.

Learning from errors

The process determines which special characteristics are typical for certain letters or numbers. This permits very high identification rates. Another special feature is that deep learning algorithms even learn from errors. If individual results are incorrect, certain parameters are modified during the training process. The entire process then restarts and is repeated until an optimum training outcome is reached for the corresponding application. Compared to conventional classification techniques, deep learning comes with an advantage: the developer does not have to laboriously define features manually and check their suitability. Instead, self-learning algorithms find and extract unique patterns automatically.
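The idea of learning from errors can be illustrated with a minimal sketch. The example below is not a CNN and is not how Halcon works internally; it swaps in a single-layer perceptron, the simplest classifier that adjusts its parameters only when a prediction is wrong and repeats the pass until no errors remain. The two features per sample (fraction of dark pixels, number of enclosed loops) and all values are made up for illustration.

```python
def train_perceptron(samples, labels, epochs=20, learning_rate=0.1):
    """Train a single-layer perceptron; parameters change only on errors."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for features, target in zip(samples, labels):
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            predicted = 1 if activation > 0 else 0
            error = target - predicted  # 0 when the prediction is correct
            if error != 0:
                errors += 1
                # Learn from the error: shift the parameters towards the target.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if errors == 0:
            break  # a full pass without mistakes: training is finished
    return weights, bias


def predict(weights, bias, features):
    """Assign a sample to class 1 or class 0 using the trained parameters."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


# Hypothetical labelled training data: (dark-pixel fraction, enclosed loops).
samples = [(0.2, 0.0), (0.25, 0.0), (0.6, 1.0), (0.65, 1.0)]
labels = [0, 0, 1, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, s) for s in samples]
print(predictions)
```

A real deep learning system replaces the hand-picked features with features learned from the image data and the simple update rule with gradient-based training over many layers, but the loop has the same shape: predict, compare with the label, adjust, repeat.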

Deep learning functions that permit robust object recognition based on alphabetical or numeric codes are integrated into modern machine vision solutions. The standard Halcon machine vision software from MVTec, for example, contains an OCR classifier based on deep learning algorithms, which comes with many pre-trained fonts. This results in much higher identification rates than current classification methods. A wide range of font types, such as dot print, SEMI for marking silicon wafers, and industrial and document-based font types, can be read and recognised with a single, universal, pre-trained classifier. With the new version of MVTec Halcon, released at the end of 2017, users will be able to train CNNs themselves based on deep learning algorithms. Defect classes, for example, can be trained solely through reference images.

Artificial intelligence, deep learning and CNNs are important trends that will play a key role in shaping factory automation in the coming years. In machine vision, the technologies are used primarily for object recognition purposes, and thus also for error identification and OCR. As a result, particularly robust object identification results can be achieved.

However, the technologies are not suitable for all uses, since they are often associated with enormous training effort, extensive expert knowledge and high investment.


Edwin Ringoot, strategic marketing and business development manager at On Semiconductor, discusses the advances being made in image sensors for future smart factories

As the latest trends in manufacturing automation drive the adoption of smart factories in Industry 4.0, the need for continued improvements in vision systems becomes critical. To drive this increasing level of automation and control, however, the components used to power these systems – including the image sensors in industrial vision cameras – must also advance in order to make this interconnected future possible. While the scope of these advances will ultimately be felt throughout the datasheet for any given device, the high-level trends involved can be grouped into a few main areas.

In nearly every imaging application, improved imaging capabilities typically go hand in hand with the need for increased resolution. For image sensors, this need can be met either by reducing pixel size – enabling higher resolution at a given optical format – or moving to larger devices. But for industrial imaging, this drive toward higher resolution does not remove the need to retain, or even improve, the frame rate available. In addition to higher resolutions, therefore, image sensors need to provide increased bandwidth – the amount of data available from the chip per second – in order to improve image resolution without slowing down the manufacturing line. As the speed available from computer interfaces such as USB 3.1, 10GigE and others continues to increase, it’s also important that the bandwidth available from the image sensor is not a data bottleneck in the overall system.
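The relationship between resolution, frame rate and bandwidth can be made concrete with some simple arithmetic. The sketch below assumes an uncompressed raw output and ignores protocol overhead; the sensor formats, the 10-bit pixel depth and the function names are hypothetical and chosen only for illustration.

```python
def sensor_data_rate_gbps(width_px, height_px, bits_per_pixel, frames_per_second):
    """Raw sensor output in gigabits per second (1 Gbit = 1e9 bits)."""
    bits_per_frame = width_px * height_px * bits_per_pixel
    return bits_per_frame * frames_per_second / 1e9


def max_frame_rate(width_px, height_px, bits_per_pixel, link_gbps):
    """Highest frame rate a link of the given nominal speed could carry."""
    bits_per_frame = width_px * height_px * bits_per_pixel
    return link_gbps * 1e9 / bits_per_frame


# Hypothetical sensor: 1920 x 1200 pixels, 10 bits per pixel, 100 frames/s.
baseline_gbps = sensor_data_rate_gbps(1920, 1200, 10, 100)

# Doubling the resolution in each dimension at the same frame rate.
doubled_gbps = sensor_data_rate_gbps(3840, 2400, 10, 100)

# Frame rate the larger format could reach over a nominal 10 Gbit/s link.
fps_on_10g_link = max_frame_rate(3840, 2400, 10, link_gbps=10.0)

print(f"baseline: {baseline_gbps:.3f} Gbit/s")
print(f"doubled resolution: {doubled_gbps:.3f} Gbit/s")
print(f"max frame rate on 10 Gbit/s: {fps_on_10g_link:.1f} fps")
```

Doubling the resolution in both dimensions quadruples the pixel count, so the sensor has to deliver four times the data to hold the same frame rate. If either the sensor output or the interface cannot keep up, the frame rate has to drop.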

One key factor driving the implementation of Industry 4.0 is the rise of embedded imaging through the deployment of smart cameras across a factory floor. To be effective, these embedded cameras must be compact, operate with minimal power and be available at low cost – factors made possible on multiple levels by the image sensors selected in the camera design.

For example, compact cameras are enabled not only by using image sensors with small pixels, but also by architectures that minimise the support circuitry required to implement a full camera design. Power can be reduced by low voltage outputs on the image sensor, which enables the use of low-power components throughout the design. And costs can be controlled – not only for the image sensor but for the entire bill of materials for the camera – through smart choices in the image sensor design.

In addition, while some industrial applications require the image quality and image uniformity currently available only from CCD sensors, many industrial imaging applications today routinely rely on CMOS sensors because of the expansive feature set – including high data bandwidth – available from this technology. While for many applications the image quality available from CMOS devices is more than sufficient, opportunities still exist to improve the overall image quality available from this technology, particularly in areas such as image uniformity through reduced fixed pattern noise.

To implement these advances, image sensor suppliers will need to rely not only on technologies developed specifically for industrial markets, but also on those developed for adjacent imaging markets. High dynamic range designs first developed for automotive and security imaging can be adapted for industrial vision in order to improve imaging capabilities in situations where lighting may not be well controlled. Small pixel architectures made initially for automotive or consumer markets can either be adopted directly or scaled to larger formats in order to provide increased pixel full well capacity. And this kind of technology transfer can operate in both directions – global shutter pixel architectures developed for industrial vision can migrate into these adjacent applications. In addition, the use of emerging technologies such as 3D time of flight, and multispectral and hyperspectral imaging, will continue to expand as the imaging capabilities available from these techniques line up with the increased processing power required to take advantage of them.

While all of these image sensor trends – increased bandwidth, improved support for embedded vision systems, enhanced image quality and uniformity, and technology leveraging – could be grouped under the umbrella of ‘improved performance’, each has a separate but vital impact on the continuing adoption of industrial vision systems and their expansion into smart factories of the future. As the level of automation and systems intelligence used in manufacturing continues to increase, vision systems – and the image sensors that power them – will continue to play a vital role in the implementation of these technologies.

