Greg Blackman looks at the options for high-throughput image processing, including FPGAs and multicore processors, and finds that what unites development across all software packages is ease of use
Vision algorithms might be at the heart of an imaging library, but much of the development effort is going into making the libraries easier to use. This is the case right across the spectrum of software offerings, from standard machine vision libraries to 3D packages to tools for FPGAs. Mark Williamson, director of corporate development at Stemmer Imaging, comments: ‘Point-and-click software is growing quite considerably. I’m seeing a big move where people don’t necessarily want to program from first principles in C; they want a wizard where they can drag tools into position and program it that way.’
Stemmer Imaging, which produces its Common Vision Blox (CVB) library, as well as distributing software packages from Aqsense and Teledyne Dalsa, among other companies, is looking to bundle tools together to make them easier to use. ‘We’ve developed a plug-in for Aqsense’s 3D Express to make it compatible with Sherlock software from Teledyne Dalsa, which is a point-and-click package,’ states Williamson.
Aqsense has developed its 3D Express software to make 3D image processing easier to program. The module condenses the whole process of acquiring 3D data, i.e. detecting the triangulation line, calibrating the space, generating a point cloud, and creating a cut through that point cloud to present to 2D image processing tools. ‘The idea is that Aqsense has wrapped 3D processing into a wizard – the user goes through the steps, calibrates the system to get 2D planes through the image to run on 2D tools,’ explains Williamson. ‘It’s making 3D processing, which is conventionally quite complicated, much simpler.’
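To make the steps Williamson describes more concrete, here is a minimal sketch of a laser-triangulation pipeline in plain Python. It is not Aqsense's implementation or the 3D Express API: the function names, the centre-of-gravity line detector and the linear calibration constants (`mm_per_col`, `mm_per_row`, `ref_row`) are all assumptions chosen for illustration.

```python
import math
from typing import List, Tuple

Frame = List[List[float]]           # one camera image: rows of pixel intensities
Point = Tuple[float, float, float]  # (x, y, z) in millimetres


def detect_line(frame: Frame, threshold: float = 10.0) -> List[float]:
    """Return the sub-pixel row of the laser line in each column.

    Uses an intensity-weighted centre of gravity; NaN marks columns
    where no pixel reaches the threshold.
    """
    profile = []
    for c in range(len(frame[0])):
        column = [row[c] if row[c] >= threshold else 0.0 for row in frame]
        total = sum(column)
        if total == 0:
            profile.append(math.nan)
        else:
            profile.append(sum(r * w for r, w in enumerate(column)) / total)
    return profile


def to_points(profile: List[float], y_mm: float, mm_per_col: float = 0.5,
              mm_per_row: float = 2.0, ref_row: float = 0.0) -> List[Point]:
    """Apply a hypothetical linear calibration to one detected line."""
    return [
        (c * mm_per_col, y_mm, (row - ref_row) * mm_per_row)
        for c, row in enumerate(profile)
        if not math.isnan(row)
    ]


def cut_at_y(cloud: List[Point], y_mm: float, tol: float = 1e-6):
    """Slice the point cloud at one scan position, giving a 2D (x, z) profile."""
    return [(x, z) for x, y, z in cloud if abs(y - y_mm) <= tol]
```

In a real system the calibration would come from a measured target rather than fixed constants, and the cut would typically be resampled into an image before being handed to 2D tools.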
Point-and-click packages might be more in demand, but this is not to say image libraries are becoming defunct; far from it. Each has its place: a library gives more control and is typically used by OEMs implementing complex applications. Williamson states that some of Stemmer’s customers do very complex tasks that just couldn’t be achieved in a point-and-click environment.
For reasonably standard machine vision systems on a production line, however, most of the point-and-click environments are completely contained and include functionality like connecting to a PLC or supporting I/Os. ‘If you’re using a vision library, you’d need a different library to talk to the PLC and a separate card to support I/O – the user has to do everything from scratch with an imaging library,’ explains Williamson.
‘Point-and-click is growing significantly and we’re implementing a lot of tools from our CVB into Sherlock for this growing segment,’ states Williamson. ‘Why write a program for four days when you can string the algorithms together in a few hours using a point-and-click environment?’
Among the newer tools added to CVB are a video stabiliser and an Optical Flow tool, the latter of which quantifies the direction and amount of movement in an image. The Optical Flow tool is suitable for tasks like traffic monitoring, where it can give statistical information on the flow of vehicles, for instance.
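As a rough illustration of what quantifying the direction and amount of movement means in practice, the sketch below estimates a single global shift between two frames by exhaustive block matching. This is a deliberately simple stand-in, not the algorithm used in CVB; the function names and the `max_shift` search range are assumptions.

```python
import math
from typing import List, Tuple

Frame = List[List[float]]


def estimate_shift(prev: Frame, curr: Frame, max_shift: int = 2) -> Tuple[int, int]:
    """Return the integer (dx, dy) that best maps prev onto curr.

    Every candidate shift is scored by the sum of absolute differences
    over the interior of the frame; the lowest score wins.
    Frames must be larger than 2 * max_shift in both dimensions.
    """
    rows, cols = len(prev), len(prev[0])
    best = None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            sad = 0.0
            for r in range(max_shift, rows - max_shift):
                for c in range(max_shift, cols - max_shift):
                    sad += abs(curr[r + dy][c + dx] - prev[r][c])
            if best is None or sad < best[0]:
                best = (sad, dx, dy)
    return best[1], best[2]


def motion(dx: float, dy: float) -> Tuple[float, float]:
    """Convert a shift into (amount in pixels, direction in degrees)."""
    return math.hypot(dx, dy), math.degrees(math.atan2(dy, dx))
```

A production tool would compute a dense field of such vectors, one per region or pixel, and then aggregate them into statistics such as vehicle flow; the global estimate here only shows the principle.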
Bringing processing up to speed
At the top end of machine vision, for which companies like Teledyne Dalsa cater, data throughput is such that vision systems often require some processing to be offloaded onto an FPGA or for the algorithms to be optimised for multicore systems to keep processing times low. Teledyne Dalsa’s Sapera APF and Silicon Software’s Visual Applets are both tools for programming FPGAs. ‘Typically the FPGA is the domain of hardware designers, whereas image processing is more of a software discipline,’ comments Inder Kohli of Teledyne Dalsa. ‘The challenge lies in making it easier for software engineers to program for image processing in hardware, which is what Sapera APF is designed for.’
Semiconductor, flat panel display and web inspection are the kinds of application that would use an FPGA: those with high throughput, or those imaging large areas and making fine measurements. And if the system includes a frame grabber, then there’s an opportunity to do some pre-processing on an FPGA before the image is sent on to the host system for high-level analysis. Pre-processing could include Bayer decoding, correcting distortions, or correcting gain and offset.
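Gain and offset correction is the simplest of these pre-processing steps to show. The following is a software sketch of the per-pixel arithmetic only, written in Python for readability; on a frame grabber the same operation would be implemented in FPGA logic. The function name and the 8-bit clamp are assumptions.

```python
from typing import List

Frame = List[List[float]]


def correct_gain_offset(raw: Frame, gain: Frame, offset: Frame,
                        max_value: int = 255) -> List[List[int]]:
    """Per-pixel correction: (raw - offset) * gain, clamped to [0, max_value].

    gain and offset are calibration images of the same size as raw,
    typically derived from dark and uniformly lit reference frames.
    """
    corrected = []
    for raw_row, gain_row, offset_row in zip(raw, gain, offset):
        corrected.append([
            min(max_value, max(0, round((p - o) * g)))
            for p, g, o in zip(raw_row, gain_row, offset_row)
        ])
    return corrected
```

Because each output pixel depends only on the matching input pixel and two calibration values, the operation can be applied to the pixel stream as it arrives, which is why it suits the frame grabber rather than the host.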
‘In principle you could do anything in an FPGA,’ states Chris Hirst, FPGA IP development manager at Matrox Imaging, ‘but you probably wouldn’t want to do the high-level functions which would be very complicated to program.’
Matrox Imaging’s frame grabbers are designed to accommodate several different sizes of FPGAs depending on the customer’s needs. The company can customise an FPGA with specific functions for a particular application or provide its FPGA Development Kit (FDK), which has some blocks for basic processing functions and to which the customer can add their own custom blocks.
Dedicated software tools for FPGAs, like Teledyne Dalsa’s Sapera APF, are allowing software engineers to program FPGAs for image processing much more easily
Once again, FPGAs are now simpler to program through the availability of high-level language tools, typically C, instead of the more traditional hardware design flow with VHDL. ‘It’s really made an order of magnitude difference in time savings to implement a function,’ says Hirst. ‘Instead of building up a huge library of standard blocks that allow the user to customise their algorithm, now it’s easier to write custom functions. Development time is now more like days instead of weeks for a simple function or a couple of weeks instead of months for more complex functions, because the tools are so much better.’
In terms of the advantages of using an FPGA, Hirst quotes a possible saving of around 10-30 per cent of the host CPU cycles through offloading some of the processing. He’s also had customers with very complex pipelines and high data rates where it would have taken eight or ten host CPU cores just to process all the data without offloading it to an FPGA. ‘In some cases, an FPGA is really necessary,’ he says.
‘We see more and more customers using FPGAs,’ he adds. ‘The FPGA development tools used to be very expensive from third party vendors, whereas now FPGA manufacturers themselves are getting onboard with these high level tools, which they sell for relatively low cost.’
First and foremost, FPGAs speed up image processing by relieving the CPU of certain tasks. Quite often, however, companies do not want to process data faster so much as extract more information from the images within the same timeframe. ‘You can create different sets of data from the images captured,’ explains Kohli. ‘Initially, FPGAs just took care of data reduction, but now, in our experience, FPGAs carry out data augmentation as well. Customers want reduced data sets, but multiples of that data. All of a sudden, the total throughput of the system is several times bigger than the incoming data.’
Of course, FPGAs and GPUs are normally only needed when pushing the boundaries in terms of performance. Williamson notes that Stemmer Imaging has had customers in the past that have offloaded some of the processing to a GPU, only for the next generation of CPUs, six months later, to be fast enough for their system, negating the need to do this. ‘If you’re really pushing the boundaries you need to take advantage of FPGAs and GPUs, but for standard processing applications it’s less used,’ he says, although adding that with cameras getting faster, the processing challenges are also growing.
Dealing with higher core counts
An FPGA is one option for improving processing time, but multicore processors are becoming more common – Intel’s new Xeon Phi coprocessor, based on the company’s Many Integrated Core (MIC) architecture, offers more than 50 cores, although it is designed more for supercomputers running huge simulations than for the systems typically used for machine vision. Nevertheless, the technology is out there, and many image processing libraries, including Halcon and the Matrox Imaging Library (MIL), now support multicore processors for high-throughput applications.
‘There are two opposing strategies to take advantage of multicore architectures,’ explains Arnaud Lina, team manager for processing and analysis tools at Matrox Imaging. ‘Either relying on the imaging library to split the processing of an image onto multiple cores or the user creates their own multi-threaded application so each thread processes an image on each core. Both architectures are valid and both have advantages and pitfalls.’
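The two strategies Lina describes can be sketched side by side. The example below uses Python's standard thread pool purely to show the structure; it is not how MIL or any other library is implemented, and `add_constant` is a hypothetical stand-in for a real imaging function. Note also that in CPython, pure-Python work like this will not actually run faster on threads; a real library would do the per-pixel work in native code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

Frame = List[List[int]]


def add_constant(rows: Frame, value: int = 10) -> Frame:
    """Stand-in imaging function: add a constant to every pixel, clamped to 255."""
    return [[min(255, p + value) for p in row] for row in rows]


def split_one_image(image: Frame, workers: int = 4) -> Frame:
    """Strategy 1: divide a single image into strips so all cores share one frame."""
    step = max(1, -(-len(image) // workers))  # ceiling division
    strips = [image[i:i + step] for i in range(0, len(image), step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(add_constant, strips))
    return [row for part in parts for row in part]


def one_image_per_worker(frames: List[Frame], workers: int = 4) -> List[Frame]:
    """Strategy 2: dispatch whole frames, so each worker processes its own image."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(add_constant, frames))
```

Both functions produce the same pixels as calling `add_constant` serially; what differs is whether the cores cooperate on one frame or work on several frames at once.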
Relying on the library to execute one image processing function across all the cores reduces the latency, Lina explains, which is good, but at the same time the frame rate might be reduced because the functions might not fit well on a multicore architecture. Lina cites one possible example: ‘Running a highly I/O bound operation such as an addition over eight cores will not result in the function scaling by eight – it may scale by a factor of three only. This is still faster than if it was run on a single core and it also reduces the latency.’ However, he warns the frame rate might not be as high as if the user creates a multi-threaded application and dispatches images separately for each core.
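The trade-off can be put into numbers with a small calculation. The factor of three on eight cores is Lina's example; the 24ms single-core processing time used in the test values and the assumption that per-frame threads scale perfectly are hypothetical. The second assumption is optimistic, since an operation limited by memory bandwidth would also be held back when eight threads run at once, so the per-frame figure is best read as an upper bound.

```python
from typing import Dict, Tuple


def compare_strategies(single_core_ms: float, cores: int,
                       library_speedup: float) -> Dict[str, Tuple[float, float]]:
    """Return (latency in ms, frames per second) for each strategy.

    'split': the library spreads each frame over all cores and frames
             are processed one after another.
    'per_frame': each core processes its own frame at single-core speed
             (ideal scaling assumed).
    """
    split_latency = single_core_ms / library_speedup
    split_rate = 1000.0 / split_latency
    per_frame_latency = single_core_ms
    per_frame_rate = cores * 1000.0 / single_core_ms
    return {
        "split": (split_latency, split_rate),
        "per_frame": (per_frame_latency, per_frame_rate),
    }
```

Under these assumptions, splitting each frame gives the shorter wait for any individual result, while dispatching whole frames gives the higher frame rate, which is the distinction Lina draws.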
It depends on the application as to which strategy is used. ‘Even if a library automatically and transparently makes use of multicore architectures, customers have to realise that this won’t solve everything. If frame rate is the most important parameter then building a multi-threaded application is the most powerful way to make use of multicore processors,’ Lina states.
Pierantonio Boriero, product line manager at Matrox Imaging, sums up: ‘If you have four cores you don’t automatically get four times the performance simply by running an application on a multicore machine, even if it’s based on a library optimised for multicore systems.’ He adds that as the core count increases, the discussion surrounding which is the best strategy to take advantage of this becomes even more relevant. One avenue of research at Matrox Imaging is how best to make use of the growing number of cores from an image processing standpoint.
Imaging with a GPU
While not all applications will require optimisation of image processing through FPGAs or multicore processors, with higher resolution sensors there’s always a requirement for higher performance. The other option for increasing the speed of processing is using a GPU, although, according to Kohli at Teledyne Dalsa, GPUs haven’t delivered the results expected thus far. ‘Shuttling data back and forth from the acquisition device to GPU creates more overhead than benefits,’ he says. ‘GPUs are great in applications involving pure number crunching, but using them in a real-time environment where you have to worry about movement of data, not just processing, and control of the data, you realise there’s no comparison to carrying out the processing in an FPGA. With an FPGA, you have full control and complete freedom of re-routing the data as you see fit.
‘Development tools for software engineers will become more accessible, easy to use and more integrated – that’s what Teledyne Dalsa is focused on,’ comments Kohli. ‘These software tools for FPGAs or multicore CPUs are allowing customers to run a lot of different types of application, either running legacy applications faster or run new applications like 3D and combine them with the flexibility of their own algorithms.’