Video games are a thriving industry, having overtaken film and television to become the dominant entertainment product. James Cameron’s Avatar broke box office records, taking $75m in its first weekend, but Call of Duty: Black Ops took $360m in a single day, and this year’s installment of the franchise looks likely to break the record again. Imaging and machine vision suppliers are accessing this market in a number of ways, both by working with developers directly and by applying technologies and techniques from the gaming industry to more traditional machine vision applications.
Avatar’s success at the cinema was boosted in no small part by the novelty of its being in 3D. The film’s production relied on motion capture and rendering techniques developed primarily for video games producers: actors perform in front of a green screen with cameras tracking key points on their bodies, allowing the scene to be rendered around them in 3D. Stéphane Clauss, business development manager for Europe at Sony, explains that the company’s machine vision cameras have been used for many years in motion capture studios. ‘Colour sensitivity is not so important in this application, because we’re usually just tracking some markers placed on the actors,’ he says. ‘Most of the time it’s in black and white, perhaps using near-IR. In some applications, syncing the cameras to the lighting is also really important in order to ensure perfect sensitivity, a perfect signal-to-noise ratio, and perfect exposure. Finally, a GigE Vision interface is another requirement for motion capture, as cables in a large studio could run for up to 100m.’
According to Clauss, extraction of the point cloud data produced by motion capture becomes easy when the resolution and data quality are high. Until now, however, consumer products have not been up to the task, and so motion capture has remained in the hands of games developers rather than gamers. Sony’s recently launched PlayStation Move gives gamers a taste of motion capture: players hold coloured spheres that are tracked by a camera mounted next to the TV. With the addition of accelerometer data from the controllers, the PS3 console is able to interpret the movements of the coloured spheres, allowing gamers to control their games with intuitive motions.
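The Move’s tracking principle (find the coloured sphere in the camera image, then infer its distance from its apparent size) can be sketched in a few lines. The following is a hypothetical illustration in Python with NumPy, not Sony’s implementation; the focal length, sphere radius and colour thresholds are invented for the example.

```python
import numpy as np

# Hypothetical sketch: locate a bright-coloured sphere in a frame and
# estimate its distance from its apparent size. All constants are made up.
H, W = 240, 320
frame = np.zeros((H, W, 3), dtype=np.uint8)

# Synthetic magenta sphere, radius 20 px, centred at (160, 120)
yy, xx = np.mgrid[0:H, 0:W]
disk = (xx - 160) ** 2 + (yy - 120) ** 2 <= 20 ** 2
frame[disk] = (255, 0, 255)

# Colour threshold: strong red and blue channels, weak green
mask = (frame[..., 0] > 200) & (frame[..., 1] < 50) & (frame[..., 2] > 200)

# Centroid of the masked pixels gives the sphere's image position
ys, xs = np.nonzero(mask)
cx, cy = xs.mean(), ys.mean()

# Apparent radius from the blob area; with a known true sphere radius R and
# focal length f (both assumed), depth follows from z = f * R / r_pixels
r_pixels = np.sqrt(mask.sum() / np.pi)
f, R = 500.0, 0.022        # assumed focal length (px) and sphere radius (m)
z = f * R / r_pixels       # estimated distance in metres
```

In the real controller, this camera-based estimate is fused with the accelerometer data mentioned above to stabilise the tracked pose.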
From living room to production line
Sony’s PlayStation Move is an example of machine vision technology making its way into gaming, but Microsoft’s Xbox Kinect is an example of gaming technology finding applications in machine vision. Like the Move, the Kinect is an input device for gaming, but rather than pairing a 2D camera with motion capture, the Kinect interprets a gamer’s movements via a 3D imaging system and onboard processing. The device is the size of a shoe, costs around £100 and contains two cameras and an IR light source.
Being an inexpensive, mass-produced, and reasonably accurate 3D imaging system, the device has generated some interest in the machine vision space. Munich-based vision software company MVTec has responded to customer interest in the Kinect, and added compatibility with the device to its Halcon imaging software library. According to Markus Ulrich, manager of research and development at MVTec, the company’s first contact with the Kinect was via Microsoft’s advertising. ‘We saw that the sensor is able to deliver 3D data in a very fast way, and so we decided to buy one in order to run some tests with it.’
Ulrich’s R&D team evaluated the sensor’s machine vision potential by testing its performance in standard library functions: ‘We looked at how accurate the 3D data was, and at how we could use the sensor with our existing library functionalities.’ One such functionality is the company’s 3D surface-based matching algorithm for finding objects in a point cloud, and for determining the pose of the object within that cloud. Other functionalities, he says, include the segmentation of primitives, which are geometric bodies such as planes, cubes or cylinders. ‘We can acquire 3D data with the Kinect sensor and then automatically find 3D primitives within that data,’ he explains.
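Primitive segmentation of this kind is commonly built on robust fitting. As a rough illustration (a generic RANSAC plane fit, not MVTec’s proprietary Halcon algorithm), the sketch below finds the dominant plane in a synthetic, outlier-contaminated point cloud:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic cloud: 300 points near the plane z = 0.5, plus 60 random outliers
plane_pts = np.column_stack([rng.uniform(-1, 1, 300),
                             rng.uniform(-1, 1, 300),
                             0.5 + rng.normal(0, 0.005, 300)])
outliers = rng.uniform(-1, 1, (60, 3))
cloud = np.vstack([plane_pts, outliers])

best_inliers = np.zeros(len(cloud), dtype=bool)
for _ in range(200):                        # RANSAC iterations
    sample = cloud[rng.choice(len(cloud), 3, replace=False)]
    n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
    norm = np.linalg.norm(n)
    if norm < 1e-9:
        continue                            # degenerate (collinear) sample
    n /= norm
    d = -n @ sample[0]                      # plane: n . x + d = 0
    dist = np.abs(cloud @ n + d)            # point-to-plane distances
    inliers = dist < 0.02
    if inliers.sum() > best_inliers.sum():
        best_inliers = inliers
```

The inlier set recovered this way is the segmented plane; repeating the procedure on the remaining points would peel off further primitives.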
Gaming represents a huge market, but how can machine vision companies tap into this? Image courtesy of Barone Firenze
MVTec subsequently wrote an image acquisition interface able to take 3D data straight from the sensor and feed it into the Halcon machine vision software, Ulrich says. ‘The interface makes it very easy for the user to just grab the data they need, and it was very easy for us to write the interface.’
The Kinect sensor uses a technique known as ‘structured light’ to capture 3D data, in which a carefully designed (and in this case proprietary) pattern of light is projected onto the scene. The system can deduce the distance to objects within the scene by analysing distortions in the patterned light, thereby adding depth information to form a 3D image. All data processing for the structured light depth measurement, as well as recognition of the shape of the gamer and his or her limbs, is done on board the Kinect device, leaving the console’s processor free to handle the game itself.
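The geometry behind structured-light depth measurement is essentially triangulation: a pattern element observed shifted by a disparity of d pixels, with camera focal length f and projector-camera baseline b, lies at depth z = f·b/d. A minimal sketch with invented constants (the Kinect’s real pattern and calibration parameters are proprietary):

```python
# Hypothetical numbers for illustration only
f = 580.0          # assumed focal length in pixels
b = 0.075          # assumed projector-camera baseline in metres

def depth_from_disparity(d_pixels):
    """Depth of the surface point whose pattern element is shifted d_pixels."""
    return f * b / d_pixels

# A pattern dot shifted by 29 px corresponds to a surface about 1.5 m away
z = depth_from_disparity(29.0)
```

Because depth is inversely proportional to disparity, precision falls off with distance, which is one reason such consumer sensors are specified for room-scale ranges.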
Although Microsoft provides APIs and an SDK for the Kinect, restrictions on commercial use required MVTec to use the open-source OpenNI framework to connect the sensor to the Halcon library. Additionally, the team was able to improve the usefulness of the data from the sensor through careful calibration: ‘The Kinect sensor has an RGB sensor and a separate structured light CMOS sensor that sees in IR rather than visible light,’ explains Ulrich. ‘In order to improve the accuracy of the off-the-shelf system, we calibrate the two sensors individually, and then calibrate the pose of the two relative to each other, which is particularly important if the application requires 3D data to be visualised in visible light.’ The process uses standard calibration modules included in MVTec’s Halcon software, he adds.
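The relative-pose step Ulrich describes comes down to a little linear algebra: once each sensor’s pose with respect to a shared calibration target is known, composing the two transforms gives the pose of one sensor in the other’s frame. The sketch below uses invented poses, not real Kinect calibration data:

```python
import numpy as np

def pose_rgb_to_ir(R_rgb, t_rgb, R_ir, t_ir):
    """Given each camera's pose (R, t) mapping target points into that
    camera's frame, return (R, t) mapping RGB-frame points to the IR frame."""
    R_rel = R_ir @ R_rgb.T
    t_rel = t_ir - R_rel @ t_rgb
    return R_rel, t_rel

# Both cameras face the target; the IR sensor is offset 2.5 cm along x
R_rgb = np.eye(3)
t_rgb = np.array([0.0, 0.0, 0.5])
R_ir = np.eye(3)
t_ir = np.array([-0.025, 0.0, 0.5])

R_rel, t_rel = pose_rgb_to_ir(R_rgb, t_rgb, R_ir, t_ir)
# A point at the RGB camera's origin maps to (-0.025, 0, 0) in the IR frame
```

With this relative pose, depth values from the IR sensor can be reprojected onto the RGB image, which is what allows 3D data to be visualised in visible light.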
The only way to go
Although not yet active in the video games industry, Allied Vision Technologies (AVT) has worked with companies integrating vision products into interactive advertising. Last year, an animated billboard in Times Square incorporated one of the company’s Prosilica GX1910 cameras. The effect was to allow a model in the advert to appear to interact with the crowd in the square, although the content itself was pre-produced. According to Jean-Philippe Roman, public relations manager at the company, the camera used had to be a machine vision camera because it had to be synced with the software: ‘The camera had to do exactly what the software and the pre-produced animation needed it to do, and it had to be able to accept commands. This would not be possible with any commercial digital camera available.’
Advertising and gaming are not conventional applications for these technologies, but according to Roman, AVT recognises that the future for suppliers of imaging technologies lies in that direction. ‘What we call the traditional, historical machine vision industry can be assumed to be all about factory floor applications, but the biggest growth rates in machine vision right now are in non-industrial markets,’ he says. ‘I wouldn’t say that industrial inspection is a saturated market, as vision inspection is still being discovered by many new markets as a route to benefits in production, but nonetheless, the growth of our industry very much depends on these new markets.’
Sony’s Stéphane Clauss agrees: ‘This is the future, for sure. We have various ideas as to where the biggest growth in the machine vision market will be, but entertainment is clearly a big part of it.
‘There’s buzz around the Kinect’s SDK,’ he adds. ‘Anyone can now use the device, but in terms of the machine vision market, even in non-manufacturing applications, we’re looking for reliability, robustness and precision… and we’re not yet sure we can expect that from consumer products.’
Perhaps more playing time will show where this game ends.