Image datasets released by Google to speed machine learning

Google has released two datasets of annotated images and video, Open Images and YouTube8-M, to help researchers develop new image analysis techniques.

The tagged Open Images dataset has 9 million entries, while YouTube8-M’s database contains 8 million videos with 50,000 hours of footage.

The datasets have been made available to further the development of machine learning, a technique whereby a machine learns to recognise content in images from tagged data previously supplied to it. Machine learning potentially offers more accurate image analysis software, but it requires large volumes of training data.
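
As a rough illustration of learning from tagged data, the following minimal sketch trains a nearest-centroid classifier on a handful of labelled feature vectors. It is not the method used by Google or any vendor mentioned here; the feature values, the tag names and the `train` and `predict` helpers are all hypothetical and chosen only to show the idea.

```python
from collections import defaultdict
from math import dist


def train(samples):
    """Average the feature vectors of each tag to get one centroid per tag."""
    grouped = defaultdict(list)
    for features, tag in samples:
        grouped[tag].append(features)
    return {
        tag: tuple(sum(column) / len(column) for column in zip(*vectors))
        for tag, vectors in grouped.items()
    }


def predict(centroids, features):
    """Return the tag whose centroid lies closest to the given features."""
    return min(centroids, key=lambda tag: dist(centroids[tag], features))


# Hypothetical tagged data: (mean brightness, edge density) paired with a tag.
tagged = [
    ((0.9, 0.1), "sky"),
    ((0.8, 0.2), "sky"),
    ((0.2, 0.9), "forest"),
    ((0.3, 0.8), "forest"),
]

model = train(tagged)
label = predict(model, (0.85, 0.15))  # an unseen, untagged example
```

With only four tagged examples the model is trivially small; real image classifiers learn from millions of examples, which is why datasets of this scale matter.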

Halcon 13, the latest version of MVTec’s machine vision library, uses deep learning algorithms, notably for OCR. Speaking at the Vision show in Stuttgart in November, the company said this gives a read rate twice as fast as that of earlier OCR tools, but requires around 50 million images to cover all possible characters.

Most industrial vision software companies are experimenting with machine learning techniques in one way or another. Swiss company Vidi Systems offers one of the early industrial image analysis software libraries to use deep learning algorithms; it was shortlisted for the Vision Award at Vision 2016.

The Open Images and YouTube8-M datasets could also be useful for engineers developing embedded vision solutions.

