A beginner’s guide to AI: computer vision and image recognition
14 January 2020 | 4 min read
Teaching a computer how to ‘see’ is not easy. You can put a camera on a PC, but that won’t give it sight. For a machine to view the world the way people and animals do, it relies on computer vision and image recognition.
Computer vision is the reason your iPhone Face ID can tell whether its camera is looking at your face or not. It powers a barcode scanner’s ability to ‘see’ a bunch of stripes. Whenever a machine processes raw visual input, it’s using computer vision to understand what it’s seeing.
This post covers the following topics:
- Image recognition vs computer vision
- Computer vision: the hotdog application
- Image recognition at work
Image recognition vs computer vision
Image recognition and computer vision are terms that are often used synonymously, but they are not the same thing.
A short and easy explanation:
Computer vision is like the part of the human brain that processes the information received by the eyes – not the eyes themselves.
Image recognition gives a computer the ability to interpret the input received through computer vision and categorize what it ‘sees.’
Computer vision: the hotdog application
There is an app that uses your smartphone camera to determine whether an object is a hotdog or not. It’s called Not Hotdog. It may not seem that impressive. After all, a small child can tell you whether something is a hotdog or not. But learning to recognize images is quite complex, both for the human brain and for the neural networks that computers use.
How does an algorithm like this work?
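People have fed an AI engine thousands of labelled pictures, both of hotdogs and of things that are not hotdogs. From those examples, the AI develops a general idea of what a picture of a hotdog should have in it. When you show the app a new image, it does not compare that image pixel by pixel with every hotdog picture it has ever seen. Instead, it applies the patterns it learned during training to estimate how likely the image is to contain a hotdog.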
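To make the idea of “learning from examples” a bit more concrete, here is a minimal sketch in Python. It is not how Not Hotdog is built: a real app would train a neural network on raw pixels, while this toy version assumes each picture has already been reduced to three made-up features (elongation, redness and bun-like colour) and fits a simple logistic regression. The data, the feature names and the functions `train` and `predict` are all hypothetical.

```python
import math

# Hypothetical toy data: each "image" is reduced to three made-up features
# (elongation, redness, bun-like colour), each between 0 and 1.
# Label 1 = hotdog, label 0 = not hotdog.
TRAINING_DATA = [
    ([0.9, 0.8, 0.9], 1),
    ([0.8, 0.9, 0.7], 1),
    ([0.7, 0.7, 0.8], 1),
    ([0.2, 0.1, 0.3], 0),
    ([0.1, 0.3, 0.1], 0),
    ([0.3, 0.2, 0.2], 0),
]


def sigmoid(x):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, features):
    """Estimated probability that the features describe a hotdog."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(score)


def train(data, epochs=2000, learning_rate=0.5):
    """Fit the weights with plain gradient descent on the logistic loss."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * f
                       for w, f in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(TRAINING_DATA)

# A new, unseen example is scored with the learned weights only.
# The training pictures themselves are not consulted again.
print(predict(weights, bias, [0.85, 0.75, 0.8]) > 0.5)
print(predict(weights, bias, [0.15, 0.2, 0.2]) > 0.5)
```

The point of the sketch is the last step: once training is done, the model only keeps a handful of numbers (the weights), and every new picture is judged against those rather than against the original examples.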
Image recognition at work
Image and face recognition on social networks
Facebook has been using facial recognition to tag people in users’ photos. Whenever users upload a photo, Facebook is able to recognize objects and scenes in it before people enter a description. Its computer vision can distinguish objects, facial expressions, food and more. Besides tagging people in photos, image recognition is used to describe visual content for blind users and to identify inappropriate or offensive images.
Visual search to find better products (e-commerce)
Visual search allows users to search for similar images or products using a reference image they took with their camera or downloaded from the internet. Fashion, home décor and furniture e-commerce websites are already integrating this in their digital shopping experience to increase conversions.
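One common way to think about visual search is as a “nearest neighbour” lookup: every product photo is turned into a list of numbers (a feature vector), and the search returns the products whose vectors point in the most similar direction to the reference image. The sketch below assumes those vectors already exist. The catalogue, the numbers and the function `visual_search` are made up for illustration; in practice the vectors would come from an image model.

```python
import math


def cosine_similarity(a, b):
    """Similarity between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical catalogue: product name -> feature vector of its photo.
CATALOGUE = {
    "red leather sofa": [0.9, 0.1, 0.8, 0.2],
    "red fabric armchair": [0.8, 0.2, 0.6, 0.3],
    "oak dining table": [0.1, 0.9, 0.2, 0.7],
    "white floor lamp": [0.2, 0.3, 0.1, 0.9],
}


def visual_search(query_vector, catalogue, top_k=2):
    """Return the names of the top_k products most similar to the query."""
    ranked = sorted(
        catalogue.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Feature vector of the shopper's reference photo (also made up).
query = [0.85, 0.15, 0.75, 0.2]
print(visual_search(query, CATALOGUE))
```

With these example numbers, the two red seating products rank above the table and the lamp, which is the behaviour a shopper would expect from a “find similar” button.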
Google reverse image search
Imagine you’re looking for an image that shares similarities with one you already have. The perfect tool for that would be Google reverse image search. It allows you to upload an image and perform a search with it. If this image exists on the internet, Google can help you find where it appears, along with visually similar images.
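How Google matches images internally is not described here, so the following is only a sketch of one simple, well-known technique for spotting copies of the same picture: an “average hash”. Each image is reduced to a short string of bits (1 where a pixel is brighter than the image’s average, 0 where it is darker), and two images are considered close when few bits differ. The tiny 4x4 grayscale grids and all names below are invented for the example.

```python
def average_hash(pixels):
    """Bit string with 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Number of positions where the two bit strings differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Hypothetical 4x4 grayscale images (0 = black, 255 = white).
ORIGINAL = [
    [200, 210, 40, 30],
    [190, 220, 50, 20],
    [60, 40, 230, 210],
    [30, 50, 220, 240],
]
# The same picture, slightly brightened.
BRIGHTER = [[min(255, value + 10) for value in row] for row in ORIGINAL]
# A picture with a different pattern.
DIFFERENT = [
    [20, 200, 30, 210],
    [220, 40, 230, 50],
    [30, 210, 20, 200],
    [240, 30, 220, 40],
]

original_hash = average_hash(ORIGINAL)
print(hamming_distance(original_hash, average_hash(BRIGHTER)))
print(hamming_distance(original_hash, average_hash(DIFFERENT)))
```

The brightened copy keeps the same hash as the original because every pixel and the average shift together, while the different picture disagrees on many bits. A real search engine would need far more robust features, but the principle of comparing compact fingerprints instead of full images is the same.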
Real estate websites
Real estate websites contain a lot of images that hold important but unused information about the listings (properties). The AI engines from Co-libry can recognize rooms, detect objects in rooms and recognize the state of each room (renovated, new or in bad condition).
This results in more features that your search engine can process and a better user experience for the website visitor. Find more information about image recognition here.
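As a rough illustration of what “more features for your search engine” could look like, the sketch below assumes that an image recognition step has already labelled each listing photo with a room type, the objects found and the room’s state. The data structure, the labels and the `search` function are hypothetical and are not Co-libry’s actual API or output format.

```python
# Hypothetical recognition output: listing id -> one entry per photo.
LISTINGS = {
    "listing-1": [
        {"room": "kitchen", "objects": ["oven", "island"], "state": "renovated"},
        {"room": "bathroom", "objects": ["bathtub"], "state": "renovated"},
    ],
    "listing-2": [
        {"room": "kitchen", "objects": ["oven"], "state": "bad"},
        {"room": "living room", "objects": ["fireplace"], "state": "new"},
    ],
}


def search(listings, room, state=None, must_have=None):
    """Return ids of listings with a photo matching all given filters."""
    matches = []
    for listing_id, photos in listings.items():
        for photo in photos:
            if photo["room"] != room:
                continue
            if state is not None and photo["state"] != state:
                continue
            if must_have is not None and must_have not in photo["objects"]:
                continue
            matches.append(listing_id)
            break  # one matching photo is enough for this listing
    return matches


print(search(LISTINGS, "kitchen", state="renovated"))
print(search(LISTINGS, "living room", must_have="fireplace"))
```

Filters such as “renovated kitchen” or “living room with a fireplace” only become possible once the photos have been turned into structured labels like these.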
An AI system that processes visual information usually relies on computer vision. Computers that can identify specific objects or categorize images based on their content are performing image recognition.
AI, at this point, is much like a small child. Computer vision gives it the sense of sight, but that doesn’t come with an understanding of the physical universe.