You have a device in your pocket that can answer lots of questions. Want weather info? Need a ride downtown? Can’t remember the exact rate at which glaciers melt? A search takes just a moment.
But can it really answer any question? How about: “What’s that castle in front of me?” Or: “Can you translate the whole restaurant menu?”
So, today we are tackling the subject of Computer Vision.
What Is Computer Vision?
Computer Vision is a branch of computer science that aims to replicate, at least in part, the way human vision works. It allows computers to identify objects in images, videos, and real life, and to extract information from them.
Technicians pioneered this technology around 50 years ago. In 1966, Seymour Papert and Marvin Minsky started a project at the MIT Artificial Intelligence group. They called it the “Summer Vision Project” and hoped to build a system able to distinguish objects from their background in various scenes.
The project turned out to be trickier than they originally thought. Even now, the technology is far from perfect, despite the huge amounts of training data available and the help of AI.
Computer Vision partly replicates human vision
How Does Computer Vision Work?
Computer Vision solutions have a lot in common with pattern recognition. To train the system, technicians show it images in which the objects are already labeled, and the system learns from those examples.
For instance, you might need the computer to recognize different foods, let’s say a tomato. You feed labeled images of tomatoes to the computer and instruct it to apply algorithms that analyze colors, shapes, distance, and depth in those images.
The computer identifies the parameters that characterize the object and “remembers” them. The system can then recognize tomatoes in a collection of unlabeled images.
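The train-then-recognize loop described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each image is reduced to a hypothetical average (red, green, blue) color, the numbers are made up, and the "learning" is a basic nearest-centroid rule rather than anything a production system would use.

```python
from statistics import mean

# Hypothetical training data: one (red, green, blue) average color per labeled image.
LABELED = {
    "tomato": [(200, 40, 35), (215, 55, 40), (190, 30, 30)],
    "cucumber": [(60, 150, 50), (70, 165, 60), (55, 140, 45)],
}

def train(labeled):
    """'Remember' each object as the average of its labeled feature vectors."""
    return {
        label: tuple(mean(channel) for channel in zip(*samples))
        for label, samples in labeled.items()
    }

def classify(centroids, features):
    """Return the label whose remembered average is closest to the new image."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

centroids = train(LABELED)
prediction = classify(centroids, (205, 50, 38))  # an unlabeled, mostly red image
```

Here `prediction` comes out as "tomato" because the unlabeled image is much closer in color to the tomato examples than to the cucumber ones.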
Example of Computer Vision recognizing different objects
How Does Machine Learning Enhance Computer Vision?
In the beginning, Computer Vision had very limited functionality and required a lot of manual coding and human assistance. Technicians had to select the images by hand and mark up data points such as distance, height, and volume.
With so much manual work involved, the systems were brittle: if something about the object’s parameters was a little off, the system couldn’t recognize it. For instance, it could fail to identify the same object shot at a different angle.
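To see why hand-coded rules break so easily, consider this deliberately naive sketch. The rule and its thresholds are hypothetical and exist only to illustrate the problem.

```python
def looks_like_bottle(width_px, height_px):
    """Hand-coded rule (hypothetical): a bottle is about three times taller than wide."""
    ratio = height_px / width_px
    return 2.5 <= ratio <= 3.5

upright = looks_like_bottle(100, 300)      # bottle photographed standing up
on_its_side = looks_like_bottle(300, 100)  # the same bottle lying on a table
```

The upright shot passes the rule, but the same bottle lying on its side does not, because nobody wrote a rule for that angle.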
Machine Learning changed the situation for the better. Developers no longer had to code so many rules by hand; instead, they wrote small programs that could detect patterns in the data. From there, the system was able to learn on its own.
Deep Learning pushed progress even further. Deep Learning relies on neural networks, which extract patterns from the examples they are given. All you need to do is select the images and label the objects. The neural network then extracts the patterns and turns them into a mathematical model that classifies the data.
Machine Learning makes training Computer Vision systems more efficient
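The idea of turning labeled examples into a mathematical model can be shown with a single artificial neuron. The sketch below is an assumption-heavy toy: the two features (redness and roundness, scaled from 0 to 1) and all values are hypothetical, and a real deep network stacks many such units and works on raw pixels.

```python
import math

# Hypothetical labeled examples: (redness, roundness) in 0..1, label 1 = tomato.
EXAMPLES = [
    ((0.90, 0.80), 1), ((0.80, 0.90), 1), ((0.85, 0.70), 1),
    ((0.20, 0.30), 0), ((0.30, 0.90), 0), ((0.10, 0.50), 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=5000, lr=1.0):
    """Fit two weights and a bias with batch gradient descent on log loss."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for x, y in examples:
            error = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            grad_w[0] += error * x[0]
            grad_w[1] += error * x[1]
            grad_b += error
        w = [w[0] - lr * grad_w[0] / n, w[1] - lr * grad_w[1] / n]
        b -= lr * grad_b / n
    return w, b

def is_tomato(model, x):
    """Apply the learned equation: tomato if the predicted probability exceeds 0.5."""
    w, b = model
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b) > 0.5

model = train(EXAMPLES)
```

Nobody writes a rule such as "tomatoes are red" here. The weights in `model` are adjusted from the labeled examples alone, and `is_tomato(model, (0.85, 0.8))` then applies the resulting equation to an image the system has not seen.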
How Can Consumers Benefit From Computer Vision?
Computer Vision systems are already being used in various ways that benefit both the business and the customer. Here are some examples.
Amazon Go
Amazon opened its first cashier-less store in Seattle, Washington, to employees in 2016 and to the public in January 2018. It managed to streamline the shopping experience: you go in, take the goods you need, and walk out.
The interesting part happens inside the store. From the moment customers enter, they are constantly monitored. Different technologies are applied at each phase of the shopping process:
- Entering the store: QR code. To get into the Amazon Go store, customers need to have the Amazon Go application installed on their own device. Upon entering, the customer scans the QR code at the turnstile and gets in.
- In the store: Computer Vision. When the customer picks an item from the shelf, Amazon uses Computer Vision algorithms to identify the item. Deep Learning makes the system more accurate over time, which makes it hard to steal anything.
- Leaving the store: RFID. Having chosen everything they want to buy, customers can leave the store through the turnstile. Amazon has RFID tags installed there, so when the customer passes them, the purchase is charged to the customer’s Amazon account.
Computer Vision detects everything you take or put back on the shelf, and the system updates your virtual cart. Source: Amazon
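The virtual cart logic can be pictured as a stream of "take" and "put back" events. The sketch below is hypothetical: the event names and data shapes are assumptions made for illustration, not Amazon's actual design.

```python
from collections import Counter

def build_cart(events):
    """Replay (action, item) events from the vision system into a virtual cart."""
    cart = Counter()
    for action, item in events:
        if action == "take":
            cart[item] += 1
        elif action == "put_back" and cart[item] > 0:
            cart[item] -= 1
    # Drop items whose quantity fell back to zero.
    return {item: qty for item, qty in cart.items() if qty > 0}

events = [
    ("take", "milk"),
    ("take", "bread"),
    ("put_back", "milk"),
    ("take", "bread"),
]
cart = build_cart(events)
```

After these events, the cart holds two loaves of bread and no milk, which is what the customer would be charged for on the way out.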
Facial Recognition
Facial Recognition is taking over from fingerprint scanners as a means of authentication. Many tech companies, such as Samsung, Apple, and Google, have implemented facial recognition into their smartphones.
When setting up facial recognition, users scan their face at various angles with the smartphone. One setup is enough for the technology to work reliably.
Users can rely on facial recognition to open their smartphones, gain access to banking apps, and confirm purchases.
Also, the same technology is used in surveillance systems to ensure security, and in education and retail applications.
Apple’s Face ID does not just identify a “photo” of your face. It also measures the depth of the face, so it can’t be tricked with a picture
Google Lens
Google has been experimenting with object recognition for some time now. One of its early consumer products to use Computer Vision was Google Goggles, which launched in 2009.
The application, which has since been discontinued, allowed for searches based on images. The user would launch the app and point the smartphone camera at a certain object. The application captured the image and compared it to other images in Google’s database.
To enhance image recognition, the application also sent location data to the servers. This helped, for example, with identifying tourist attractions.
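The combination of image comparison and location data can be sketched as follows. Everything here is an assumption made for illustration: the landmark names, coordinates, and three-number image descriptors are invented, and the matching rule (cosine similarity among landmarks within a radius) is a generic technique, not Google's actual method.

```python
import math

# Hypothetical database: name -> (latitude, longitude, image descriptor).
LANDMARKS = {
    "Castle A": (50.00, 14.00, (0.90, 0.10, 0.30)),
    "Castle B": (48.00, 2.00, (0.88, 0.12, 0.31)),
    "Bridge C": (50.01, 14.01, (0.10, 0.80, 0.50)),
}

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def identify(descriptor, lat, lon, radius_km=5.0):
    """Pick the most similar landmark among those near the user's location."""
    nearby = {
        name: desc
        for name, (la, lo, desc) in LANDMARKS.items()
        if haversine_km(lat, lon, la, lo) <= radius_km
    }
    if not nearby:
        return None
    return max(nearby, key=lambda name: cosine_similarity(descriptor, nearby[name]))

match = identify((0.88, 0.12, 0.31), lat=50.005, lon=14.005)
```

The photo's descriptor is identical to that of "Castle B", but that landmark is hundreds of kilometers away, so the location filter leaves only the nearby candidates and the result is "Castle A". This is the kind of ambiguity that location data helps resolve.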
Google Goggles brought basic image recognition capabilities to consumers
Google Lens, released in 2017, expands the functionality of its predecessor. Apart from identifying buildings and brands, it can find and translate text in many languages, scan QR codes and barcodes, and look up products and foods.
Google used deep learning to enhance the app’s detection capabilities. And that’s just the beginning. During Google I/O 2019, the developers announced new features: the app would be able to recommend dishes from a menu, calculate tips, and split bills.
Google Lens uses AI to improve its Computer Vision
How Does Computer Vision Streamline Production?
Computer Vision lends a helping hand in production as well. Here’s how manufacturers use it:
Predictive Maintenance
Equipment breakdowns are costly for manufacturers: in the automotive industry, for instance, one minute of downtime caused by a breakdown can cost around $20,000.
A company called Fanuc has a trick up its sleeve. To decrease downtime as much as possible, the company installs Computer Vision cameras on its robots.
While the robots assemble vehicles, the cameras detect potential flaws in construction and send the data to the cloud for analysis. Based on that data, the manufacturer can apply predictive maintenance patterns.
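One simple way to turn such camera data into a maintenance signal is to watch a rolling average of detected flaws. This is a hypothetical sketch: the window size, the threshold, and the flaw counts are all invented, and it does not describe Fanuc's actual analysis.

```python
from collections import deque

def maintenance_alerts(flaw_counts, window=3, threshold=2.0):
    """Return batch indices where the rolling mean of flaws exceeds the threshold."""
    recent = deque(maxlen=window)
    alerts = []
    for index, count in enumerate(flaw_counts):
        recent.append(count)
        if len(recent) == window and sum(recent) / window > threshold:
            alerts.append(index)
    return alerts

# Hypothetical flaws detected per production batch by the robot's camera.
alerts = maintenance_alerts([0, 1, 0, 1, 2, 3, 4])
```

With these numbers, the only alert is raised at the last batch, where the average of the three most recent counts climbs above the threshold. A rising trend like this could prompt a service visit before the robot actually breaks down.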
Product Assembly
Machines can make mistakes throughout the production process: put a product in the wrong box, fail to screw parts together tightly enough, and so on. Computer Vision helps inspect the condition of the product and make sure that it is assembled and packaged correctly.
Computer Vision helps oversee the production process from start to finish
Defect Reduction
Apart from finding potential problems that can be fixed, manufacturers are also looking to reduce unfixable production faults. A company called SheltonVision created the WebSPECTOR system to find defects quickly.
The system relies on Computer Vision to inspect the production line and find flawed items. The system then categorizes defects into different types. This helps operators decide if the production line needs to be stopped to fix the defect.
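The decision step can be illustrated with a small sketch. The defect types, the split into critical and minor defects, and the limit are all hypothetical; they are not taken from WebSPECTOR.

```python
from collections import Counter

# Hypothetical defect types that justify stopping the line immediately.
CRITICAL_TYPES = {"tear", "hole"}

def should_stop_line(defects, max_minor=3):
    """Stop on any critical defect, or when minor defects exceed the limit."""
    counts = Counter(defects)
    critical = sum(counts[kind] for kind in CRITICAL_TYPES)
    minor = sum(counts.values()) - critical
    return critical > 0 or minor > max_minor

keep_running = should_stop_line(["stain", "stain"])
stop_for_tear = should_stop_line(["stain", "tear"])
```

Two stains leave the line running, while a single tear stops it. Categorizing the defects first is what lets the operator, or the software, make that call.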
It’s Easy to See the Value of Computer Vision
Computer Vision opens new ways for users to interact with technology. Originally, such systems were hard to configure and train, but Machine Learning has made them far easier to build and more approachable.
Many companies have now introduced Computer Vision into their workflows and products. The technology helps individual users log into their devices easily, confirm transactions, shop seamlessly, and look up on the web anything they see.
As for the industry, Computer Vision allows manufacturers to better monitor the production process, apply predictive maintenance models, and discover defects.
HQSoftware Founder
Having founded the company in 2001, uses his broad knowledge to drive the company forward. Ready to share his wisdom on software development and technology insights
Sergei Vardomatski
Founder