Computer Vision Object Recognition System

1175 Words5 Pages

We, humans, perceive the three-dimensional structure of the world easily. Perceptual psychol- ogists have spent many decades to understand the visual system of humans and mammalians. The visual cortex is responsible for processing the visual information coming from the retina. They even developed optical illusions that trick the visual cortex.
Computer Vision is a discipline of Computer Science that studies methods for artificial systems for extraction of meaningful information from images. It studies all the steps starting from acquiring, precessing, analyzing, and ending to understanding images. In general images are high dimensional data from the real world. Computer vision aims at producing numerical and symbolic information that machines …show more content…

2012 [1]. The authors use a computer generated animation movie in order to generate ground truths for optical flow, depth and segementation of objects. This is used later for Computer Vision systems evaluation.
1.1.3 What is Object Recognition
In the field of Computer Vision, object recognition is the task of finding and categorizing objects in an image or in a video. Humans are able to perform this task with almost no effort, the visual system being capable of detecting objects at different scales and sizes, and even when partially occluded. This task is still a challenge for computer vision systems. This has been ongoing research over a couple of decades.
"Classical" methods used SIFT + Fisher Vectors, Sparse Coding, HOG (Histogram of Oriented Gradients) etc. In Figure 1.1, you can see clear error rate improvement from 2012 - onwards. Top-5 error rate dropped by a large margin of 10 % in 2012, from 26 % to 16%.(AlexNet )[2]. The drop was astonioshing. You can also see the increase in the number of entries that use GPGPU and convolutional neural networks for …show more content…

INTRODUCTION 3 over 1 200 000 images
• Current advances in highly parallel computation, especially on General Purpose GPU (GPGPU). The progress in hardware allowed for more complex models to be trained and stored in memory.
• Development of efficient ways of training convolutional neural networks using GPU computing. After the 2005 paper that outlined the importance of GPGPU for ma- chine learning,[3] several publications described more efficient ways to train convolutional neural networks using GPU computing. For instance, new, better model regularization strategies were developed, such as Dropout (Hinton et al., 2012) [4].
The task is divided in 3 subtasks:
• Image Classification. This task implies saying if an image contains an object, or not with some degree of certainty. The system doesn 't have to say where in the image the object resides. This task has been a challenge for many dacades, but with the advent of Convolutional Neural Networks, the error rate dropped dramatically to less than 5 % with Deep Residual Networks [5] on ILSVRC2015

Open Document