Abstract
Active object recognition provides a mechanism for selecting informative viewpoints to complete
recognition tasks as quickly and accurately as possible. One can manipulate the position of the camera
or the object of interest to obtain more useful information. This approach can improve the computational
efficiency of the recognition task by only processing viewpoints selected based on the amount
of relevant information they contain. Active object recognition methods centre on two questions: how to select
the next best viewpoint and how to integrate the information extracted from each view. Most active recognition
methods do not use local interest points, which have been shown to work well in other recognition
tasks, and they are tested on images containing a single object with no occlusions or clutter.
In this thesis we investigate using local interest points (SIFT) in probabilistic and non-probabilistic
settings for active recognition of single and multiple objects and their viewpoints/poses. The test images contain
objects that are occluded and occur in significant clutter. Visually similar objects are also included
in our dataset. Initially we introduce a non-probabilistic 3D active object recognition system
which consists of a mechanism for selecting the next best viewpoint and an integration strategy to
provide feedback to the system. A novel approach to weighting the uniqueness of features extracted
is presented, using a vocabulary tree data structure. This process is then used to determine the next
best viewpoint by selecting the one with the highest number of unique features. A Bayesian framework
uses the modified statistics from the vocabulary structure to update the system’s confidence in
the identity of the object. New test images are captured only when the belief in the current hypothesis is below
a predefined threshold. This vocabulary tree method is tested against randomly selecting the next
viewpoint and a state-of-the-art active object recognition method by Kootstra et al. [1]. Our approach
outperforms both methods by correctly recognizing more objects with less computational expense.
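
The loop described above (update a belief over object identities, stop once it passes a threshold, otherwise move to the viewpoint with the most unique features) can be sketched as follows. This is a minimal illustration and not the thesis implementation: the per-viewpoint likelihoods and unique-feature counts are made-up numbers, the function names and the 0.9 threshold are hypothetical, and the vocabulary tree that would produce these statistics is not modelled.

```python
import numpy as np

def bayes_update(prior, likelihood):
    """Return the posterior over object identities after one observation."""
    posterior = prior * likelihood
    return posterior / posterior.sum()

def next_viewpoint(unique_counts, visited):
    """Pick the unvisited viewpoint with the highest unique-feature count."""
    candidates = [v for v in range(len(unique_counts)) if v not in visited]
    return max(candidates, key=lambda v: unique_counts[v])

def recognise(likelihoods, unique_counts, start=0, threshold=0.9):
    """Capture views until the belief passes the threshold or views run out.

    likelihoods[v, o] is an assumed p(observation at viewpoint v | object o).
    """
    n_views, n_objects = likelihoods.shape
    belief = np.full(n_objects, 1.0 / n_objects)  # uniform prior
    visited = []
    view = start
    while True:
        visited.append(view)
        belief = bayes_update(belief, likelihoods[view])
        if belief.max() >= threshold or len(visited) == n_views:
            return int(belief.argmax()), belief, visited
        view = next_viewpoint(unique_counts, visited)

# Illustrative numbers only: 3 viewpoints, 3 candidate objects.
likelihoods = np.array([
    [0.5, 0.5, 0.5],   # viewpoint 0 does not discriminate
    [0.8, 0.1, 0.1],   # viewpoint 1 favours object 0
    [0.6, 0.3, 0.1],   # viewpoint 2 favours object 0 weakly
])
unique_counts = [2, 9, 5]  # hypothetical unique-feature counts per viewpoint

label, belief, visited = recognise(likelihoods, unique_counts)
print(label, visited, belief.round(3))
```

With these numbers the first view leaves the belief uniform, so a second and then a third view are captured before the threshold is reached; lowering the threshold ends the sequence earlier, which is the source of the computational saving.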
This vocabulary tree method is extended for use in a probabilistic setting to improve the object recognition
accuracy. We introduce Bayesian approaches for object recognition and for object and pose recognition.
Three likelihood models are introduced which incorporate various parameters and levels of
complexity. The occlusion model, which includes geometric information and variables that cater
for the background distribution and occlusion, correctly recognizes all objects on our challenging
database. This probabilistic approach is further extended for recognizing multiple objects and poses
in a test image. We show through experiments that this model can recognize multiple objects which
occur in close proximity to distractor objects. Our viewpoint selection strategy is also extended to the
multiple object application and performs well when compared to randomly selecting the next viewpoint,
the activation model [1] and mutual information. We also study the impact of using active
vision for shape recognition. Fourier descriptors are used as input to our shape recognition system
with mutual information as the active vision component. We build multinomial and Gaussian distributions
using this information, and the resulting models correctly recognize a sequence of objects.
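
As a rough illustration of mutual information as a viewpoint-selection criterion, the sketch below scores each candidate viewpoint by the mutual information between object identity and a discrete observation, then picks the highest-scoring one. It assumes observations have been quantised into a small set of symbols (for instance, binned descriptor values) so that a multinomial observation model applies; the function names and the probability tables are hypothetical and are not taken from the thesis.

```python
import numpy as np

def mutual_information(belief, obs_model):
    """Mutual information (in nats) between object identity and observation.

    belief:    shape (n_objects,), current probability of each object.
    obs_model: shape (n_objects, n_symbols), rows are p(symbol | object)
               for one viewpoint and each row sums to 1.
    """
    joint = belief[:, None] * obs_model          # p(object, symbol)
    marginal = joint.sum(axis=0)                 # p(symbol)
    indep = belief[:, None] * marginal[None, :]  # p(object) * p(symbol)
    mask = joint > 0                             # skip zero-probability terms
    return float(np.sum(joint[mask] * np.log(joint[mask] / indep[mask])))

def best_viewpoint(belief, obs_models):
    """Return the index of the most informative viewpoint and all scores."""
    scores = [mutual_information(belief, m) for m in obs_models]
    return int(np.argmax(scores)), scores

# Illustrative numbers only: 2 objects, 2 observation symbols, 2 viewpoints.
belief = np.array([0.5, 0.5])
obs_models = [
    np.array([[0.7, 0.3], [0.7, 0.3]]),  # both objects look the same here
    np.array([[1.0, 0.0], [0.0, 1.0]]),  # objects are fully separable here
]

view, scores = best_viewpoint(belief, obs_models)
print(view, scores)
```

A viewpoint from which all candidate objects produce the same observation distribution scores zero, while one that separates two equally likely objects perfectly scores log 2, so the criterion prefers views that are expected to reduce uncertainty the most.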
We demonstrate the effectiveness of active vision in object recognition systems. We show that even
in different recognition applications using different low level inputs, incorporating active vision improves
the overall accuracy and decreases the computational expense of object recognition systems.
Govender, N (2021). Active Object Recognition For 2d And 3d Applications. Afribary. Retrieved from https://track.afribary.com/works/active-object-recognition-for-2d-and-3d-applications