
Learning Objects Model and Context for Recognition and Localisation

Guido Manfredi 1
1 LAAS-RAP - Équipe Robotique, Action et Perception
LAAS - Laboratoire d'analyse et d'architecture des systèmes
Abstract : This thesis addresses the modeling, recognition and localisation of objects, and the use of context, for object manipulation by a robot. We start by presenting the components of the modeling process: the real system, the sensor data, the properties to reproduce and the model. By specifying them, one defines a modeling process adapted to the problem at hand, namely object manipulation by a robot. This analysis leads us to adopt local textured descriptors for object modeling. Modeling with local textured descriptors is not a new concept; it is the subject of many Structure from Motion (SfM) and Simultaneous Localisation and Mapping (SLAM) works. Existing methods include Bundler, the RoboEarth modeler and 123D Catch. Still, no method has gained widespread adoption. By implementing a similar approach, we show that they are hard to use even for expert users and produce highly complex models. Such complex techniques are necessary to guarantee the robustness of the model to viewpoint changes. There are two ways to handle the problem: the multiple views paradigm and the robust features paradigm. The multiple views paradigm advocates using a large number of views of the object. The robust features paradigm relies on features able to withstand large viewpoint changes. We present a set of experiments that provide insight into the right balance between the two. By varying the number of views and using different features, we show that small and fast models can provide robustness to viewpoint changes up to bounded blind spots, which can be handled by robotic means. We propose four different methods to build simple models from images only, with as little a priori information as possible. The first one applies to piecewise planar objects and relies on homographies for localisation. The second approach is applicable to objects with simple geometry but requires many measurements on the object. The third method requires a calibrated 3D sensor but no additional information.
The fourth technique does not need a priori information at all. We use this last method to model objects from images automatically retrieved from a grocery store website. Even with light models, real situations require numerous object models to be stored and processed. This raises two problems: complexity, i.e. processing multiple models quickly, and ambiguity, i.e. distinguishing similar objects. We propose to solve both by using contextual information. Contextual information is any information that helps recognition and is not directly provided by sensors. We focus on two contextual cues: the place and the surrounding objects. Some objects are mainly found in particular places. By knowing the current place, one can restrict the number of possible identities for a given object. We propose a method to autonomously explore a previously labeled environment and establish a correspondence between objects and places. This information can then be used in a cascade combining simple visual descriptors and context. This experiment shows that, for some objects, recognition can be achieved with as few as two simple features and the location as context. This thesis stresses the good match between robotics, context and object recognition.

Cited literature [143 references]
Contributor : Christine Fourcade
Submitted on : Thursday, October 26, 2017 - 9:39:26 AM
Last modification on : Thursday, June 10, 2021 - 3:06:57 AM
Long-term archiving on : Saturday, January 27, 2018 - 12:59:51 PM


Files produced by the author(s)


  • HAL Id : tel-01624066, version 1


Guido Manfredi. Learning Objects Model and Context for Recognition and Localisation. Automatic. Université Paul Sabatier - Toulouse III, 2015. English. ⟨tel-01624066⟩


