Learning Objects Model and Context for Recognition and Localisation

Guido Manfredi

Abstract

This thesis addresses the modelling, recognition and localisation of objects, and the use of context, for object manipulation by a robot. We start by presenting the components of the modelling process: the real system, the sensor data, the properties to reproduce and the model. Specifying them defines a modelling process adapted to the problem at hand, namely object manipulation by a robot. This analysis leads us to adopt local textured descriptors for object modelling. Modelling with local textured descriptors is not a new concept; it is the subject of many Structure from Motion (SfM) and Simultaneous Localisation and Mapping (SLAM) works. Existing methods include Bundler, the RoboEarth modeler and 123D Catch. Still, no method has gained widespread adoption. By implementing a similar approach, we show that these methods are hard to use, even for expert users, and produce highly complex models.

Such complex techniques are necessary to guarantee the robustness of the model to viewpoint change. There are two ways to handle the problem: the multiple views paradigm and the robust features paradigm. The multiple views paradigm advocates using a large number of views of the object. The robust features paradigm relies on features able to withstand large viewpoint changes. We present a set of experiments that give insight into the right balance between the two. By varying the number of views and using different features, we show that small and fast models can provide robustness to viewpoint changes, up to bounded blind spots which can be handled by robotic means. We propose four different methods to build simple models from images only, with as little a priori information as possible. The first applies to piecewise planar objects and relies on homographies for localisation. The second is applicable to objects with simple geometry but requires many measurements of the object. The third requires a calibrated 3D sensor but no additional information. The fourth needs no a priori information at all. We use this last method to model objects from images automatically retrieved from a grocery store website.

Even with light models, real situations require numerous object models to be stored and processed. This raises two problems: complexity (processing multiple models quickly) and ambiguity (distinguishing similar objects). We propose to solve both by using contextual information. Contextual information is any information that helps recognition and is not directly provided by the sensors. We focus on two contextual cues: the place and the surrounding objects. Some objects are mainly found in particular places. By knowing the current place, one can restrict the number of possible identities for a given object. We propose a method to autonomously explore a previously labelled environment and establish a correspondence between objects and places. This information can then be used in a cascade combining simple visual descriptors and context. This experiment shows that, for some objects, recognition can be achieved with as few as two simple features and the location as context. This thesis stresses the good match between robotics, context and object recognition.
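The idea of restricting candidate identities by place and then applying a cascade of simple features can be sketched as follows. This is a minimal illustration under stated assumptions, not the method implemented in the thesis: the function names (`build_place_index`, `recognise`), the feature names and the toy object models are all hypothetical.

```python
from collections import defaultdict


def build_place_index(observations):
    """Map each place label to the set of object labels observed there.

    `observations` is an iterable of (place, object) pairs, standing in for
    what an exploration of a labelled environment might record (assumption).
    """
    index = defaultdict(set)
    for place, obj in observations:
        index[place].add(obj)
    return index


def recognise(features, place, place_index, models):
    """Cascade: the place prunes the candidates, then simple features filter them.

    `features` is an ordered list of (feature_name, observed_value) pairs.
    `models` maps an object label to a dict of its simple features.
    If the place is unknown, every modelled object remains a candidate.
    """
    candidates = set(place_index.get(place, models.keys()))
    for name, value in features:
        candidates = {c for c in candidates if models[c].get(name) == value}
        if len(candidates) <= 1:
            break  # unambiguous (or no match): stop the cascade early
    return candidates


# Hypothetical toy data, for illustration only.
observations = [
    ("kitchen", "cereal_box"),
    ("kitchen", "mug"),
    ("office", "stapler"),
    ("office", "mug"),
]
models = {
    "cereal_box": {"shape": "box", "colour": "yellow"},
    "mug": {"shape": "cylinder", "colour": "white"},
    "stapler": {"shape": "box", "colour": "black"},
}

place_index = build_place_index(observations)
with_context = recognise([("shape", "box")], "kitchen", place_index, models)
without_context = recognise([("shape", "box")], None, place_index, models)
```

In this toy setting, a single feature is ambiguous without context (both box-shaped objects remain), while knowing the place leaves one candidate. Real descriptors would be noisy, so an exact-equality test like the one above would need to be replaced by a similarity threshold.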

To interact with the world, a robot must know, recognise and localise in space the objects that surround it. To know an object, the robot must model it. Current techniques are heavy, complex for the user and require the object to be present. As for recognition, many techniques exist, but the more objects are considered, the less reliable they become. Finally, localisation relies on the two previous steps and will fail if they lack accuracy. The first part of this thesis shows that light object models are sufficient and that they can be built automatically from the Internet. The second part demonstrates that using context (place, time, etc.) improves recognition. This work concludes by emphasising the advantage of simple models and stressing the importance of contextual information.

