Learning Object Localization and 6D Pose Estimation from Simulation and Weakly Labeled Real Images

Jean-Philippe Mercier; Chaitanya Mitash; Philippe Giguère; Abdeslam Boularias
Abstract:

Accurate pose estimation is often a requirement for robust robotic grasping and manipulation of objects placed in cluttered, tight environments, such as a shelf with multiple objects. When deep learning approaches are employed to perform this task, they typically require a large amount of training data. However, obtaining precise six degrees-of-freedom (6D) ground-truth poses can be prohibitively expensive. This work therefore proposes an architecture and a training process to address this issue. More precisely, we present a weak object detector that enables localizing objects and estimating their 6D poses in cluttered and occluded scenes. To minimize the human labor required for annotations, the proposed detector is trained with a combination of synthetic images and a few weakly annotated real images (as few as 10 images per object), for which a human provides only a list of the objects present in each image (no time-consuming annotations, such as bounding boxes, segmentation masks and object poses, are needed). To close the gap between real and synthetic images, we use multiple domain classifiers trained adversarially. During the inference phase, the resulting class-specific heatmaps of the weak detector are used to guide the search for the 6D poses of objects. Our proposed approach is evaluated on several publicly available datasets for pose estimation. We also evaluate our model on classification and localization in unsupervised and semi-supervised settings. The results clearly indicate that this approach could provide an efficient way toward fully automating the training process of computer vision models used in robotics.
Published in: 2019 International Conference on Robotics and Automation (ICRA)
Date of Conference: 20-24 May 2019
Date Added to IEEE Xplore: 12 August 2019
DOI: 10.1109/ICRA.2019.8794112
Conference Location: Montreal, QC, Canada

I. Introduction

Robotic manipulators are increasingly deployed in challenging situations that include significant occlusion and clutter. Prime examples are warehouse automation and logistics, where such manipulators are tasked with picking up specific items from dense piles of a large variety of objects, as illustrated in Fig. 1. The difficult nature of this task was highlighted during the recent Amazon Robotics Challenges [1]. These robotic manipulation systems are generally endowed with a perception pipeline that starts with object recognition, followed by estimation of the object's six degrees-of-freedom (6D) pose. Pose estimation is known to be a computationally challenging problem, largely due to the combinatorial nature of the corresponding global search problem. A typical strategy for pose estimation methods [2]–[5] consists of generating a large number of candidate 6D poses for each object in the scene and refining these hypotheses with the Iterative Closest Point (ICP) [6] method or its variants. The computational efficiency of this search is directly affected by the number of pose hypotheses. Reducing the number of candidate poses is thus an essential step towards real-time grasping of objects.
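To make the refinement step concrete, the following is a minimal sketch of point-to-point ICP applied to a single pose hypothesis. It is not the implementation used in [2]–[6]: for brevity it works in the plane (one rotation angle and a 2D translation instead of a full 6D pose), uses brute-force nearest-neighbour matching, and all function names (`best_rigid_2d`, `transform`, `icp_2d`) are hypothetical. Each iteration matches every model point to its closest scene point and then solves for the rigid transform in closed form.

```python
import math

def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src -> dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    s_dot = s_cross = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - sx, ay - sy, bx - dx, by - dy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation aligns the rotated source centroid with the target centroid.
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)

def transform(points, theta, tx, ty):
    """Apply a planar rigid transform to a list of (x, y) points."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def icp_2d(model, scene, init=(0.0, 0.0, 0.0), iters=20):
    """Refine one pose hypothesis `init` = (theta, tx, ty) of `model` in `scene`."""
    theta, tx, ty = init
    for _ in range(iters):
        moved = transform(model, theta, tx, ty)
        # Brute-force nearest neighbour: O(len(model) * len(scene)) per iteration.
        matches = [min(scene, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
                   for p in moved]
        # Re-estimate the full model-to-scene transform from the current matches.
        theta, tx, ty = best_rigid_2d(model, matches)
    return theta, tx, ty
```

Because this loop runs once per candidate pose, the total cost grows linearly with the number of hypotheses, which is why pruning candidates before refinement matters.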

Fig. 1. Overview of our approach for 6D pose estimation at inference time. This figure shows the pipeline for the drill object of the YCB-video dataset [7]. A deep learning model is trained with weakly annotated images. Extracted class-specific heatmaps, along with 3D models and the depth image, guide the Stochastic Congruent Sets (StoCS) method [8] to estimate 6D object poses. Further details of the network are available in Section III.
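One way to picture how a class-specific heatmap and a depth image can jointly guide the pose search is sketched below. This is an illustrative assumption, not the StoCS implementation [8]: pixels are drawn with probability proportional to their heatmap value and back-projected to 3D camera coordinates with a pinhole model, so that likely object pixels contribute most of the points used to form pose hypotheses. The function names, the `intrinsics` tuple `(fx, fy, cx, cy)`, and the nested-list image layout are all hypothetical.

```python
import random

def backproject(u, v, z, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at depth z to camera-frame (x, y, z)."""
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)

def sample_heatmap_points(heatmap, depth_image, intrinsics, n_samples, rng=None):
    """Draw 3D points, weighting each pixel by its class heatmap value.

    heatmap and depth_image are row-major nested lists of equal shape;
    pixels with zero heatmap value or invalid (non-positive) depth are skipped.
    """
    rng = rng or random.Random()
    fx, fy, cx, cy = intrinsics
    pixels, weights = [], []
    for v, row in enumerate(heatmap):
        for u, p in enumerate(row):
            z = depth_image[v][u]
            if p > 0.0 and z > 0.0:
                pixels.append((u, v, z))
                weights.append(p)
    if not pixels:
        return []
    # Sampling with replacement, proportional to the heatmap weights.
    chosen = rng.choices(pixels, weights=weights, k=n_samples)
    return [backproject(u, v, z, fx, fy, cx, cy) for u, v, z in chosen]
```

Concentrating samples on high-probability pixels is what lets a heatmap shrink the set of candidate poses compared with sampling the whole depth image uniformly.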
