UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision Using Diverse Datasets and Limited Memory

Abstract:

In this work we train, in an end-to-end manner, a convolutional neural network (CNN) that jointly handles low-, mid-, and high-level vision tasks in a unified architecture. Such a network can act like a Swiss knife for vision tasks; we call it an UberNet to indicate its overarching nature. The main contribution of this work is handling the challenges that emerge when scaling up to many tasks. We introduce techniques that facilitate (i) training a deep architecture while relying on diverse training sets and (ii) training many (potentially unlimited) tasks with a limited memory budget. This allows us to train, end to end, a unified CNN architecture that jointly handles (a) boundary detection, (b) normal estimation, (c) saliency estimation, (d) semantic segmentation, (e) human part segmentation, (f) semantic boundary detection, and (g) region proposal generation and object detection. We obtain competitive performance while jointly addressing all tasks in 0.7 seconds on a GPU. Our system will be made publicly available.
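
The abstract does not spell out how diverse training sets are combined, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes that each training sample is annotated for a subset of the tasks, and that a task's loss term should be averaged only over the samples that carry ground truth for it. The task names, the function `masked_multitask_loss`, and the `task_weights` parameter are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical task names, loosely following the list in the abstract.
TASKS = ["boundaries", "normals", "saliency", "semantic_segmentation"]


def masked_multitask_loss(
    sample_losses: List[Dict[str, Optional[float]]],
    task_weights: Optional[Dict[str, float]] = None,
) -> float:
    """Sum per-task losses, averaging each task over its annotated samples.

    A value of None (or a missing key) means "this sample has no ground
    truth for this task", so it contributes nothing to that task's term.
    """
    weights = task_weights or {}
    total = 0.0
    for task in TASKS:
        values = [s[task] for s in sample_losses if s.get(task) is not None]
        if not values:
            continue  # no sample in this batch is annotated for this task
        total += weights.get(task, 1.0) * sum(values) / len(values)
    return total


# Two samples drawn from datasets that annotate different tasks.
batch = [
    {"boundaries": 0.5, "normals": None, "saliency": 0.25},
    {"boundaries": 1.5, "normals": 2.0, "saliency": None},
]
print(masked_multitask_loss(batch))  # 1.0 + 2.0 + 0.25 = 3.25
```

In this sketch a dataset that lacks labels for a task simply leaves that task's loss term untouched, which is one plausible reading of "relying on diverse training sets"; the memory-saving techniques mentioned in the abstract are not modeled here.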

Published in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Date of Conference: 21-26 July 2017

Date Added to IEEE Xplore: 09 November 2017

ISBN Information:

Print ISSN: 1063-6919

DOI: 10.1109/CVPR.2017.579

Conference Location: Honolulu, HI, USA

1. Introduction

Computer vision involves a host of tasks, such as boundary detection, semantic segmentation, surface estimation, object detection, and image classification, to name a few. While Convolutional Neural Networks (CNNs) [32] have been shown to handle most vision tasks effectively, most works in the current literature focus on individual tasks and devote all of a CNN's power to maximizing task-specific performance. In our understanding, a joint treatment of multiple problems can not only result in simpler and faster models but also act as a catalyst for reaching out to other fields. One can expect that such all-in-one, “Swiss knife” architectures will become indispensable for general AI, involving, for instance, robots that can recognize the scene they are in, identify objects, navigate towards them, and manipulate them.
