
Transferring Information Between Neural Networks


Abstract:

This paper investigates techniques to transfer information between deep neural networks. We demonstrate that a student network, which has access to information computed by a teacher network on the training data, learns faster, can be less deep, and requires fewer labeled examples to achieve a given performance level. To this end, we force the student to mimic the teacher by adding a penalty term to the student's objective. We evaluate different penalty terms: (1) the mean squared error between the cost gradients, (2) the Jacobian of the pre-softmax layer, (3) its row-summed version, (4) the cost gradient differences as in standard double backpropagation, and (5) a targeted double backpropagation via gradient-derived masks. The Jacobian method improves accuracy in proportion to the difference in training examples, in contrast to the cost-gradient method. If the difference in accuracy between teacher and student is large enough, we find an improvement from the Jacobian information even if both have seen the same training data. This indicates that information transfer has a regularization effect.
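The abstract sketches the general recipe: train the student on the usual classification cost plus a penalty that matches a gradient-based quantity computed by the teacher. The snippet below is a minimal sketch, not the authors' code, of penalty term (1), the mean squared error between the cost gradients with respect to the input, written in PyTorch. The names student, teacher, optimizer, and the weight lam are hypothetical placeholders; the double-backpropagation ingredient is the create_graph=True call, which keeps the student's input gradient differentiable so the penalty itself can be optimized.

import torch
import torch.nn.functional as F

def transfer_step(student, teacher, x, y, optimizer, lam=0.1):
    """One training step with penalty (1): MSE between the cost gradients
    of student and teacher with respect to the input batch x.
    Hypothetical sketch; assumes teacher is a fixed, pre-trained network."""
    x = x.clone().requires_grad_(True)

    # Teacher's cost gradient w.r.t. the input; detached because the teacher
    # only provides a fixed target.
    t_cost = F.cross_entropy(teacher(x), y)
    t_grad = torch.autograd.grad(t_cost, x)[0].detach()

    # Student's cost and its input gradient; create_graph=True makes the
    # gradient differentiable (double backpropagation).
    s_cost = F.cross_entropy(student(x), y)
    s_grad = torch.autograd.grad(s_cost, x, create_graph=True)[0]

    # Student objective = task cost + gradient-matching penalty.
    loss = s_cost + lam * F.mse_loss(s_grad, t_grad)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

The other penalty terms listed in the abstract differ only in what is matched: for (2), for instance, one would match the Jacobians of the pre-softmax outputs with respect to the input rather than the gradient of the scalar cost.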
Date of Conference: 15-20 April 2018
Date Added to IEEE Xplore: 13 September 2018
Electronic ISSN: 2379-190X
Conference Location: Calgary, AB, Canada

1. Introduction

Due to increased computational capacity, the availability of open-source datasets, and advances in theoretical research, Deep Neural Networks (DNNs) currently achieve excellent performance in a wide range of applications, e.g., image classification [8] and quality assessment [3], natural language processing [4], genomics [16], or strategic game playing [19]. Although they perform well on their respective measures, DNNs suffer from a high computational cost during inference, as architectures may contain billions of trainable parameters [5], and from interpretability issues. This limits their usability for certain tasks, for example offline speech recognition on a mobile device, or transcriptomics, where one would like to know which DNA motif caused the protein to bind.

