1991

Direct Transfer of Learned Information Among Neural Networks

Lorien Y. Pratt, Jack Mostow, Candace A. Kamm


AI summary

This paper investigates transferring learned information, encoded as weights, directly between neural networks trained with back-propagation, with the goal of speeding up learning. Transferring weights from smaller networks trained on subtasks yielded speedups of up to an order of magnitude on speech recognition tasks.

Main Contributions

  • Investigated direct transfer of information encoded as weights between neural networks for related tasks.
  • Conducted an exploratory study of how pre-setting network weights affects subsequent learning speed, finding that pre-set weights with high magnitudes lead to faster learning.
  • Achieved speedups of up to an order of magnitude in back-propagation learning for speech recognition tasks by transferring weights from smaller networks trained on subtasks.
  • Demonstrated transfer from multiple source networks to different portions of a target network using a problem decomposition technique.
  • Provided insights into the dynamics of back-propagation learning with pre-set weights, including the importance of weight magnitudes for retaining hyperplane positions.
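The transfer step described in the contributions above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `transfer_hidden_layer`, the `scale` and `n_extra_units` parameters, the random range for extra units, and the assumption that every source network sees the same input vector as the target are all hypothetical choices made for illustration. It copies the hidden-layer weights of several small source networks into separate portions of a larger target layer, and shows that multiplying a unit's weights and bias by a positive constant raises their magnitude without moving the unit's hyperplane.

```python
import numpy as np


def transfer_hidden_layer(source_layers, n_extra_units=0, scale=1.0, seed=0):
    """Assemble a target hidden layer from several source hidden layers.

    source_layers: list of (W, b) pairs, W of shape (units, n_inputs) and
    b of shape (units,). All sources are ASSUMED to share the same inputs.
    scale: positive factor applied to the transferred weights and biases
    (hypothetical knob for raising weight magnitude).
    n_extra_units: additional randomly initialised hidden units.
    """
    if scale <= 0:
        raise ValueError("scale must be positive to preserve hyperplanes")
    rng = np.random.default_rng(seed)
    n_inputs = source_layers[0][0].shape[1]

    # Each source network fills its own block of rows in the target layer.
    w_parts = [scale * w for w, _ in source_layers]
    b_parts = [scale * b for _, b in source_layers]

    # Remaining units start from small random values (range is an assumption).
    w_parts.append(rng.uniform(-0.1, 0.1, size=(n_extra_units, n_inputs)))
    b_parts.append(rng.uniform(-0.1, 0.1, size=n_extra_units))

    return np.vstack(w_parts), np.concatenate(b_parts)


# Two toy "source networks" with 2 and 1 hidden units over 3 inputs.
rng = np.random.default_rng(1)
w1, b1 = rng.normal(size=(2, 3)), rng.normal(size=2)
w2, b2 = rng.normal(size=(1, 3)), rng.normal(size=1)

w_t, b_t = transfer_hidden_layer([(w1, b1), (w2, b2)], n_extra_units=2, scale=5.0)

# Scaling by a positive constant leaves the decision side of every input
# unchanged: sign(w.x + b) is the same for the source and transferred units.
x = rng.normal(size=(100, 3))
source_side = np.sign(x @ np.vstack([w1, w2]).T + np.concatenate([b1, b2]))
target_side = np.sign(x @ w_t[:3].T + b_t[:3])
same_hyperplanes = bool(np.array_equal(source_side, target_side))
```

After this step the target network would be trained with back-propagation as usual; only the starting point differs from random initialisation.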

Abstract

A touted advantage of symbolic representations is the ease of transferring learned information from one intelligent agent to another. This paper investigates an analogous problem: how to use information from one neural network to help a second network learn a related task. Rather than translate such information into symbolic form (in which it may not be readily expressible), we investigate the direct transfer of information encoded as weights. Here, we focus on how transfer can be used to address the important problem of improving neural network learning speed. First we present an exploratory study of the somewhat surprising effects of pre-setting network weights on subsequent learning. Guided by hypotheses from this study, we sped up back-propagation learning for two speech recognition tasks. By transferring weights from smaller networks trained on subtasks, we achieved speedups of up to an order of magnitude compared with training starting with random weights, even taking into account the time to train the smaller networks. We include results on how transfer scales to a large phoneme recognition problem.
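The abstract's speedup figure counts the time spent training the smaller networks. A minimal sketch of that accounting follows; the function name `transfer_speedup` and the cost values are hypothetical placeholders chosen only to make the arithmetic visible, not results from the paper.

```python
def transfer_speedup(baseline_cost, source_costs, target_cost):
    """Speedup of transfer over training from random weights.

    baseline_cost: cost of training the target network from random weights.
    source_costs: costs of training each smaller source network.
    target_cost: cost of training the target network after transfer.
    All costs must use the same unit (for example, weight updates).
    """
    total_transfer_cost = sum(source_costs) + target_cost
    if total_transfer_cost <= 0:
        raise ValueError("total transfer cost must be positive")
    return baseline_cost / total_transfer_cost


# Placeholder numbers for illustration only (not taken from the paper).
speedup = transfer_speedup(baseline_cost=1000, source_costs=[30, 20], target_cost=50)
```

With these placeholder values the total cost of the transfer route is 100 units, so the ratio is 10, which is what an order-of-magnitude speedup would look like under this accounting.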


