1993

A 'Self-Referential' Weight Matrix

Jürgen Schmidhuber

AI summary

This paper introduces a self-referential recurrent network that uses its own input and output units to observe its errors and modify its own weight matrix. The result is the first 'introspective' neural net with explicit potential control over all of its own adaptive parameters.

Main Contributions

  • Introduces a 'self-referential' recurrent network that can 'speak' about its own weight matrix in terms of activations
  • Uses some of its input and output units for observing its own errors and for explicitly analyzing and modifying its own weight matrix
  • Introduces the first 'introspective' neural net with explicit potential control over all of its own adaptive parameters
  • Presents an algorithm whose computational complexity per time step is high, $O(n_{\mathrm{conn}} \log n_{\mathrm{conn}})$ where $n_{\mathrm{conn}}$ is the number of connections, but independent of the sequence length
  • Shows that such algorithms are possible at all

Abstract

Weight modifications in traditional neural nets are computed by hard-wired algorithms. Without exception, all previous weight change algorithms have many specific limitations. Is it (in principle) possible to overcome limitations of hard-wired algorithms by allowing neural nets to run and improve their own weight change algorithms? This paper constructively demonstrates that the answer (in principle) is 'yes'. I derive an initial gradient-based sequence learning algorithm for a 'self-referential' recurrent network that can 'speak' about its own weight matrix in terms of activations. It uses some of its input and output units for observing its own errors and for explicitly analyzing and modifying its own weight matrix, including those parts of the weight matrix responsible for analyzing and modifying the weight matrix. The result is the first 'introspective' neural net with explicit potential control over all of its own adaptive parameters. A disadvantage of the algorithm is its high computational complexity per time step which is independent of the sequence length and equals $O(n_{\mathrm{conn}} \log n_{\mathrm{conn}})$, where $n_{\mathrm{conn}}$ is the number of connections. Another disadvantage is the high number of local minima of the unusually complex error surface. The purpose of this paper, however, is not to come up with the most efficient 'introspective' or 'self-referential' weight change algorithm, but to show that such algorithms are possible at all.
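
The mechanism described in the abstract can be illustrated with a toy forward pass. The following is a minimal sketch, not the paper's algorithm: it omits the gradient-based learning rule entirely, and the unit layout, the names (`SelfModifyingNet`, `ANA`, `MOD`, `DELTA`, `VAL`), and the hard-thresholded binary addressing are all assumptions made for illustration. It only shows how a recurrent net might use some output units to address one of its own connections, an input unit to read that weight back, and another output unit to change a weight, with roughly log2(n_conn) address units per address.

```python
import math
import random

# Hypothetical layout (an assumption, not taken from the paper).
N_ADDR = 10              # address bits: 2**10 = 1024 connections
N_UNITS = 32             # fully connected net: 32 * 32 = 1024 weights
X, ERR, VAL = 0, 1, 2    # inputs: external input, observed error, weight read-back
N_INPUT = 3
ANA = N_INPUT            # first of N_ADDR outputs addressing a weight to read
MOD = ANA + N_ADDR       # first of N_ADDR outputs addressing a weight to modify
DELTA = MOD + N_ADDR     # output: amount added to the addressed weight
OUT = DELTA + 1          # ordinary task output; units after OUT are hidden


def decode(bits):
    """Threshold activations at 0 and read them as a binary connection index."""
    index = 0
    for b in bits:
        index = 2 * index + (1 if b > 0 else 0)
    return index


class SelfModifyingNet:
    """Toy recurrent net whose activations read and change its own weights."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.uniform(-0.5, 0.5) for _ in range(N_UNITS)]
                  for _ in range(N_UNITS)]
        self.act = [0.0] * N_UNITS
        self.last_error = 0.0   # error observed at the previous step
        self.read_value = 0.0   # weight value read at the previous step

    def step(self, x, target):
        # Feed the external input, the last error, and the last weight read-back.
        self.act[X], self.act[ERR], self.act[VAL] = x, self.last_error, self.read_value
        new = list(self.act)
        for i in range(N_INPUT, N_UNITS):
            new[i] = math.tanh(sum(w * a for w, a in zip(self.W[i], self.act)))
        self.act = new
        # The net "speaks" about its weights: decode the two addresses.
        ana = decode(new[ANA:ANA + N_ADDR])
        mod = decode(new[MOD:MOD + N_ADDR])
        self.read_value = self.W[ana // N_UNITS][ana % N_UNITS]
        # Self-modification: any weight can be targeted, including the
        # weights that drive the addressing and DELTA units themselves.
        self.W[mod // N_UNITS][mod % N_UNITS] += new[DELTA]
        self.last_error = target - new[OUT]
        return mod, new[DELTA]
```

Calling `SelfModifyingNet(seed=1).step(x=1.0, target=0.5)` returns the index of the connection the net chose to modify and the amount it added. With random weights the modifications are arbitrary; making them useful is the job of the learning algorithm the paper derives, which this sketch does not attempt.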
