Papperoni

1990

Probabilistic Interpretation of Feedforward Classification Network Outputs, With Relationship to Statistical Pattern Recognition

John S. Bridle

AI summary

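This paper shows how to interpret the outputs of feed-forward non-linear networks as class probabilities, using a softmax output stage and probability scoring. With radial units placed before the softmax, the network computes posteriors under Gaussian within-class distributions, and training with cross-class information can improve class discrimination.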

Main Contributions

Explains probability scoring as a training criterion that replaces squared-error minimisation.
Presents the normalised exponential (softmax) as a multi-input generalisation of the logistic non-linearity.
Places radial units (squared distance instead of dot product) immediately before the softmax output stage, so that the network computes posterior distributions over class labels under an assumption of Gaussian within-class distributions with equal covariance matrices.
Argues that training with cross-class information can discriminate between classes better than the usual within-class training, unless the within-class distribution assumptions are actually correct.
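The first two contributions can be illustrated with a minimal NumPy sketch. This is not code from the paper: the function names (`softmax`, `log_score`, `log_score_grad`) and the example activations are hypothetical, and the score is taken to be the negative log probability assigned to the correct class. It shows why the combination gives simple arithmetic: the gradient with respect to the pre-softmax values is just the output probabilities minus the one-hot target.

```python
import numpy as np

def softmax(v):
    """Normalised exponential: positive outputs that sum to one."""
    z = v - np.max(v)          # shift by the max for numerical stability
    e = np.exp(z)
    return e / np.sum(e)

def log_score(v, target):
    """Negative log probability given to the correct class index `target`."""
    return -np.log(softmax(v)[target])

def log_score_grad(v, target):
    """Gradient of log_score w.r.t. v: softmax(v) minus the one-hot target."""
    g = softmax(v)
    g[target] -= 1.0
    return g

# Hypothetical pre-softmax activations for a three-class problem.
v = np.array([2.0, 0.5, -1.0])
q = softmax(v)
print("probabilities:", q, "sum:", q.sum())
print("score for class 0:", log_score(v, 0))
print("gradient for class 0:", log_score_grad(v, 0))

# With two inputs, one fixed at zero, softmax reduces to the logistic function.
a = 0.7
print(softmax(np.array([a, 0.0]))[0], 1.0 / (1.0 + np.exp(-a)))
```

The last line prints the same value twice, which is the sense in which softmax generalises the logistic non-linearity to more than two alternatives.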

Abstract

We are concerned with feed-forward non-linear networks (multi-layer perceptrons, or MLPs) with multiple outputs. We wish to treat the outputs of the network as probabilities of alternatives (e.g. pattern classes), conditioned on the inputs. We look for appropriate output non-linearities and for appropriate criteria for adaptation of the parameters of the network (e.g. weights). We explain two modifications: probability scoring, which is an alternative to squared error minimisation, and a normalised exponential (softmax) multi-input generalisation of the logistic non-linearity. The two modifications together result in quite simple arithmetic, and hardware implementation is not difficult either. The use of radial units (squared distance instead of dot product) immediately before the softmax output stage produces a network which computes posterior distributions over class labels based on an assumption of Gaussian within-class distributions. However the training, which uses cross-class information, can result in better performance at class discrimination than the usual within-class training method, unless the within-class distribution assumptions are actually correct.
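The claim about radial units can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's implementation: every class is assumed to share one isotropic covariance `variance * I` (a simplification of the equal-covariance case), the log class priors are assumed to enter as biases, and all names and numbers are hypothetical. Under those assumptions, a softmax over negative scaled squared distances to the class means gives the same values as Bayes' rule applied to the Gaussian class densities.

```python
import numpy as np

def radial_softmax_posterior(x, means, log_priors, variance=1.0):
    """Radial units (squared distance to each class mean) followed by softmax."""
    sq_dist = np.sum((means - x) ** 2, axis=1)
    scores = log_priors - sq_dist / (2.0 * variance)
    scores = scores - scores.max()      # stabilise the exponentials
    e = np.exp(scores)
    return e / e.sum()

def bayes_posterior(x, means, priors, variance=1.0):
    """Posterior from explicit Gaussian likelihoods and Bayes' rule."""
    d = x.shape[0]
    norm = (2.0 * np.pi * variance) ** (-d / 2.0)
    lik = norm * np.exp(-np.sum((means - x) ** 2, axis=1) / (2.0 * variance))
    joint = priors * lik
    return joint / joint.sum()

# Hypothetical three-class problem in two dimensions.
means = np.array([[0.0, 0.0], [2.0, 1.0], [-1.0, 3.0]])
priors = np.array([0.5, 0.3, 0.2])
x = np.array([0.8, 1.1])

p_net = radial_softmax_posterior(x, means, np.log(priors), variance=0.5)
p_bayes = bayes_posterior(x, means, priors, variance=0.5)
print(p_net)
print(p_bayes)
```

The Gaussian normalising constant is the same for every class when the covariance is shared, so it cancels in the softmax. That cancellation is why the squared distance alone is enough in front of the output stage.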

References [12]

[1]Connectionist Learning Procedures

Geoffrey E. Hinton - 1987

11 papers in library cite

A very good overview of everything that was happening in 1987. A bit too long, but a good start nonetheless.

[2]NETtalk: A Parallel Network That Learns to Read Aloud

T. J. Sejnowski, C. R. Rosenberg - 1986

6 papers in library cite

[3]Maximum Mutual Information Estimation of Hidden Markov Model Parameters for Speech Recognition

L. Bahl, P. Brown, P. de Souza, R. Mercer - 1986

4 papers in library cite

[4]Boltzmann Machines: Constraint Satisfaction Networks That Learn

Geoffrey Hinton, T. Sejnowski, D. Ackley - 1984

3 papers in library cite

[5]Accelerated Learning in Layered Neural Networks

Sara A. Solla, E. Levin, M. Fleisher - 1988

2 papers in library cite

[6]Control Methods Used in a Study of the Vowels

G. E. Peterson, H. L. Barney - 1952

2 papers in library cite

[7]Supervised Learning of Probability Distributions by Neural Networks

E. B. Baum, F. Wilczek - 1988

2 papers in library cite

[8]Neural Net and Traditional Classifiers

W. M. Huang, R. P. Lippmann - 1988

1 paper in library cites

[9]Principles of Digital Communication and Coding

A. J. Viterbi - 1979

1 paper in library cites

[10]Probability Scores for Backpropagation Networks

L. Gillick - 1987

1 paper in library cites

[11]The Boltzmann Perceptron Network: A Soft Classifier

E. Yair, A. Gersho - 1988

1 paper in library cites

[12]The Theory of Stochastic Processes

D. R. Cox, H. D. Miller - 1965

1 paper in library cites

Read on June 23, 2025

Explains the use of softmax as the output non-linearity. A nice read, but I am not sure it is very relevant.