Towards an Explanation Facility for Feed-Forward Neural Networks

M. Egmont-Petersen

Dept. of Biophysics, University of Limburg, The Netherlands

E. Pelikan

Institut für Medizinische Statistik und Informationsverarbeitung, Abteilung Medizinische Informatik, Universitätsklinikum Benjamin Franklin, FU-Berlin

Keywords: Explanation, neural networks, feature assessment, information content, Bayesian analysis


Feed-forward neural networks have come into widespread use for classification tasks since their reintroduction by the PDP group in the mid-eighties. Based on a number of features, cases are classified into one of c classes. When they outperform other types of classifiers, neural networks can be useful for low-level information processing tasks such as segmentation of CT images or screening of blood samples in a clinical chemistry laboratory. Their usefulness for higher-order information processing, such as establishing a diagnosis, is, however, impeded by their black-box character [1]. Once a case has been classified, it is almost impossible to identify the subset of features that was crucial for assigning the particular class label. A few approaches to explanation have been suggested [2], but they rely on very strict assumptions about the distributions of the features and may generate misleading explanations.

We analyze the problem of explaining the classification of a case by a Bayesian classifier. First, we define a metric that measures whether a feature is relevant for the classification of a case: the relevant features of a case are those that could cause the classification to change if they were reobserved. Second, the relevant features are ranked according to their importance, which is measured by the conditional probability that the case would obtain a different class label if the particular feature were reobserved.
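The notions of relevance and importance described above can be illustrated with a small Monte Carlo sketch. It is not the authors' metric. It assumes a hypothetical two-class Gaussian model with independent features (all parameters in `PRIORS`, `MEANS` and `STDS` are made up), and it assumes that "reobserving" feature i means drawing a new value from the model's distribution of that feature given the remaining features. The function names `reobserve` and `importance` are illustrative only.

```python
import math
import random

# Hypothetical two-class Gaussian model with independent features.
# All numbers are illustrative assumptions, not taken from the paper.
PRIORS = [0.5, 0.5]
MEANS = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.5]]  # MEANS[k][i]: class k, feature i
STDS = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]


def log_gauss(v, mu, sigma):
    """Log density of a univariate normal distribution."""
    return -0.5 * ((v - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def posterior(x, skip=None):
    """Class posteriors for case x, optionally ignoring feature `skip`."""
    logs = []
    for k, prior in enumerate(PRIORS):
        total = math.log(prior)
        for i, v in enumerate(x):
            if i != skip:
                total += log_gauss(v, MEANS[k][i], STDS[k][i])
        logs.append(total)
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in logs]
    norm = sum(weights)
    return [w / norm for w in weights]


def classify(x):
    """Assign the class with the largest posterior probability."""
    post = posterior(x)
    return max(range(len(post)), key=lambda k: post[k])


def reobserve(x, i, rng):
    """Assumed re-observation model: redraw feature i given the other features."""
    post = posterior(x, skip=i)
    k = rng.choices(range(len(post)), weights=post)[0]
    redrawn = list(x)
    redrawn[i] = rng.gauss(MEANS[k][i], STDS[k][i])
    return redrawn


def importance(x, i, n_draws=5000, seed=0):
    """Estimated probability that the class label changes when feature i is redrawn."""
    rng = random.Random(seed)
    original = classify(x)
    changed = sum(classify(reobserve(x, i, rng)) != original for _ in range(n_draws))
    return changed / n_draws


case = [1.2, 0.3, 0.1]
scores = {i: importance(case, i) for i in range(len(case))}
relevant = [i for i, s in scores.items() if s > 0.0]  # features that can flip the label
ranking = sorted(relevant, key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```

In this toy model feature 1 has the same distribution in both classes, so redrawing it can never change the label and it is reported as not relevant. The remaining features are ranked by their estimated flip probability.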

We show how the Bayesian definitions of relevance and importance apply to a neural network. We define a metric to assess which features are relevant for the classification of a case by a neural network, and a metric to rank those features according to their importance for the classification. Both metrics use a numerical approach, originally developed for other purposes, to identify the conditional class boundaries.
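The text does not name the numerical approach used to locate the conditional class boundaries. As a stand-in, the following sketch varies a single feature of a case while holding the others fixed, scans a grid for label changes, and refines each change by bisection. The tiny 2-2-1 network and its weights are hypothetical, as are the function names and the search interval.

```python
import math

# Hypothetical, untrained weights of a tiny 2-2-1 feed-forward network.
W_HIDDEN = [[1.5, -1.0], [-1.0, 2.0]]  # W_HIDDEN[j][i]: hidden unit j, input i
B_HIDDEN = [0.0, -0.5]
W_OUT = [2.0, -1.5]
B_OUT = 0.1


def net_output(x):
    """Forward pass: tanh hidden layer followed by a sigmoid output unit."""
    hidden = [
        math.tanh(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(W_HIDDEN, B_HIDDEN)
    ]
    z = sum(w * h for w, h in zip(W_OUT, hidden)) + B_OUT
    return 1.0 / (1.0 + math.exp(-z))


def label(x):
    """Class 1 if the output exceeds 0.5, otherwise class 0."""
    return 1 if net_output(x) > 0.5 else 0


def conditional_boundaries(x, i, lo, hi, steps=200, tol=1e-9):
    """Values of feature i in [lo, hi] at which the label of x changes,
    with all other features held at their observed values."""

    def label_at(value):
        probe = list(x)
        probe[i] = value
        return label(probe)

    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    boundaries = []
    for a, b in zip(grid, grid[1:]):
        label_a = label_at(a)
        if label_a == label_at(b):
            continue  # no label change detected inside this grid cell
        while b - a > tol:  # bisection on the label change
            mid = 0.5 * (a + b)
            if label_at(mid) == label_a:
                a = mid
            else:
                b = mid
        boundaries.append(0.5 * (a + b))
    return boundaries


case = [0.5, 0.2]
print(conditional_boundaries(case, 0, -3.0, 3.0))
```

Under these assumptions, a feature with no boundary inside its plausible range could not change the class label on its own, which matches the intuition behind the relevance metric. A grid scan can miss two boundaries that fall inside the same cell, so the step count is a trade-off between cost and resolution.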

[1] A. Hart, J. Wyatt. "Connectionist models in medicine: an investigation of their potential", In: AIME-89, Springer Verlag, Heidelberg, pp. 115-124, 1989.

[2] R.F. Harrison, S.J. Marshall, R.L. Kennedy. "A connectionist aid to the early diagnosis of myocardial infarction", In: AIME-91, Springer Verlag, Heidelberg, pp. 119-128, 1991.