Abstract

Generalization is at the core of evaluation: we estimate the performance of a model on data we have never seen but expect to encounter later. Our current evaluation procedures assume that the data already seen is a random sample of the domain from which all future data will be drawn. Unfortunately, in practical situations this is rarely the case. Changes in the underlying probabilities will occur, and we must evaluate how robust our models are to such differences. This paper takes the position that models should be robust in two senses. First, small changes in the joint probabilities should not cause large changes in performance. Second, when the dependencies between the attributes and the class are constant and only the marginals change, simple adjustments should be sufficient to restore a model's performance. This paper is intended to generate debate on how measures of robustness might become part of our normal evaluation procedures. Certainly, clear demonstrations of robustness would improve our confidence in our models' practical merits.
---
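As an illustration of the second sense of robustness, one common reading of "only the marginals change" is a shift in the class priors while the class-conditional distribution P(x | y) stays fixed. The sketch below shows the standard posterior reweighting adjustment under that assumption; the function name and the prior values are hypothetical and not taken from the paper.

```python
import numpy as np

def adjust_posteriors(posteriors, old_priors, new_priors):
    """Reweight classifier posteriors when only the class priors change.

    Assumes P(x | y) is unchanged between training and deployment, so
    p_new(y | x) is proportional to p_old(y | x) * p_new(y) / p_old(y).
    `posteriors` is an (n_samples, n_classes) array from the trained model;
    the prior vectors here are illustrative estimates, not values from the paper.
    """
    weights = np.asarray(new_priors) / np.asarray(old_priors)
    adjusted = posteriors * weights                          # reweight each class column
    return adjusted / adjusted.sum(axis=1, keepdims=True)    # renormalize each row

# Example: a model trained where the classes had priors (0.9, 0.1),
# deployed where the minority class has become three times as common.
p = np.array([[0.80, 0.20],
              [0.40, 0.60]])
print(adjust_posteriors(p, old_priors=[0.9, 0.1], new_priors=[0.7, 0.3]))
```

The point of the sketch is only that such an adjustment is "simple": it touches the model's outputs, not its learned dependencies, which is exactly the property the second robustness criterion asks for.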