
Model complexity control for regression using VC generalization bounds



Abstract:

It is well known that for a given sample size there exists a model of optimal complexity corresponding to the smallest prediction (generalization) error. Hence, any method for learning from finite samples needs some provision for complexity control. Existing implementations of complexity control include penalization (or regularization), weight decay (in neural networks), and various greedy procedures (aka constructive, growing, or pruning methods). There are numerous proposals for determining optimal model complexity (aka model selection) based on various (asymptotic) analytic estimates of the prediction risk and on resampling approaches. Nonasymptotic bounds on the prediction risk based on Vapnik-Chervonenkis (VC) theory have been proposed by Vapnik. This paper describes the application of VC-bounds to regression problems with the usual squared loss. An empirical study is performed for settings where the VC-bounds can be rigorously applied, i.e., linear models and penalized linear models, where the VC-dimension can be accurately estimated and the empirical risk can be reliably minimized. Empirical comparisons between model selection using VC-bounds and classical methods are performed for various noise levels, sample sizes, target functions, and types of approximating functions. Our results demonstrate the advantages of VC-based complexity control with finite samples.
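The model selection procedure the abstract describes can be sketched in a few lines. The following is a minimal illustration, not the paper's exact implementation: it assumes the practical form of Vapnik's penalization factor for regression with squared loss, r(p, n) = 1 / (1 - sqrt(p - p ln p + ln(n)/(2n)))+ with p = h/n, and takes the VC-dimension h of a linear model to be its number of free parameters. The target function, noise level, and candidate polynomial models are illustrative choices.

```python
import numpy as np

def vc_penalty(n, h):
    """Vapnik's penalization factor for squared-loss regression (practical
    form): r = 1 / (1 - sqrt(p - p*ln(p) + ln(n)/(2n)))_+, with p = h/n.
    Returns inf when the bound is vacuous (denominator <= 0)."""
    p = h / n
    arg = 1.0 - np.sqrt(p - p * np.log(p) + np.log(n) / (2.0 * n))
    return np.inf if arg <= 0 else 1.0 / arg

# Illustrative setup: select polynomial degree for a noisy 1-D target.
rng = np.random.default_rng(0)
n = 30
x = rng.uniform(-1.0, 1.0, n)
y = np.sin(np.pi * x) + 0.2 * rng.standard_normal(n)

best_deg, best_bound = None, np.inf
for deg in range(10):
    h = deg + 1  # VC-dimension of a linear model = number of free parameters
    coef = np.polyfit(x, y, deg)                       # minimize empirical risk
    emp_risk = np.mean((np.polyval(coef, x) - y) ** 2)
    bound = emp_risk * vc_penalty(n, h)                # VC-bound on prediction risk
    if bound < best_bound:
        best_deg, best_bound = deg, bound
```

The selected degree is the one minimizing the bound, trading empirical fit against the penalty's growth in h/n; classical criteria such as AIC or GCV would simply swap in a different penalization factor at the `bound = ...` line.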
Published in: IEEE Transactions on Neural Networks ( Volume: 10, Issue: 5, September 1999)
Page(s): 1075 - 1089
Date of Publication: 30 September 1999

PubMed ID: 18252610
References:
1. V. Vapnik, Estimation of Dependences Based on Empirical Data, 1982.
2. V. Vapnik, The Nature of Statistical Learning Theory, 1995.
3. "An overview of predictive learning and function approximation", From Statistics to Neural Networks: Theory and Pattern Recognition Applications, vol. 136, 1994.
4. B. D. Ripley, Pattern Recognition and Neural Networks, 1996.
5. V. Cherkassky and F. Mulier, Learning from Data: Concepts, Theory, and Methods, 1998.
6. W. Hardle, P. Hall and J. S. Marron, "How far are automatically chosen regression smoothing parameters from their optimum?", JASA, vol. 83, pp. 86-95, 1988.
7. H. Akaike, "Statistical predictor identification", Ann. Inst. Statist. Math., vol. 22, pp. 203-217, 1970.
8. G. Schwarz, "Estimating the dimension of a model", Ann. Statist., vol. 6, pp. 461-464, 1978.
9. P. Craven and G. Wahba, "Smoothing noisy data with spline functions", Numerische Math., vol. 31, pp. 377-403, 1979.
10. R. Shibata, "An optimal selection of regression variables", Biometrika, vol. 68, pp. 45-54, 1981.
11. A. J. Miller, Subset Selection in Regression, 1990.
12. J. Shao, "Linear model selection by cross-validation", JASA, vol. 88, no. 422, pp. 486-494, 1993.
13. J. Rissanen, Stochastic Complexity in Statistical Inquiry, 1989.
14. D. Foster and E. George, "The risk inflation criterion for multiple regression", Ann. Statist., vol. 22, pp. 1947-1975, 1994.
15. J. E. Moody, "Note on generalization, regularization and architecture selection in nonlinear learning systems", Proc. 1st IEEE-SP Wkshp. Neural Networks Signal Processing, pp. 1-10, 1991.
16. N. Murata, S. Yoshizawa and S. Amari, "Network information criterion—determining the number of hidden units for an artificial neural network model", IEEE Trans. Neural Networks, vol. 5, pp. 865-872, 1994.
17. V. Vapnik, Statistical Learning Theory, 1998.
18. T. Hastie and R. Tibshirani, Generalized Additive Models, 1990.
19. C. Bishop, Neural Networks for Pattern Recognition, 1995.
20. V. Vapnik, E. Levin and Y. Le Cun, "Measuring the VC-dimension of a learning machine", Neural Comput., vol. 6, pp. 851-876, 1994.
21. V. Cherkassky and X. Shao, "Model selection for wavelet-based signal estimation", Proc. IJCNN-98, 1998.
22. D. Schuurmans, "A new metric-based approach to model selection", Proc. AAAI-97, 1997.
23. V. Cherkassky, F. Mulier and V. Vapnik, "Comparison of VC-method with classical methods for model selection", Proc. WCNN-96, 1996.
