1. Introduction
Resource planning is a key issue in any production environment. For a software development company, the relevant resources include computing power and personnel. In recent years, computing power has become a subordinate cost factor for such companies: because it doubles approximately every 18 months, it now costs only a fraction of what it did in the late 1960s. Personnel costs, however, remain an important item in the budget of software development companies, so proper planning of personnel effort is essential. Due to the intangible nature of the product “software,” these companies often have difficulty estimating the effort needed to complete a software project [1]. The topic has attracted strong academic interest aimed at helping software development companies overcome these estimation difficulties [2]. In this field of research, the effort required to develop a new project is estimated from historical data on previous projects. Management can use these estimates to improve personnel planning, to make more accurate tendering bids, and to evaluate risk factors [3]. Recently, a number of studies evaluating different techniques have been published. Their results are not unanimous and often depend strongly on the technique and data set used. In this paper, an overview of the existing literature is presented. Furthermore, 13 techniques, representing different kinds of models, are investigated.
This selection includes tree/rule-based models (M5 and CART), linear models (ordinary least squares regression with and without various transformations, ridge regression (RiR), and robust regression (RoR)), nonlinear models (MARS, least squares support vector machines, multilayered perceptron neural networks (NN), and radial basis function (RBF) networks), and a lazy learning-based approach that does not explicitly construct a prediction model but instead searches for the most similar past project. Each technique is applied to nine data sets within the domain of software effort estimation. From a comprehensibility point of view, a more concise model (i.e., a model with fewer inputs) is preferred. Therefore, the impact of a generic backward input selection approach is assessed.
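To make the idea of backward input selection concrete, the following minimal sketch starts from all inputs and repeatedly removes the one whose removal hurts hold-out accuracy the least, stopping as soon as any further removal would increase the error. It is an illustration under stated assumptions, not the procedure used in this paper: the choice of ordinary least squares as the wrapped model, the mean absolute error on a single hold-out split as the criterion, the synthetic data, and the function names `holdout_mae` and `backward_selection` are all hypothetical.

```python
import numpy as np


def holdout_mae(X_train, y_train, X_test, y_test):
    """Fit OLS with an intercept on the training split and return
    the mean absolute error on the hold-out split."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean(np.abs(A_test @ coef - y_test)))


def backward_selection(X_train, y_train, X_test, y_test):
    """Greedy backward elimination (assumed criterion: hold-out MAE).
    Returns the indices of the retained inputs and their error."""
    selected = list(range(X_train.shape[1]))
    best = holdout_mae(X_train, y_train, X_test, y_test)
    while len(selected) > 1:
        # Evaluate the model obtained by dropping each remaining input.
        candidates = []
        for j in selected:
            remaining = [k for k in selected if k != j]
            err = holdout_mae(X_train[:, remaining], y_train,
                              X_test[:, remaining], y_test)
            candidates.append((err, j))
        err, drop = min(candidates)
        if err > best:
            break  # every possible removal makes the model worse
        selected.remove(drop)
        best = err
    return selected, best


# Synthetic illustration (assumed data): only inputs 0 and 2 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.1, size=120)
selected, err = backward_selection(X[:80], y[:80], X[80:], y[80:])
print("retained inputs:", selected, "hold-out MAE:", err)
```

Ties are resolved in favor of removal, which reflects the stated preference for more concise models. Any of the other techniques could in principle be substituted for the least squares fit inside `holdout_mae`, since the selection loop only needs an error score for each candidate subset.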