Recently I have been looking into ways to improve model fitting on a wide range of data sets from insurance, pharmaceuticals and finance. As part of that, I came across generalized additive models (GAMs), a nice class of models that introductory statistics courses often don’t cover in much depth. What follows is a very brief introduction to what GAMs are and why a modeller might find them useful.

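The generalized additive model extends the generalized linear model (GLM). As in a GLM, we specify a distribution for the response (normal, binomial, Poisson), a link function g, and predictors x1, x2, …, but each linear term is replaced by a smooth function of the predictor:

g(E(y)) = b0 + s1(x1) + s2(x2) + …

Here b0 is an intercept. There is no separate error term on the right-hand side: the noise comes from the chosen distribution of y around E(y).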
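To make the formula concrete, here is a minimal Python sketch for a binomial response with a logit link. The two smooth functions s1 and s2 and the intercept value are made up purely for illustration; in a real GAM they would be estimated from data.

```python
import math

def s1(x1):
    # hypothetical smooth effect of the first predictor (not fitted to data)
    return math.sin(x1)

def s2(x2):
    # hypothetical smooth effect of the second predictor (not fitted to data)
    return 0.5 * x2 ** 2 - 1.0

def inverse_logit(eta):
    # inverse of the logit link g(mu) = log(mu / (1 - mu))
    return 1.0 / (1.0 + math.exp(-eta))

def expected_response(x1, x2, intercept=-0.2):
    # additive predictor on the link scale: intercept + s1(x1) + s2(x2)
    eta = intercept + s1(x1) + s2(x2)
    # map back from the link scale to a probability
    return inverse_logit(eta)

p = expected_response(0.0, 0.0)
print(p)
```

The point to notice is that the predictors only ever combine by addition on the link scale; all of the nonlinearity lives inside the individual smooth functions.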

The functions s can be fit using nonparametric smoothers, so they offer much more flexibility than the straight lines of a linear model. Typically we would use smoothing splines, which penalize the wiggliness of the fitted function so that it does not chase the noise. In R, the amount of smoothing can be chosen by cross-validation, which estimates the prediction error on points left out of the fit rather than on the in-sample points.
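The idea of penalizing wiggliness and picking the penalty by cross-validation can be sketched in a few lines of plain Python. This is not a true smoothing spline and not what R does internally: it swaps in a simpler discrete smoother (squared second differences as the penalty) and assumes equally spaced x values. The simulated data, the candidate penalties and all function names are assumptions made for the illustration.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def smooth(y, lam, weights=None):
    """Minimise sum w_i (y_i - z_i)^2 + lam * sum (second differences of z)^2."""
    n = len(y)
    w = weights or [1.0] * n
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = w[i]
    # add lam * D'D, where each row of D holds the stencil (1, -2, 1)
    stencil = (1.0, -2.0, 1.0)
    for k in range(n - 2):
        for i in range(3):
            for j in range(3):
                a[k + i][k + j] += lam * stencil[i] * stencil[j]
    return solve(a, [w[i] * y[i] for i in range(n)])

def cv_error(y, lam, folds=5):
    """Mean squared error on held-out points; each point is held out once."""
    n = len(y)
    total = 0.0
    for f in range(folds):
        # weight 0 removes a point from the fit; the penalty fills it in
        w = [0.0 if i % folds == f else 1.0 for i in range(n)]
        z = smooth(y, lam, w)
        total += sum((y[i] - z[i]) ** 2 for i in range(f, n, folds))
    return total / n

# simulated data: a sine curve plus Gaussian noise (illustrative only)
random.seed(1)
n = 40
x = [i / (n - 1) for i in range(n)]
y = [math.sin(2 * math.pi * xi) + random.gauss(0, 0.3) for xi in x]

candidates = [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
scores = {lam: cv_error(y, lam) for lam in candidates}
best_lam = min(scores, key=scores.get)
fitted = smooth(y, best_lam)
print(best_lam)
```

A small penalty lets the fit follow every wobble in the data, a large one forces it towards a straight line, and the held-out error is what arbitrates between the two.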

The rationale for using GAMs rather than other models is that they sit between fully specified linear models and completely unspecified nonparametric models such as kernel regression or k-nearest neighbours.

The weakness of a GLM is that the real regression function is hardly ever linear, so even with a lot of data its approximation bias never goes away. The benefit of a GLM is that its estimation error shrinks very fast with more data, at rate O(n^-1), so it works in high dimensions (i.e. lots of predictor variables).

At the other extreme are totally nonparametric models such as kernel regression and k-nearest neighbours. They can fit any smooth function, so the approximation bias goes to zero, but with p predictors they converge to the true answer only at rate O(n^(-4/(p+4))), which becomes very slow as p grows because of the sheer number of possibilities for the approximating function. So these don’t do well with high dimensional data.

That’s where the case for additive models comes in. The assumption of additivity means the bias can sometimes be nonzero, but they will generally have less bias than a linear model if the data is not really linear in the predictors, and they converge nearly as fast as linear models, at rate O(n^(-4/5)).
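A quick bit of arithmetic shows how different these three rates are. The sketch below just evaluates n^-1, n^(-4/5) and n^(-4/(p+4)) for a sample size and a few values of p; the choice of n = 10,000 and of the p values is arbitrary, and the rates ignore constant factors, so only the orders of magnitude are meaningful.

```python
def rate_linear(n):
    # parametric rate quoted for linear models: n^-1
    return n ** -1.0

def rate_additive(n):
    # rate quoted for additive models: n^(-4/5), with no dependence on p
    return n ** (-4.0 / 5.0)

def rate_nonparametric(n, p):
    # rate quoted for fully nonparametric smoothing with p predictors
    return n ** (-4.0 / (p + 4))

n = 10_000  # arbitrary sample size for illustration
for p in (1, 2, 10, 100):
    print(p, rate_linear(n), rate_additive(n), rate_nonparametric(n, p))
```

With one predictor the additive and nonparametric rates coincide. With ten predictors the nonparametric rate at n = 10,000 is roughly 0.07 against roughly 0.0006 for the additive model, which is the curse of dimensionality in a single number.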


Generalized Additive Models: An Introduction with R, SN Wood

Advanced Data Analysis from an Elementary Point of View, CR Shalizi

Example on binary data with binomial regression.

Figure. Contribution of nonparametric function s(x) to link of binary outcome for simulated data set.

