Software and datasets to support 'Modern Applied Statistics with S',
fourth edition, by W. N. Venables and B. D. Ripley.
Springer, 2002, ISBN 0-387-95457-0.

This file documents software changes since the third edition.

- eqscplot has new arguments ratio and uin.
- stepAIC will not drop strata terms in coxph or survreg models.
- profile.glm will report inadequate supplied glm fits, not just fail.
- new method confint.lm.
- fractions/rational allow missing values.
- mvrnorm has an 'empirical' argument.
- predict.lda and predict.qda try harder to avoid exponential underflow.
- new function fitdistr for ML estimation of univariate distributions.
- new function glmmPQL to use lme to fit GLMMs by PQL
- truehist allows rule for nbins to be specified as a character string.
- parcoord function.
- new datasets bacteria, epil, nlschools, SP500
- polr allows control argument for optim, reports lack of convergence.
- stepAIC works again if formula has an offset (R had changed).
- biplot.correspondence now shows the origin as a cross.
- polr was not preserving contrasts to put in the fit object.
- vcov methods for lme, gls, coxph and survReg.
- Added 'tol' argument to isoMDS.
- stepAIC now allows 'direction=both' starting from a full model.
- glm.nb allows R-style 'start' argument.
- truehist passes ... on to both plot.default() and rect().
- isoMDS now uses the C interface to optim.
- addterm, dropterm, stepAIC now work with lme and gls fits.
- huber checks for MAD equal to zero.
- glmmPQL now loads nlme if not already loaded.
- glmmPQL handles list 'random' arguments (7.0-11).
- The MASS datasets no longer require data(foo) to load them. (7.0-11)
- mvrnorm uses eigen(EISPACK=TRUE) for back-compatibility (7.0-11, R 1.7.0)
- print.summary.polr could lose dimnames for 1 coefficient.
- remove heart as survival in R now has it.
- confint.{lm,glm} didn't handle specifying parm in all cases.
- confint and confint.lm have been migrated to base in R.
- addterm.default, dropterm.default and stepAIC work better inside
  functions.
- glm.nb now sets AIC in the object, and has a logLik() method.
- truehist now accepts a 'ylab' argument.
- negative.binomial and neg.bin no longer generate objects with
  package:MASS in their environment.
- stepAIC now drops (if allowed) 0-df terms sequentially from the right.
- lda(CV=TRUE) now works for rank-deficient fits.
- predict methods for lda, polr now check newdata types.
- model.frame.lda/polr now look for the environment of the original
  formula.
- polr has a new `model' argument defaulting to TRUE.
- fitdistr supports the trivial case of a Normal distribution.
- sammon and isoMDS now allow missing values in the dissimilarity
  matrix, and isoMDS allows Minkowski distances in the configuration
  space.
- cov.trob works better if wts are supplied, and may converge a little
  faster in any case.
- The ch11.R script now uses mclust not mclust1998.
- The default xlab for boxcox() is now greek lambda.
- glmmPQL now handles offset terms.
- add predict.rlm method to correct predict.lm in the case se.fit=TRUE.
- weighted rlm fits are handled better, and default to "inv.var".
- logtrans works without specifying 'data'.
- predict() method for glmmPQL.
- polr() has an option for probit or proportional hazard fits.
- neg.bin() and negative.binomial() had an error in the aic() formula.
- The ch05.R script now includes the code for Figure 5.8.
- Datasets austres, fdeaths, lh, mdeaths, nottem and rock are now
  visible in the 'datasets' package of R 2.0.0 and so have been removed
  here.
- Script ch07.R now gives details using the gam() function in package
  gam as well as that in package mgcv.
- rlm's fitted component is now always unweighted.
- theta.{md,ml,mm} now have one help file with examples.
- polr() has a new method "cauchit" suggested by Roger Koenker.
  (Requires R >= 2.1.0)
- polr() now works with transformed intercepts, and usually converges
  better (contributed by David Firth).
- polr() handles a rank-deficient model matrix.
- polr() now returns the method used, and uses it for predictions.
- anova() method for polr (contributed by John Fox).
- predict.glmmPQL was not using the na.action in the object as intended.
- The default methods for addterm and dropterm and anova.polr now check
  for changes in the number of cases in use caused e.g. by
  na.action=na.omit.
- Added vcov() method for rlm fits.
- eqscplot() accepts reversed values for xlim and ylim.
- Script ch10.R uses se.contrast to calculate se's missing from
  model.tables.
- profile() and confint() methods for polr().
- glm.convert() was not setting the `offset' component that R's glm
  objects have.
- sammon() now checks for duplicates in the initial configuration.
- isoMDS() and sammon() work around dropping of names.dist in 2.1.0
- lda() now gives an explicit error message if all group means are the
  same.
- fitdistr() now has a logLik() method, chooses the optim() method if
  not supplied, handles the log-normal by closed-form and no longer
  attempts to handle the uniform.
- glm.nb() now accepts 'mustart'.
- glm.nb() now supports weights: they used to be ignored when estimating
  theta.
- fitdistr() now supports geometric and Poisson distributions, and uses
  closed-form results for the exponential.
- lm.ridge, lqs and rlm allow offset() terms.
- the 'prior' argument of predict.qda is now operational.
- script ch12.R now has b1() adapted for R's contour().
- anova.polr() quoted model dfs, not residual dfs.
- stepAIC() applied to a polr fit now gets the correct rdf's in the
  anova table.
- lm.gls() now returns fitted values and residuals on the original
  coordinates (not the uncorrelated ones).
- parcoord() now allows missing values and has a new argument
  'var.label' to label the variable axes. (Contributed by Fabian
  Scheipl.)
- rlm() has a 'lqs.control' argument passed to lqs() where used for
  initialization.
- rlm() could fail with some psi functions (e.g.
  psi.hampel) if 'init' was given as a numeric vector.
- rlm() handles weighted fits slightly differently, in particular trying
  to give the same scale estimate if wt.method="case" as if replicating
  the cases.
- confint.nls copes with plinear models in R (now profile.nls does).
- The wrappers lmsreg() etc have been adapted to work in the MASS
  namespace.
- qda() accepts formulae containing backquoted non-syntactic names.
- polr() gives an explicit error message if 'start' is misspecified.
- glmmPQL() evaluates the formulae for 'fixed' and 'random', which may
  help if they are given as variables and not values.
- There are anova() and logLik() methods for class "glmmPQL" to stop
  misuse.
- profile.polr() now works for a single-coefficient model.
- The print and print.summary methods for polr and rlm make use of
  naprint() to print a message e.g. about deleted observations.
- Class "ridgelm" now has a coef() method, and works for n < p.
- lda() and qda() now check explicitly for non-finite 'x' values.
- ch06.R has been updated for multcomp >= 0.991-1
- profile.glm is more likely to find the model frame in complicated
  scopes.
- message() is used for most messages.
- truehist() checks more thoroughly for erroneous inputs.
- polr(model=TRUE) works again.
- add logLik() method for polr.
- the summary() methods for classes "negbin" and "rlm" now default to
  correlation = FALSE.
- there is a vcov() method for class "negbin": unlike the "glm" method
  this defaults to dispersion = 1.
- coding for 'sex' in ?Melanoma has been corrected.
- the example for gamma.shape has a better starting point and so
  converges
- avoid abbreviation of survreg(dist=) in example(gehan)
- profile() and confint() methods for "glm" objects now handle
  rank-deficient fits.
- profile.glm() produced an output in a format plot.profile could not
  read for single-variable fits. Also for confint() on intercept-only
  fits.
- The print() methods for fitdistr() and lm.ridge() now return
  invisibly.
- vcov() and profile() methods for polr() used starting values in the
  external not internal parametrization, which could slow convergence.
- glm.nb() called theta.ml() incorrectly when weights were supplied
  which did not sum to n.
- removed unused argument 'nseg' to plot.profile.
- 'alpha' in the "glm" and "polr" methods for profile() is now
  interpreted as two-tailed univariate for consistency with other
  profile methods.
- 'mammals': corrected typos in names, some thanks to Arni Magnusson.
- profile.glm() now works for binomial glm specified with a matrix
  response and a completely zero row.
- there is a "negbin" method for simulate()
- the use of package mclust has been removed from the ch11.R script
  because of the change of licence conditions for that package.
- change ch13.R script for change in package 'survival' 2.35-x.
- glmmPQL looks up variables in its 'correlation' argument (if a
  formula) in the usual scope (wish of Ben Bolker: such arguments are
  unsupported).
- added a simulate() method for unweighted polr() fits.
- kde2d() allows a length-2 argument 'n'.
- the default for truehist(col=) is now set to a colour, not a colour
  number.
- the returned fitted values and (undocumented) linear predictor for
  polr() did not take any offset into account (reported by Ioannis
  Kosmides).
- the vcov() method for polr() now returns on the zeta scale (suggested
  by Achim Zeileis).
- fitdistr() gains a vcov() method (suggested by Achim Zeileis).
- ch06.R has R alternatives to fac.design.
- ch11.R has R alternatives for ggobi and factor rotation.
- hubers() copes in extreme cases when middle 50% of data is constant.
- tests/ now includes dataset for polr.R, so checking depends only on
  base packages and lattice.
- The "glm" method for profile() failed when given a binomial model with
  a two-column response.
- fitdistr() works harder to rescale the problem when fitting a gamma.
- cov.trob() handles zero weights without giving a warning (reported by
  John Fox).
- boxcox() works better when 'y' is very badly scaled, e.g. around 1e-16
  (patch by Martin Maechler).
- mvrnorm() no longer defaults to the deprecated EISPACK=TRUE (and hence
  changes the results). It gains an argument 'EISPACK' for
  back-compatibility.
- the "polr" method for profile() could lose dimensions in its return
  object (reported by Joris Meys)
- kde2d() throws an error if given zero bandwidths or constant data.
- ldahist(sep = TRUE) was missing a dev.flush().
- addterm.glm() mis-calculated F statistics for df > 1.
- anova.loglm() needed revision for changes in R.
- The addterm() default method allows update() to fail.
- polr(method = "cloglog") implemented what is more commonly called the
  log-log link. Now both are provided.
- lqs() fits with intercepts and contrasts lost the latter from the
  return value.
- addterm() and dropterm() now handle empty scopes transparently.