biglm.big.matrix, bigglm.big.matrix {bigmemory}    R Documentation

Use Thomas Lumley's “biglm” package with a “big.matrix”

Description

These functions wrap biglm and bigglm from Thomas Lumley's biglm package, allowing them to be used with data stored in big.matrix objects.

Usage

biglm.big.matrix(formula, data, chunksize=NULL, ..., fc=NULL, 
  getNextChunkFunc=NULL)
bigglm.big.matrix(formula, data, chunksize=NULL, ..., fc=NULL,
  getNextChunkFunc=NULL)

Arguments

formula a model formula.
data a big.matrix or data.frame object.
chunksize an integer giving the maximum number of rows to process in each chunk; if not given, a suitable default is supplied.
fc a character vector naming the variables that are factors.
getNextChunkFunc a function that generates the indices for the next chunk; if not given, a suitable default is supplied.
... other parameters supported by biglm and bigglm.
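
The idea of processing data in chunks of row indices can be illustrated in base R. This is only a sketch: make_chunks is a hypothetical helper, not part of bigmemory or biglm, and it does not reproduce the signature that getNextChunkFunc is expected to have, which is not stated here.

```r
# Hypothetical helper (not part of bigmemory or biglm): split the row
# indices 1..n into consecutive chunks of at most `chunksize` rows.
make_chunks <- function(n, chunksize) {
  starts <- seq(1, n, by = chunksize)
  lapply(starts, function(s) s:min(s + chunksize - 1, n))
}

chunks <- make_chunks(n = 150, chunksize = 40)
length(chunks)          # number of chunks
sapply(chunks, length)  # rows in each chunk; the last may be shorter
```

Each element of chunks is a vector of row indices, so a model-fitting loop only needs to hold one such block of rows in memory at a time.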

Details

See the biglm package documentation for more information. If chunksize is not given, it defaults to
floor(nrow(data)/ncol(data)^2).
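
As a quick check of what the stated default works out to, the formula can be evaluated by hand. The helper below is hypothetical and simply mirrors the expression above; the actual default is computed inside the wrapper functions.

```r
# Hypothetical helper mirroring the documented default formula.
default_chunksize <- function(n_rows, n_cols) {
  floor(n_rows / n_cols^2)
}

default_chunksize(150, 5)    # iris-sized matrix: floor(150 / 25) = 6
default_chunksize(1e6, 10)   # floor(1e6 / 100) = 10000
```

Taken literally, the formula gives 0 when a matrix has fewer rows than ncol(data)^2, so supplying chunksize explicitly may be prudent for small or wide data.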

These functions will be removed from bigmemory and moved to a new package (biganalytics or bigmemoryanalytics) in the fall of 2009.

Value

An object of class biglm.

Author(s)

Michael J. Kane

References

Algorithm AS274, Applied Statistics (1992), Vol. 41, No. 2.

Thomas Lumley (2005). biglm: bounded memory linear and generalized linear models. R package version 0.7.

See Also

big.matrix

Examples

# This example is quite silly, using the iris data.  But it shows that
# our wrapper to Lumley's biglm() function produces the same answer as
# the plain old lm() function.

## Not run: 
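library(bigmemory)  # big.matrix support and these wrappers
library(biglm)      # Thomas Lumley's biglm() and bigglm()

# unlist() turns the Species factor into its integer codes (1, 2, 3)
x <- matrix(unlist(iris), ncol=5)
colnames(x) <- names(iris)
x <- as.big.matrix(x)
head(x)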

silly.biglm <- biglm.big.matrix(Sepal.Length ~ Sepal.Width + Species, data=x, fc="Species")
summary(silly.biglm)

y <- data.frame(x[,])
y$Species <- as.factor(y$Species)
head(y)

silly.lm <- lm(Sepal.Length ~ Sepal.Width + Species, data=y)
summary(silly.lm)

## End(Not run)

[Package bigmemory version 3.12 Index]