write.big.matrix, read.big.matrix {bigmemory} | R Documentation |
Create a big.matrix by reading from a suitably-formatted ASCII file, or write the contents of a big.matrix to a file.
write.big.matrix(x, fileName = NA, row.names = FALSE, col.names = FALSE,
                 sep = ',')

read.big.matrix(fileName, sep = ',', header = FALSE, col.names = NULL,
                row.names = NULL, has.row.names = FALSE,
                ignore.row.names = FALSE, type = NA, skip = 0,
                separated = FALSE, shared = FALSE, backingfile = NULL,
                backingpath = NULL, descriptorfile = NULL, extraCols = NULL)
x | a big.matrix. |
fileName | the name of an input/output file. |
sep | a field delimiter. |
header | if TRUE, the first line (after a possible skip) should contain column names. |
col.names | a vector of names, use them even if column names exist in the file. |
row.names | a vector of names, use them even if row names appear to exist in the file. |
has.row.names | if TRUE, then the first column contains row names. |
ignore.row.names | if TRUE when has.row.names==TRUE, the row names will be ignored. |
type | preferably specified, "integer" for example. |
skip | number of lines to skip at the head of the file. |
separated | use separated column organization of the data instead of column-major organization. |
shared | if TRUE, load the object into shared memory. |
backingfile | the root name for the file(s) for the cache of x. |
backingpath | the path to the directory containing the file backing cache. |
descriptorfile | the file to be used for the description of the filebacked matrix. |
extraCols | the optional number of extra columns to be appended to the matrix for future use. |
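The file-backing arguments are easiest to see together. The following is a minimal sketch, not taken from the package examples: the file names demo.csv, demo.bin and demo.desc are hypothetical, the files are placed in a temporary directory, and it assumes a recent bigmemory version in which attach.big.matrix accepts the full path to a descriptor file.

```r
library(bigmemory)

# Hypothetical input: a small comma-separated file with no header or row names.
tmp <- tempdir()
csv <- file.path(tmp, "demo.csv")
write.table(matrix(1:20, 5, 4), csv, sep = ",",
            row.names = FALSE, col.names = FALSE)

# Read into a file-backed big.matrix. The backing file and the descriptor
# file are both created inside backingpath.
x <- read.big.matrix(csv, sep = ",", type = "integer",
                     backingfile = "demo.bin",
                     backingpath = tmp,
                     descriptorfile = "demo.desc")

# Later (or from another R session), re-attach the same data through the
# descriptor file instead of parsing the text file again.
y <- attach.big.matrix(file.path(tmp, "demo.desc"))
y[2, 3]
```

Specifying type explicitly here avoids relying on the type guess made when it is left as NA.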
Files must contain only one atomic type (all integer, for example). You, the user, should know whether your file has row and/or column names, and various combinations of options should be helpful in obtaining the desired behavior.
When reading from a file, if type is not specified, we try to make a reasonable guess for you, but without any guarantees at this point.
Unless you have really large integer values, we strongly recommend you consider "short". If you have something that is essentially categorical, you might even be able to use "char", with huge memory savings in large data sets.
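To make the trade-off concrete, here is a small base R sketch for picking the narrowest type that can hold a set of values and estimating the resulting storage. The helper smallest_type is hypothetical and not part of bigmemory. It assumes "char", "short", "integer" and "double" occupy 1, 2, 4 and 8 bytes per cell, and it uses conservative limits of 127 and 32767 on the assumption that the most negative value of each integer type is reserved for NA.

```r
# Assumed storage cost per cell, in bytes.
bytes_per_cell <- c(char = 1, short = 2, integer = 4, double = 8)

# Hypothetical helper: suggest the narrowest type for a numeric vector.
# NA and NaN are ignored when checking the range.
smallest_type <- function(v) {
  stopifnot(is.numeric(v))
  v <- v[!is.na(v)]
  # Infinite or fractional values can only be stored as double.
  if (any(is.infinite(v)) || any(v != round(v))) return("double")
  lim <- if (length(v)) max(abs(v)) else 0
  if (lim <= 127) {
    "char"
  } else if (lim <= 32767) {
    "short"
  } else if (lim <= .Machine$integer.max) {
    "integer"
  } else {
    "double"
  }
}

# Estimated size in bytes of a matrix with 1e6 rows and 10 columns.
cells <- 1e6 * 10
footprint <- bytes_per_cell * cells

smallest_type(c(1, 2, 3, NA))   # small category codes
smallest_type(c(0, 300))        # too wide for one byte
footprint[c("char", "short", "double")]
```

Under these assumptions, the same matrix takes one eighth of the space as "char" that it would as "double".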
A big.matrix object is returned by read.big.matrix, while write.big.matrix creates an output file in the present working directory.
John W. Emerson and Michael J. Kane
# Without specifying the type, this big.matrix x will hold integers.
x <- as.big.matrix(matrix(1:10, 5, 2))
x[2,2] <- NA
x[,]
write.big.matrix(x, "foo.txt")

# Just for fun, I'll read it back in as character (1-byte integers):
y <- read.big.matrix("foo.txt", type="char")
y[,]

# Other examples:
w <- as.big.matrix(matrix(1:10, 5, 2), type='double')
w[1,2] <- NA
w[2,2] <- -Inf
w[3,2] <- Inf
w[4,2] <- NaN
w[,]
write.big.matrix(w, "bar.txt")
w <- read.big.matrix("bar.txt", type="double")
w[,]
w <- read.big.matrix("bar.txt", type="short")
w[,]

# Another example using row names (which we don't like).
x <- as.big.matrix(as.matrix(iris), type='double')
rownames(x) <- as.character(1:nrow(x))
head(x)
write.big.matrix(x, 'IrisData.txt', col.names=TRUE, row.names=TRUE)
y <- read.big.matrix("IrisData.txt", header=TRUE, has.row.names=TRUE)
head(y)

# The following would fail with a dimension mismatch:
if (FALSE) y <- read.big.matrix("IrisData.txt", header=TRUE)