ziP {mgcv} | R Documentation |
Family for use with gam
, implementing regression for zero inflated Poisson data
when the complinetart log log of the zero probability is linearly dependent on the log of the Poisson parameter. Use with care,
noting that simply having many zero response observations is not an indication of zero inflation: the question is whether you have too many zeroes given the specified model.
ziP(theta = NULL, link = "identity")
theta |
the 2 parameters controlling the slope and intercept of the linear transform of the mean controlling the zero inflation rate. If supplied (and second element is positive) then treated as fixed parameters (theta_1 and exp(theta_2)), otherwise estimated. If supplied and second element is negative then treated as starting values (with sign of second element reversed). |
link |
The link function: only the |
The probability of a zero count is given by 1- p, whereas the probability of
count y>0 is given by the truncated Poisson probability function (pmu^y/((exp(mu)-1)y!). The linear predictor
gives log(mu), while eta=log(-log(1-p)) and eta = theta_1 + exp(theta_2) log(mu). The theta
parameters are estimated alongside the smoothing parameters.
Note that the fitted values for this model are the log of the Poisson parameter. Use the predict
function with type=="response"
to get the predicted expected response.
An object of class extended.family
.
Zero inflated models are often over-used. Having lots of zeroes in the data does not in itself imply zero inflation. Having too many zeroes *given the model mean* may imply zero inflation.
Simon N. Wood simon.wood@r-project.org
rzip <- function(gamma,theta= c(-2,.3)) { ## generate zero inflated Poisson random variables, where ## lambda = exp(gamma), eta = theta[1] + exp(theta[2])*gamma ## and 1-p = exp(-exp(eta)). y <- gamma; n <- length(y) lambda <- exp(gamma) eta <- theta[1] + exp(theta[2])*gamma p <- 1- exp(-exp(eta)) ind <- p > runif(n) y[!ind] <- 0 np <- sum(ind) ## generate from zero truncated Poisson, given presence... y[ind] <- qpois(runif(np,dpois(0,lambda[ind]),1),lambda[ind]) y } library(mgcv) ## Simulate some beta data... set.seed(1);n<-400 dat <- gamSim(1,n=n) dat$y <- rzip(dat$f/4-1) b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=ziP(),data=dat) b$outer.info ## check convergence!! b plot(b,pages=1) ## plot deviance residuals against expected response plot(predict(b,type="response"),residuals(b))