Title: | Design-Based Inference in Vector Generalised Linear Models |
---|---|
Description: | Provides inference based on the survey package for the wide range of parametric models in the 'VGAM' package. |
Authors: | Thomas Lumley |
Maintainer: | Thomas Lumley <[email protected]> |
License: | GPL-3 |
Version: | 1.2-1 |
Built: | 2024-11-02 03:15:43 UTC |
Source: | https://github.com/tslumley/svyvgam |
These data are from the NHANES 2003-2004 survey in the US. They provide an example of overdispersed count data that motivates a two-component zero-inflation model.
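To illustrate why zero inflation produces overdispersion, here is a minimal base R sketch of a two-component zero-inflated Poisson mixture. The helper `dzip` and the parameter values `lambda` and `pstr0` are hypothetical and chosen only for illustration; they are not estimates from the NHANES data.

```r
# Zero-inflated Poisson: with probability pstr0 the count is a structural zero,
# otherwise it is drawn from Poisson(lambda). dzip() is a hypothetical helper.
dzip <- function(y, lambda, pstr0) {
  pstr0 * (y == 0) + (1 - pstr0) * dpois(y, lambda)
}

lambda <- 4     # hypothetical Poisson mean
pstr0  <- 0.3   # hypothetical probability of a structural zero

# Simulate from the mixture
set.seed(1)
n <- 10000
structural_zero <- rbinom(n, 1, pstr0) == 1
y <- ifelse(structural_zero, 0L, rpois(n, lambda))

# Theoretical moments of the mixture
zip_mean <- (1 - pstr0) * lambda
zip_var  <- (1 - pstr0) * lambda * (1 + pstr0 * lambda)

c(sample_mean = mean(y), zip_mean = zip_mean)
c(sample_var  = var(y),  zip_var  = zip_var)
```

For any `pstr0` strictly between 0 and 1 the mixture variance exceeds its mean, which a plain Poisson model cannot reproduce.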
data("nhanes_sxq")
A data frame with 2992 observations on the following 7 variables.
SDMVPSU
Primary Sampling Unit
SDMVSTRA
stratum
WTINT2YR
weights
malepartners
lifetime number of male sexual partners
RIDAGEYR
age in years
DMDEDUC
level of education: 1=less than high school, 2=high school, 3=more than high school, 7=refused
RIDRETH1
Race/ethnicity: 1=Mexican American, 2=Other Hispanic, 3=non-Hispanic White, 4=non-Hispanic Black, 5=Other
NHANES files demo_c.xpt and sxq_c.xpt
Construction of the data set is described at https://notstatschat.rbind.io/2015/05/26/zero-inflated-poisson-from-complex-samples/
data(nhanes_sxq)
nhdes = svydesign(id=~SDMVPSU, strat=~SDMVSTRA, weights=~WTINT2YR,
                  nest=TRUE, data=nhanes_sxq)
svy_vglm(malepartners~RIDAGEYR+factor(RIDRETH1)+DMDEDUC,
         zipoisson(), design=nhdes, crit = "coef")
This function provides design-based (survey) inference for Thomas Yee's vector generalised linear models. It works by calling vglm with sampling weights, and then either using resampling (replicate weights) or extracting the influence functions and using a Horvitz-Thompson-type sandwich estimator.
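The influence-function approach can be sketched in generic linearisation notation. The symbols below are assumptions made for illustration and are not taken from the package source: s is the sample, w_i the sampling weight, U_i the score contribution of observation i, and h_i its influence function.

```latex
\begin{align*}
% Weighted estimating equation solved by the point estimate
\sum_{i \in s} w_i \, U_i(\hat\theta) &= 0 \\
% Influence function of observation i
h_i &= \hat{A}^{-1} U_i(\hat\theta),
\qquad
\hat{A} = -\sum_{i \in s} w_i
  \left.\frac{\partial U_i(\theta)}{\partial \theta^{\top}}\right|_{\theta = \hat\theta} \\
% Variance: design-based variance estimator of the estimated total of the h_i
\widehat{\operatorname{var}}(\hat\theta)
  &= \widehat{\operatorname{var}}_{\text{design}}\Bigl(\sum_{i \in s} w_i \, h_i\Bigr)
\end{align*}
```

Under this sketch the "bread" is the inverse weighted information and the "meat" is the design-based variance of the weighted score total, so the strata, clusters and finite population corrections in the design object enter only through the last line.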
svy_vglm(formula, family, design, ...)
formula |
Model formula, as for |
family |
Model family, as for |
design |
Survey design object |
... |
Other arguments to pass to |
An S3 object of class svy_vglm with print, coef and vcov methods, containing the design in the design component and a fitted vglm object in the fit component.
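A short sketch of how the returned object might be inspected, using only the methods and components named above. It assumes the package is installed under the name svyVGAM alongside survey and VGAM; the object name `fit` is chosen here for illustration.

```r
library(VGAM)      # provides uninormal()
library(survey)    # provides svydesign() and the api data
library(svyVGAM)   # assumed package name; provides svy_vglm()

data(api)
dclus2 <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2, data = apiclus2)
fit <- svy_vglm(api00 ~ api99 + mobility + ell, design = dclus2,
                family = uninormal())

coef(fit)               # point estimates
sqrt(diag(vcov(fit)))   # standard errors derived from the vcov method

class(fit$fit)          # the underlying fitted vglm object
class(fit$design)       # the survey design that was passed in
```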
data(api)
dclus2<-svydesign(id=~dnum+snum, fpc=~fpc1+fpc2, data=apiclus2)

## Ordinary Gaussian regression
m1<-svyglm(api00~api99+mobility+ell, design=dclus2, family=gaussian)
## same model, but with the variance as a second parameter
m2<-svy_vglm(api00~api99+mobility+ell, design=dclus2, family=uninormal())
m1
m2
SE(m1)
SE(m2)
summary(m1)
summary(m2)

## Proportional odds model
dclus2<-update(dclus2, mealcat=as.ordered(cut(meals,c(0,25,50,75,100))))
a<-svyolr(mealcat~avg.ed+mobility+stype, design=dclus2)
b<-svy_vglm(mealcat~avg.ed+mobility+stype, design=dclus2, family=propodds())
a
b
SE(a)
SE(b) #not identical, because svyolr() uses approximate Hessian

## Zero-inflated Poisson
data(nhanes_sxq)
nhdes = svydesign(id=~SDMVPSU, strat=~SDMVSTRA, weights=~WTINT2YR,
                  nest=TRUE, data=nhanes_sxq)
sv1<-svy_vglm(malepartners~RIDAGEYR+factor(RIDRETH1)+DMDEDUC,
              zipoisson(), design=nhdes, crit = "coef")
sv1
summary(sv1)

## Multinomial
## Reference group (non-Hispanic White) average older and more educated
## so coefficients are negative
mult_eth<- svy_vglm(RIDRETH1~RIDAGEYR+DMDEDUC,
                    family=multinomial(refLevel=3), design=nhdes)
## separate logistic regressions are close but not identical
two_eth<-svyglm(I(RIDRETH1==1)~RIDAGEYR+DMDEDUC, family=quasibinomial,
                design=subset(nhdes, RIDRETH1 %in% c(1,3)))
summary(mult_eth)
summary(two_eth)