| Title: | Design-Based Inference in Vector Generalised Linear Models |
|---|---|
| Description: | Provides inference based on the survey package for the wide range of parametric models in the 'VGAM' package. |
| Authors: | Thomas Lumley [aut, cre] |
| Maintainer: | Thomas Lumley <[email protected]> |
| License: | GPL-3 |
| Version: | 1.3 |
| Built: | 2026-05-23 15:13:57 UTC |
| Source: | https://github.com/tslumley/svyvgam |
These data are from the NHANES 2003-2004 survey in the US. They provide an example of overdispersed count data that motivates a two-component zero-inflation model.
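To make the two-component idea concrete, here is a minimal base-R sketch of a zero-inflated Poisson distribution: a structural-zero component with probability `pstr0` mixed with an ordinary Poisson count with mean `lambda`. The helper `dzip()` and the parameter values are illustrative assumptions, not part of this package or of the data set; the sketch only shows why such a mixture has variance larger than its mean.

```r
# Hypothetical helper (not from svyVGAM or VGAM): zero-inflated Poisson pmf.
# With probability pstr0 the count is a structural zero; otherwise it is
# drawn from a Poisson distribution with mean lambda.
dzip <- function(y, lambda, pstr0) {
  ifelse(y == 0, pstr0, 0) + (1 - pstr0) * dpois(y, lambda)
}

# Illustrative parameter values (assumptions, not estimates from nhanes_sxq)
lambda <- 3
pstr0  <- 0.4

# Evaluate the pmf on a grid wide enough that the neglected tail is negligible
y <- 0:200
p <- dzip(y, lambda, pstr0)

zip_mean <- sum(y * p)                    # equals (1 - pstr0) * lambda
zip_var  <- sum((y - zip_mean)^2 * p)     # equals (1 - pstr0) * lambda * (1 + pstr0 * lambda)

# The variance exceeds the mean, i.e. the counts are overdispersed
# relative to a plain Poisson model with the same mean.
c(mean = zip_mean, variance = zip_var)
```

A plain Poisson model forces the variance to equal the mean, so excess zeros of this kind are one reason a zero-inflated family can fit count data better.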
data("nhanes_sxq")
A data frame with 2992 observations on the following 7 variables.
SDMVPSU: Primary Sampling Unit
SDMVSTRA: stratum
WTINT2YR: weights
malepartners: lifetime number of male sexual partners
RIDAGEYR: age in years
DMDEDUC: level of education: 1=less than high school, 2=high school, 3=more than high school, 7=refused
RIDRETH1: Race/ethnicity: 1=Mexican American, 2=Other Hispanic, 3=non-Hispanic White, 4=non-Hispanic Black, 5=Other
NHANES files demo_c.xpt and sxq_c.xpt
Construction of the data set is described by https://notstatschat.rbind.io/2015/05/26/zero-inflated-poisson-from-complex-samples/
data(nhanes_sxq)
nhdes = svydesign(id=~SDMVPSU,strat=~SDMVSTRA,weights=~WTINT2YR, nest=TRUE, data=nhanes_sxq)
svy_vglm(malepartners~RIDAGEYR+factor(RIDRETH1)+DMDEDUC, zipoisson(), design=nhdes, crit = "coef")
This function provides design-based (survey) inference for Thomas Yee's
vector generalised linear models. It works by calling vglm with
sampling weights, and then either using resampling (replicate weights)
or extracting the influence functions and using a Horvitz-Thompson-type
sandwich estimator.
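The influence-function route can be illustrated with a small base-R sketch. It does not use `vglm` or this package's internals: as a stand-in it fits a weighted linear regression with `lm`, and the simulated design (4 strata, 5 PSUs per stratum, with-replacement sampling of PSUs) is an assumption made purely for illustration. The steps are: fit with sampling weights, form each unit's weighted influence function, total these within PSUs, and sum the between-PSU variability within strata.

```r
set.seed(1)

# Simulated data under an assumed design: 4 strata, 5 PSUs each, 10 units per PSU
n <- 200
dat <- data.frame(
  stratum = rep(1:4, each = 50),
  psu     = rep(1:20, each = 10),
  w       = runif(n, 1, 5),        # sampling weights
  x       = rnorm(n)
)
dat$y <- 1 + 2 * dat$x + rnorm(n)

# Step 1: point estimates from a weighted fit (stand-in for vglm with weights)
fit <- lm(y ~ x, data = dat, weights = w)
X   <- model.matrix(fit)
r   <- dat$y - drop(X %*% coef(fit))     # raw residuals

# Step 2: weighted influence functions, one row per sampled unit
A    <- crossprod(X, dat$w * X)          # sum_i w_i x_i x_i'
infl <- (dat$w * r * X) %*% solve(A)     # w_i * r_i * x_i' A^{-1}

# Step 3: total the influence functions within each PSU
z           <- rowsum(infl, dat$psu)
psu_stratum <- dat$stratum[match(rownames(z), dat$psu)]

# Step 4: between-PSU variance within strata, summed over strata
V <- matrix(0, ncol(X), ncol(X))
for (h in unique(psu_stratum)) {
  zh <- z[psu_stratum == h, , drop = FALSE]
  nh <- nrow(zh)
  zc <- sweep(zh, 2, colMeans(zh))       # centre PSU totals within the stratum
  V  <- V + nh / (nh - 1) * crossprod(zc)
}

# Design-based standard errors for the intercept and slope
sqrt(diag(V))
```

The resampling route mentioned above would instead refit the model once per set of replicate weights and measure the spread of the resulting coefficient vectors.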
svy_vglm(formula, family, design, ...)
| formula | Model formula, as for vglm |
|---|---|
| family | Model family, as for vglm |
| design | Survey design object |
| ... | Other arguments to pass to vglm |
An S3 object of class svy_vglm with print, coef,
vcov, and predict methods, containing the design in
the design component and a fitted vglm object in the
fit component.
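As a hedged illustration of working with the returned object, the sketch below fits the Gaussian model from this page's examples and pulls out the pieces described above. It assumes the 'survey', 'VGAM', and 'svyVGAM' packages are installed; the `have_pkgs` guard and the variable names `est`, `V`, `se`, `inner_fit`, and `inner_design` are illustrative choices, not part of the package.

```r
# Run only when the required packages are available (assumption: all three
# are installed; otherwise the block is skipped)
have_pkgs <- requireNamespace("survey", quietly = TRUE) &&
  requireNamespace("VGAM", quietly = TRUE) &&
  requireNamespace("svyVGAM", quietly = TRUE)

if (have_pkgs) {
  library(survey)
  library(VGAM)
  library(svyVGAM)

  # Two-stage cluster sample from the survey package's api data
  data(api)
  dclus2 <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2, data = apiclus2)

  m2 <- svy_vglm(api00 ~ api99 + mobility + ell,
                 design = dclus2, family = uninormal())

  est <- coef(m2)          # point estimates
  V   <- vcov(m2)          # design-based covariance matrix
  se  <- sqrt(diag(V))     # standard errors derived from vcov()

  inner_fit    <- m2$fit     # the underlying fitted vglm object
  inner_design <- m2$design  # the survey design the model was fitted to
}
```

Keeping the fitted `vglm` object in the `fit` component means the usual 'VGAM' tools can still be applied to it, though any standard errors they report would be model-based rather than design-based.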
data(api)
dclus2<-svydesign(id=~dnum+snum, fpc=~fpc1+fpc2, data=apiclus2)

## Ordinary Gaussian regression
m1<-svyglm(api00~api99+mobility+ell, design=dclus2,family=gaussian)
## same model, but with the variance as a second parameter
m2<-svy_vglm(api00~api99+mobility+ell, design=dclus2,family=uninormal())
m1
m2
SE(m1)
SE(m2)
summary(m1)
summary(m2)

## Proportional odds model
dclus2<-update(dclus2, mealcat=as.ordered(cut(meals,c(0,25,50,75,100))))
a<-svyolr(mealcat~avg.ed+mobility+stype, design=dclus2)
b<-svy_vglm(mealcat~avg.ed+mobility+stype, design=dclus2, family=propodds())
a
b
SE(a)
SE(b) #not identical, because svyolr() uses approximate Hessian

## Zero-inflated Poisson
data(nhanes_sxq)
nhdes = svydesign(id=~SDMVPSU,strat=~SDMVSTRA,weights=~WTINT2YR, nest=TRUE, data=nhanes_sxq)
sv1<-svy_vglm(malepartners~RIDAGEYR+factor(RIDRETH1)+DMDEDUC, zipoisson(), design=nhdes, crit = "coef")
sv1
summary(sv1)

## Multinomial
## Reference group (non-Hispanic White) average older and more educated
## so coefficients are negative
mult_eth<- svy_vglm(RIDRETH1~RIDAGEYR+DMDEDUC, family=multinomial(refLevel=3), design=nhdes)
## separate logistic regressions are close but not identical
two_eth<-svyglm(I(RIDRETH1==1)~RIDAGEYR+DMDEDUC, family=quasibinomial, design=subset(nhdes, RIDRETH1 %in% c(1,3)))
summary(mult_eth)
summary(two_eth)