The usethnicity data set contains variables on race and ethnic
identification from the 2017 Youth Risk Behaviour Survey, together with
two variables on smoking behaviour. The YRBS is a multistage
cluster-sampled survey, so valid inference about associations requires
using survey design information. This subset of variables without
weights is useful only for demonstration purposes.
##   Q4 Q5 QN30 QN31
## 1  2  E    2    2
## 2  1       2    2
## 3  1  A    1    2
## 4  1       2    2
## 5  2  E    2    2
## 6  2  E    1    1
Question 4 asks "Are you Hispanic or Latino?", and Question 5 asks respondents to select any of
five lettered race categories (A to E) that apply. In the data set, the selected letters are pasted together into a single variable.
We need to split Q5 into its component letters. The as.mr method for
character strings does this.
##         A    B    C    D    E    F    G    H
##   847  863 1014 3643  455 8306    5    1    7
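To make the splitting step concrete, here is a minimal base-R sketch on a made-up vector of responses. The `q5` values are hypothetical rather than taken from usethnicity, and the sketch uses `strsplit` and `table` instead of the package's own method, so it only illustrates the idea, including how a blank response can turn into a stray category.

```r
# Hypothetical pasted-together responses; " " stands for a blank answer
q5 <- c("E", " ", "A", "AE", "BC")

# Split each string into its single characters (one vector per respondent)
letters_list <- strsplit(q5, "")

# Count how often each character occurs across all respondents
letter_counts <- table(unlist(letters_list))
print(letter_counts)
```

A respondent who chose two categories contributes to two counts, so the counts can sum to more than the number of respondents.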
There’s a spurious " " category from the string splitting, and the
values F, G, and H are also invalid, so we need to remove them.
##     A    B    C    D    E
##   863 1014 3643  455 8306
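Dropping invalid categories can be sketched in base R as a filter on each respondent's letters. The input list below is made up, and the `valid` vector is an assumption that A to E are the only legitimate codes; this is an illustration of the step, not the package's implementation.

```r
# Hypothetical split responses, including a blank and invalid letters
letters_list <- list("E", " ", c("A", "E"), c("B", "F"), "H")

# Assumed set of legitimate codes
valid <- c("A", "B", "C", "D", "E")

# Keep only valid letters within each respondent's answer;
# a respondent with no valid letters is left with an empty vector
cleaned <- lapply(letters_list, function(x) x[x %in% valid])

# Tabulate with fixed levels so unused categories show a zero count
cleaned_counts <- table(factor(unlist(cleaned), levels = valid))
print(cleaned_counts)
```

Filtering within each respondent, rather than dropping whole rows, keeps the number of respondents unchanged.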
We might want easier-to-recognise names for the categories
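As a rough base-R illustration of relabelling, a named character vector can act as a lookup table. The labels are the ones that appear in the tables further on; the mapping of A to E onto them in that order is an assumption, and the `cleaned` responses are made up.

```r
# Assumed letter-to-label mapping (labels as used in the later tables)
labels <- c(A = "AmIndian", B = "Asian", C = "Black", D = "Pacific", E = "White")

# Hypothetical cleaned responses, one character vector per respondent
cleaned <- list("E", c("A", "E"), "C", character(0))

# Look up each letter; unname() drops the letter names from the result
renamed <- lapply(cleaned, function(x) unname(labels[x]))

# Paste each respondent's categories together with "+"
combined <- vapply(renamed, paste, character(1), collapse = "+")
print(combined)
```

A respondent with no categories ends up as an empty string in this sketch.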
Now, Hispanic/Latino ethnicity is asked in a separate question. We
convert it via the as.mr method for logical vectors, and then combine
it with race.
hispanic<-as.mr(usethnicity$Q4==1, "Hispanic")
ethnicity<-mr_union(race, hispanic)
ethnicity[101:120]
## [1] "Black" "Black" "Black"
## [4] "Black" "AmIndian+Black" "Black"
## [7] "Black" "Black" "Black"
## [10] "Black" "Black" "Black"
## [13] "Black+?Hispanic" "Black" "Black"
## [16] "Black" "Black" "Black"
## [19] "AmIndian+Black+White" "Black"
The plot method shows co-occurrence of the various race/ethnicity
terms.
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_bar()`).
Tabulations against other factor or multiple-response variables are
possible with mtable. Note that mtable shows frequencies for each
category; use as.character to get frequencies for combinations. Do not
use as.factor, which is not generic and so cannot have an mr method.
##              1    2 <NA>
## AmIndian   242  466    0
## Asian      154  679    0
## Black      612 2000    0
## Pacific    120  256    0
## White     2112 4759    0
## Hispanic   889 2301    0
##
##            1    2
##   FALSE 2704 6596
##   TRUE   612 2000
##
##            1    2
##   FALSE 2878 7015
##   TRUE   438 1581
##
## 1 2
## 27 106
## AmIndian 40 65
## AmIndian+Asian 0 1
## AmIndian+Asian+Black 1 2
## AmIndian+Asian+Black+Pacific 0 0
## AmIndian+Asian+Black+Pacific+Hispanic 0 1
## AmIndian+Asian+Black+Pacific+White 3 3
## AmIndian+Asian+Black+Pacific+White+Hispanic 2 6
## AmIndian+Asian+Black+White 2 2
## AmIndian+Asian+Black+White+Hispanic 1 1
## AmIndian+Asian+Hispanic 0 1
## AmIndian+Asian+Pacific+Hispanic 1 0
## AmIndian+Asian+Pacific+White 0 1
## AmIndian+Asian+White 1 5
## AmIndian+Asian+White+Hispanic 0 1
## AmIndian+Black 11 48
## AmIndian+Black+Hispanic 3 5
## AmIndian+Black+Pacific 0 2
## AmIndian+Black+Pacific+Hispanic 0 1
## AmIndian+Black+Pacific+White 1 0
## AmIndian+Black+Pacific+White+Hispanic 1 3
## AmIndian+Black+White 13 20
## AmIndian+Black+White+Hispanic 5 7
## AmIndian+Hispanic 84 188
## AmIndian+Pacific 2 3
## AmIndian+Pacific+Hispanic 0 3
## AmIndian+Pacific+White 1 3
## AmIndian+Pacific+White+Hispanic 0 2
## AmIndian+White 50 71
## AmIndian+White+Hispanic 20 21
## Asian 80 489
## Asian+Black 2 26
## Asian+Black+Hispanic 1 0
## Asian+Black+Pacific 1 2
## Asian+Black+Pacific+Hispanic 0 1
## Asian+Black+Pacific+White 0 0
## Asian+Black+White 3 4
## Asian+Black+White+Hispanic 2 1
## Asian+Hispanic 14 25
## Asian+Pacific 8 15
## Asian+Pacific+Hispanic 3 2
## Asian+Pacific+White 5 9
## Asian+Pacific+White+Hispanic 3 2
## Asian+White 19 71
## Asian+White+Hispanic 2 8
## Black 438 1581
## Black+Hispanic 40 100
## Black+Pacific 2 14
## Black+Pacific+Hispanic 6 6
## Black+Pacific+White 1 4
## Black+Pacific+White+Hispanic 3 1
## Black+White 59 140
## Black+White+Hispanic 11 19
## Hispanic 375 1010
## Pacific 31 72
## Pacific+Hispanic 34 68
## Pacific+White 8 26
## Pacific+White+Hispanic 4 6
## White 1618 3510
## White+Hispanic 274 812
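The difference between per-category and per-combination frequencies can be sketched in base R on a small made-up membership matrix. The data, the names `member` and `smoke`, and the use of `table` in place of mtable are all assumptions for illustration; this is not the package's implementation.

```r
# Hypothetical membership matrix: one row per respondent, one column per category
member <- matrix(
  c(TRUE,  FALSE, FALSE,
    TRUE,  TRUE,  FALSE,
    FALSE, TRUE,  TRUE,
    TRUE,  FALSE, FALSE),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("Black", "White", "Hispanic"))
)

# Hypothetical two-level outcome, coded 1/2 like the smoking variables
smoke <- factor(c(1, 2, 2, 2))

# Per-category frequencies: a respondent counts once for every category held
per_category <- t(sapply(colnames(member),
                         function(k) table(smoke[member[, k]])))

# Combination frequencies: each respondent counts once, under the full combination
combo <- apply(member, 1,
               function(r) paste(colnames(member)[r], collapse = "+"))
per_combo <- table(combo, smoke)

print(per_category)
print(per_combo)
```

In the per-category table the cell counts add up to more than the number of respondents, because multiply-identified respondents appear in several rows; in the combination table every respondent appears exactly once.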