| Title: | Responses in Multiplex |
|---|---|
| Description: | Tools for manipulating, exploring, and visualising multiple-response data, including scored or ranked responses. Conversions to and from factors, lists, strings, matrices; reordering, lumping, flattening; set operations; tables; frequency and co-occurrence plots. |
| Authors: | Thomas Lumley [aut, cre], Annie Cohen [ctb] |
| Maintainer: | Thomas Lumley <[email protected]> |
| License: | GPL-3 |
| Version: | 0.7 |
| Built: | 2026-05-14 06:10:26 UTC |
| Source: | https://github.com/tslumley/rimu |
Constructs mr objects representing multiple-choice questions where more than one choice is allowed.
as.mr(x, ...)
## S3 method for class 'logical'
as.mr(x,name=deparse(substitute(x)),...)
## S3 method for class 'list'
as.mr(x, sort.levels=TRUE,...,levels=NULL)
## S3 method for class 'factor'
as.mr(x, sort.levels=FALSE,...)
## S3 method for class 'data.frame'
as.mr(x, sort.levels=FALSE,...,na.rm=TRUE)
## S3 method for class 'character'
as.mr(x, sep=", ", sort.levels=TRUE,..., levels=NULL)
## Default S3 method:
as.mr(x, sort.levels=TRUE, levels=unique(x),...)
## S3 method for class 'ms'
as.mr(x,...)
x |
Object to be converted to class |
... |
for compatibility; not used |
sort.levels |
put the levels of the |
levels |
optional character vector of the permitted levels |
name |
level name (for a vector) or vector of level names to replace the column names (for a matrix) |
na.rm |
If |
sep |
Regular expression for splitting the string |
The internal representation of mr objects is as a logical matrix
with the levels as column names.
The method for logical x coerces a single vector to a one-column
matrix, and then applies the name argument as the column
name. Given a matrix, the name argument is optional and replaces
the existing column names; the default is not used.
The method for list x takes a list of character vectors that
represent the levels present for one observation. The method for strings splits the string at the supplied separator and then uses the list method.
The method for factor x produces an mr object with the
factor levels as levels. Each observation will have only one value.
The data.frame method works for logical or numeric columns of a
data frame. Zero or negative values are treated as 'not present',
positive values as 'present'. Optionally, NA values are coded as
'not present', which is useful when the data frame was created by
reshape or dplyr::spread.
The method for ms objects simply drops the score/rank information.
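The representation described above can be sketched in base R. This is an illustration only: it does not call the package, the object names (birds_list, birds_mat) are made up for the example, and it is not the package's own implementation.

```r
# Base R sketch of a logical matrix with the levels as column names.
birds_list <- list(c("kea", "tui"), c("kea", "ruru", "kaki"), "ruru")

# Levels in order of first appearance; wrap in sort() for sorted levels.
levs <- unique(unlist(birds_list))

# One row per observation, one column per level; TRUE marks 'present'.
birds_mat <- t(vapply(birds_list,
                      function(obs) levs %in% obs,
                      logical(length(levs))))
colnames(birds_mat) <- levs
birds_mat

# First step for character input: split each string at the separator,
# which gives a list that can be handled as above.
strsplit(c("kea, tui", "ruru"), ", ")
```

A factor fits the same layout with exactly one TRUE per row.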
Object of class mr
nzbirds_list<-list(c("kea","tui"), c("kea","ruru","kaki"), c("ruru"),
   c("tui","ruru"), c("tui","kea","ruru"), c("tauhou","kea"))
nzbirds_list
as.mr(nzbirds_list)
as.mr(c("kea, tui","kea, ruru, kaki","ruru","tui, ruru"))
data(nzbirds)
nzbirds
as.mr(nzbirds)
data(ethnicity)
ethnicity
as.logical(ethnicity)
as.mr(as.logical(ethnicity))
The internal representation is as a numeric matrix with 0 when a level is not present and the non-zero rank or score when it is present. The data.frame and matrix methods use the numeric values of x, and by default set NA values to 'not present'. The list method takes a list with a character vector for each observation and uses each level's position within that vector as the rank/score. The character method splits the string at the separators to give a list and uses the list method.
The mr method uses a score of 1 whenever the level is present.
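A base R sketch of the scored representation follows. It assumes that the rank of a level is its position within the character vector for that observation; the names birds_list and score_mat are illustrative, and the code does not call the package.

```r
# Base R sketch of a numeric score matrix: 0 = not present,
# otherwise the position at which the level was listed (assumption).
birds_list <- list(c("kea", "tui"), c("kea", "ruru", "kaki"), "ruru")
levs <- unique(unlist(birds_list))

score_mat <- matrix(0, nrow = length(birds_list), ncol = length(levs),
                    dimnames = list(NULL, levs))
for (i in seq_along(birds_list)) {
  obs <- birds_list[[i]]
  score_mat[i, obs] <- seq_along(obs)   # 1 for first named, 2 for second, ...
}
score_mat

# Dropping the scores leaves a present/absent indicator.
score_mat > 0
```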
as.ms(x, ...)
## S3 method for class 'list'
as.ms(x,...,levels=NULL)
## S3 method for class 'data.frame'
as.ms(x,...,na.rm=TRUE)
## S3 method for class 'matrix'
as.ms(x,...,na.rm=TRUE)
## S3 method for class 'mr'
as.ms(x,...)
## S3 method for class 'character'
as.ms(x,sep=", ", ...,levels=NULL)
x |
object to be converted |
... |
for compatibility; not used. |
levels |
Optional character vector giving the permitted levels |
na.rm |
Convert |
sep |
Regular expression for splitting the character string |
Object of class ms
nzbirds_list<-list(c("kea","tui"), c("kea","ruru","kaki"), c("ruru"),
   c("tui","ruru"), c("tui","kea","ruru"), c("tauhou","kea"))
nzbirds_list
(msbirds<-as.ms(nzbirds_list))
(bird_mat <- unclass(msbirds))
as.ms(bird_mat)
The vmr class wraps the mr class using the vctrs package, for compatibility with tidyverse tbl_df objects (tibbles).
as.vmr(x, ...)
new_vmr(x, levels = unique(do.call(c, x)))
x |
For |
... |
not used |
levels |
the permitted levels for the object |
These objects need the vctrs and pillar packages to work, and need the tibble package to be useful.
An object of class vmr
The internals vignette for internal structure
if (requireNamespace("vctrs", quietly=TRUE)){
  data(nzbirds)
  nzbirds
  tidybirds<-as.vmr(nzbirds, na.rm=TRUE)
  tidybirds
}
Counts of observations for 12 bird species by US county and Canadian province in the Great Backyard Bird survey. These birds were randomly sampled from the much larger number in the full data set. See the vignette for more details.
data("birds")
A data frame with 3046 observations on the following 13 variables.
a numeric vector
a numeric vector
a numeric vector
a numeric vector
a numeric vector
a numeric vector
a numeric vector
a numeric vector
a numeric vector
a numeric vector
a numeric vector
a numeric vector
location: a character vector
data(birds)
birds<-as.ms(birds[,1:12],na.rm=TRUE)
mtable(as.mr(birds))
The statistical standard for collecting ethnicity data requires that respondents can mark all that are applicable. The level 1 values are "Māori", "Pacific Peoples" (i.e., Pacific Island ethnicities), "Asian", "European", and "MELAA" (Middle Eastern, Latin American, and African). This is artificial data.
data("ethnicity")
An object of class mr
data(ethnicity)
ethnicity
These functions perform miscellaneous tasks. mr_count counts the number of levels present for each individual. mr_na sets NA values to something else; ms_na sets them to 0 (i.e., not present).
mr_drop and ms_drop drop some levels from the object.
mr_count(x, na.rm = TRUE)
mr_drop(x, levels,...)
ms_drop(x, levels)
mr_na(x, na=TRUE)
ms_na(x)
x |
|
na.rm |
Remove |
levels |
character vector of levels to remove |
na |
Value ( |
... |
not used |
An integer vector for mr_count; an object of class mr or ms for the other functions.
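In terms of the logical-matrix representation, these operations correspond to simple matrix manipulations. The base R sketch below uses a hand-built matrix m (hypothetical data) and does not call the package functions.

```r
# Hypothetical 3-observation, 3-level indicator matrix with one NA.
m <- matrix(c(TRUE,  FALSE, NA,
              TRUE,  TRUE,  FALSE,
              FALSE, FALSE, FALSE),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("A", "B", "C")))

# Counting levels present per observation, ignoring NA.
n_present <- rowSums(m, na.rm = TRUE)

# Replacing NA by 'not present'.
m_filled <- m
m_filled[is.na(m_filled)] <- FALSE

# Dropping level "C".
m_dropped <- m[, setdiff(colnames(m), "C"), drop = FALSE]
```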
data(usethnicity)
race<-as.mr(strsplit(as.character(usethnicity$Q5),""))
mtable(race)
race<-mr_drop(race,c(" ","F","G","H"))
mtable(race)
## to keep just specified levels use [
mtable(race[,c("A","D")])
## How many do people identify with
table(mr_count(race))
data(nzbirds)
seenbirds<-as.mr(nzbirds>0)
countbirds<-mr_count(seenbirds)
## How many types of birds were seen
table(countbirds)
data(ethnicity)
ethnicity
mr_na(ethnicity, FALSE)
Convert a multiple-response object into a factor using a supplied ordering. Each observation is assigned its first level in the ordering. That is, an observation that has priorities[1] as one of its levels is assigned that value. An observation that does not have priorities[1] as one of its levels, but does have priorities[2], is assigned priorities[2].
mr_flatten(x, priorities, sort=FALSE)
x |
|
priorities |
Character vector of levels. |
sort |
if |
A factor
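The prioritisation rule can be sketched in base R as "first TRUE in priority order". The matrix m is hypothetical data, and returning NA when no level is present is an assumption of this sketch rather than documented behaviour.

```r
# Hypothetical indicator matrix; columns are levels.
m <- matrix(c(TRUE,  TRUE,  FALSE,
              FALSE, TRUE,  TRUE,
              FALSE, FALSE, TRUE,
              FALSE, FALSE, FALSE),
            nrow = 4, byrow = TRUE,
            dimnames = list(NULL, c("European", "Asian", "Maori")))
priorities <- c("Maori", "Asian", "European")

# Put columns in priority order, then find the first TRUE in each row.
# match() returns NA when a row has no TRUE at all.
first_hit <- apply(m[, priorities, drop = FALSE], 1,
                   function(row) match(TRUE, row))

flat <- factor(priorities[first_hit], levels = priorities)
flat
```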
data(ethnicity)
ethnicity
## NZ 'prioritised ethnicity'
priority<-c("Maori", "Pacific", "Asian", "European/Other")
eth <- mr_na(mr_recode(ethnicity, `European/Other`="European", `European/Other` = "MELAA"), FALSE)
mr_flatten(eth, priority)
mr_flatten(eth, priority, sort=TRUE)
mr_inorder and ms_inorder use the order in which the
levels first appear in the data (which is invariant to locale); mr_inseq and
ms_inseq sort alphabetically (for the current locale). mr_infreq sorts by frequency, and ms_inscore applies a function to the values in each level; one such function is mean0, which takes the mean of non-zero values. Finally, ms_reorder and mr_reorder use some function of a second variable computed on the observations where each level is present.
mr_inorder(x,...)
ms_inorder(x)
mr_inseq(x,...)
ms_inseq(x)
mr_infreq(x,na.rm=TRUE,...)
ms_infreq(x)
ms_inscore(x, fun=mean0)
mean0(y)
mr_reorder(x, v, fun=median,...)
ms_reorder(x, v, fun=median)
x |
|
na.rm |
Remove |
v, fun
|
Sort levels of |
y |
numeric vector |
... |
not used |
Object of class mr
These are based on the reordering functions for factors in the
forcats package.
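Reordering levels amounts to permuting the columns of the underlying matrix. The base R sketch below shows ordering by frequency and by a per-level summary of the non-zero scores. The helper mean_nonzero is a stand-in written for this example, the data are hypothetical, and the sort directions are assumptions rather than documented behaviour.

```r
# Hypothetical score matrix: 0 = not present, otherwise a rank.
s <- matrix(c(1, 2, 0,
              0, 2, 1,
              2, 1, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("kea", "ruru", "tui")))

# Frequency ordering: most frequently present level first (assumed direction).
freq <- colSums(s > 0)
by_freq <- s[, order(freq, decreasing = TRUE), drop = FALSE]

# Score ordering: summarise the non-zero values in each column.
mean_nonzero <- function(y) mean(y[y != 0])
score <- apply(s, 2, mean_nonzero)
by_score <- s[, order(score), drop = FALSE]

colnames(by_freq)
colnames(by_score)
```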
data(ethnicity)
mr_infreq(ethnicity)
mr_inseq(ethnicity)
data(nzbirds)
mtable(nzbirds)
mtable(ms_inorder(nzbirds))
mtable(ms_inseq(nzbirds))
mtable(ms_inscore(nzbirds, mean0))
Combine the least common or most common levels of an mr object into an "other" level.
mr_lump(x, n, prop, other_level = "Other",
        ties.method = c("min", "average", "first", "last", "random", "max"),...)
x |
Object of class |
n |
Positive integer to keep the most common |
prop |
Positive prop preserves values that appear at least prop of the time. Negative prop preserves values that appear at most -prop of the time. |
other_level |
Label for the lumped levels |
ties.method |
How to handle ties. Passed to |
... |
not used |
An object of class mr
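The idea of lumping can be sketched in base R: keep the n most common columns and combine the rest with a row-wise "any". This sketch uses hypothetical data, ignores ties.method and prop, and is not the package's implementation.

```r
# Hypothetical indicator matrix of software used by four respondents.
m <- matrix(c(TRUE, TRUE,  FALSE, FALSE,
              TRUE, FALSE, TRUE,  FALSE,
              TRUE, TRUE,  FALSE, TRUE,
              TRUE, FALSE, FALSE, FALSE),
            nrow = 4, byrow = TRUE,
            dimnames = list(NULL, c("R", "Python", "SAS", "Stata")))
n <- 2

# The n most frequent levels are kept as they are.
freq <- colSums(m)
keep <- names(sort(freq, decreasing = TRUE))[seq_len(n)]
rest <- setdiff(colnames(m), keep)

# An observation has "Other" if it has any of the lumped levels.
lumped <- cbind(m[, keep, drop = FALSE],
                Other = rowSums(m[, rest, drop = FALSE]) > 0)
lumped
```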
Based on fct_lump from the forcats package.
data(ethnicity)
mtable(ethnicity)
mtable(mr_lump(ethnicity,2))
mtable(mr_lump(ethnicity,-2))
data(rstudiosurvey)
## Other software being used
other_software<- as.mr(rstudiosurvey[[40]])
mtable(other_software)
## The top 20 responses
common<-mr_lump(other_software, n=20)
mtable(common)
## 'None' isn't really another package
mtable(mr_drop(common,"None"))
## Packages with at least 20% use
mtable(mr_lump(other_software, prop=0.2))
Relabel some or all of the levels of a multiple-response object. Two levels that are recoded to the same value will be combined.
mr_recode(x, ...)
x |
Object of class |
... |
new names in the form |
New object of class mr or ms
data(nzbirds)
nzbirds<-as.mr(nzbirds)
nzbirds
## recode to English names
mr_recode(nzbirds,morepork="ruru",stilt="kaki",waxeye="tauhou")
data(usethnicity)
race<-as.mr(usethnicity$Q5,"")
race<-mr_drop(race,c(" ","F","G","H"))
race <- mr_recode(race, AmIndian="A",Asian="B", Black="C", Pacific="D", White="E")
mtable(race)
Creates a data frame where every observation has as many rows as it has levels present, plus an id column to specify which rows go together.
mr_stack(x, ..., na.rm = FALSE)
ms_stack(x, ..., na.rm = FALSE)
x |
multiple response object |
... |
other multiple response objects |
na.rm |
drop |
A data frame with columns values and id, plus a column scores if x is an ms object. When more than one object is supplied, the result is an outer join of the individual results, so it contains a row for every combination of an observed value from each object.
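For a single object, the stacked layout can be sketched in base R from the indicator matrix. The matrix m is hypothetical; the column names values and id follow the description above, but the code is only an illustration and does not call the package.

```r
# Hypothetical indicator matrix for two observations.
m <- matrix(c(TRUE,  TRUE,  FALSE,
              FALSE, FALSE, TRUE),
            nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("kea", "tui", "ruru")))

# Row/column positions of every TRUE cell, ordered by observation.
hits <- which(m, arr.ind = TRUE)
hits <- hits[order(hits[, "row"]), , drop = FALSE]

# One output row per level present; id says which rows go together.
stacked <- data.frame(values = colnames(m)[hits[, "col"]],
                      id = hits[, "row"])
stacked
```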
data(ethnicity)
ethnicity
mr_stack(ethnicity)
data(nzbirds)
nzbirds
ms_stack(nzbirds)
## not actually a sensible use
d <- mr_stack(ethnicity, nzbirds)
head(d)
with(d, table(ethnicity, nzbirds))
## equivalent, but more efficient
mtable(mr_na(ethnicity), mr_na(nzbirds))
These functions take the union, intersection, and difference of two multiple-response objects. An observation has a level in the union if it has that level in either input. It has the level in the intersection if it has the level in both inputs. It has the level in the difference if it has the level in x and not in y.
mr_union(x, y,...)
mr_intersect(x, y,...)
mr_diff(x, y,...)
x, y
|
Objects of class |
... |
not used |
Object of class mr
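On indicator matrices these definitions reduce to elementwise logical operations once both inputs share the same columns. The base R sketch below pads missing levels with FALSE using a helper, align, that is hypothetical and written only for this example.

```r
# Two hypothetical indicator matrices with partly different levels.
x <- matrix(c(TRUE,  FALSE,
              FALSE, TRUE,
              TRUE,  TRUE),
            nrow = 3, byrow = TRUE, dimnames = list(NULL, c("A", "B")))
y <- matrix(c(TRUE,  FALSE,
              FALSE, FALSE,
              TRUE,  TRUE),
            nrow = 3, byrow = TRUE, dimnames = list(NULL, c("B", "C")))

# Hypothetical helper: expand a matrix to a common set of levels.
align <- function(m, levs) {
  out <- matrix(FALSE, nrow(m), length(levs), dimnames = list(NULL, levs))
  out[, colnames(m)] <- m
  out
}
levs <- union(colnames(x), colnames(y))
xa <- align(x, levs)
ya <- align(y, levs)

u <- xa | ya      # level present in either input
i <- xa & ya      # level present in both inputs
d <- xa & !ya     # level present in x and not in y
```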
data(usethnicity)
race<-as.mr(usethnicity$Q5,"")
race<-mr_drop(race,c(" ","F","G","H"))
race <- mr_recode(race, AmIndian="A",Asian="B", Black="C", Pacific="D", White="E")
mtable(race)
hispanic<-as.mr(usethnicity$Q4==1, "Hispanic")
ethnicity<-mr_union(race, hispanic)
mtable(ethnicity)
ethnicity[101:120]
Returns a vector of TRUE or FALSE according to whether y is one of the levels present for that row or is the only level present for that row.
x %has% y
x %hasonly% y
x %hasall% ys
x %hasany% ys
x |
|
y |
character vector specifying a level |
ys |
character vector specifying one or more levels |
Logical vector
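These tests can be sketched on an indicator matrix in base R. The function names below (has, hasonly, hasall, hasany) are hypothetical stand-ins, and the meanings given to the "all" and "any" variants are inferred from the operator names rather than taken from the text above.

```r
# Hypothetical indicator matrix for three observations.
m <- matrix(c(TRUE,  TRUE,  FALSE,
              TRUE,  FALSE, FALSE,
              FALSE, FALSE, TRUE),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("kea", "tui", "ruru")))

has     <- function(m, y)  m[, y]                       # y is present
hasonly <- function(m, y)  m[, y] & rowSums(m) == 1     # y is the only level
hasall  <- function(m, ys) rowSums(m[, ys, drop = FALSE]) == length(ys)
hasany  <- function(m, ys) rowSums(m[, ys, drop = FALSE]) > 0

has(m, "kea")
hasonly(m, "kea")
hasall(m, c("kea", "tui"))
hasany(m, c("tui", "ruru"))
```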
data(ethnicity)
ethnicity
ethnicity %has% "Maori"
ethnicity %hasonly% "Maori"
data(nzbirds)
as.mr(nzbirds)
as.mr(nzbirds)
Convert a multiple-response object into a named numeric vector using a supplied ordering.
ms_flatten(x, priorities, fun, start=0)
x |
|
priorities |
Character vector of levels. |
fun |
Function for reducing two values to one. |
start |
starting value for |
Each observation is initially assigned the value start. Starting with the lowest-priority level, the current value is combined with the new value as fun(new, current). Using fun=function(x,y) x would return the value for the highest-priority level present; using fun=pmax would return the highest score for any level present; using fun="+" would return the sum of the scores.
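The folding procedure described above can be sketched in base R. The function flatten_scores is hypothetical, and it assumes the combination step is applied only for observations where the level is present (non-zero), which is what makes "highest-priority level present" and pmin with start=Inf behave as described. It returns an unnamed vector for simplicity.

```r
# Hypothetical score matrix: 0 = not present, otherwise a score.
s <- matrix(c(1, 2, 0,
              0, 2, 1,
              0, 0, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("kea", "ruru", "tui")))

flatten_scores <- function(s, priorities, fun, start = 0) {
  fun <- match.fun(fun)
  current <- rep(start, nrow(s))
  # Work from the lowest-priority level up to the highest.
  for (lev in rev(priorities)) {
    new <- s[, lev]
    present <- new != 0                 # assumption: skip absent levels
    current[present] <- fun(new[present], current[present])
  }
  current
}

pri <- c("tui", "ruru", "kea")          # highest priority first
flatten_scores(s, pri, "+")                      # sum of scores
flatten_scores(s, pri, function(x, y) x)         # highest-priority level present
flatten_scores(s, pri, pmin, start = Inf)        # smallest score present
```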
A named numeric vector
data(ethnicity)
ethnicity
## NZ 'prioritised ethnicity'
eth <- mr_recode(ethnicity, `European/Other`="European", `European/Other` = "MELAA")
mr_flatten(ethnicity, c("Maori","Pacific","Asian","European/Other"))
data(nzbirds)
## hardest to see first
ms_flatten(nzbirds, c("kaki","ruru","kea","tui","tauhou"),"+")
ms_flatten(nzbirds, c("kaki","ruru","kea","tui","tauhou"), fun=function(x,y) x)
ms_flatten(nzbirds, c("kaki","ruru","kea","tui","tauhou"),pmin,start=Inf)
Creates one-way and two-way tables using every level of a multiple response object. Use table(as.character(x)) to tabulate combinations of levels.
mtable(x, y, na.rm = TRUE)
x |
|
y |
|
na.rm |
remove missing values? |
A 1-d or 2-d array with names giving the levels
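The counts in such tables can be sketched from an indicator matrix in base R: column sums give the one-way table, and a cross-product of the matrix with itself gives co-occurrence counts. The data are hypothetical and this is not the package's implementation.

```r
# Hypothetical indicator matrix for three observations.
m <- matrix(c(TRUE,  TRUE,  FALSE,
              TRUE,  FALSE, FALSE,
              FALSE, TRUE,  TRUE),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("kea", "tui", "ruru")))

# One-way table: number of observations with each level.
one_way <- colSums(m)

# Co-occurrence table: entry [a, b] counts observations having both a and b.
# Multiplying by 1 converts the logical matrix to numeric first.
co_occur <- crossprod(m * 1)

one_way
co_occur
```

The diagonal of the co-occurrence table reproduces the one-way counts.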
data(ethnicity)
mtable(ethnicity)
table(as.character(ethnicity))
data(nzbirds)
nzbirds<-as.mr(nzbirds)
## co-occurrence table
mtable(nzbirds, nzbirds)
## table by a factor
v<-rep(c("A","B"),3)
mtable(nzbirds,v)
data(nzbirds)
mtable(nzbirds>0)
A small artificial dataset that could be produced by asking people to name New Zealand birds. Each observation has scores from 1 (first bird named) to at most 4 (fourth bird named).
data("nzbirds")
An ms object with 6 observations on the following 5 variables.
kea: a numeric vector
ruru: a numeric vector
tui: a numeric vector
tauhou: a numeric vector
kaki: a numeric vector
data(nzbirds)
nzbirds
as.mr(nzbirds)
The plot method for mr objects is an UpSet plot, showing co-occurrences of the various categories. The image method is a heatmap of the variable plotted against itself with mtable.
## S3 method for class 'mr'
plot(x, ...)
## S3 method for class 'mr'
image(x, type = c("overlap", "conditional", "association", "raw"), ...)
## S3 method for class 'mr'
barplot(height,...)
x |
|
type |
|
height |
|
... |
Passed to |
Used for its side effect
data(rstudiosurvey)
other_software<- as.mr(rstudiosurvey[[40]])
## only those with at least 20 responses
common<-mr_lump(other_software, n=20)
common<-mr_drop(common, "None")
## UpSet plot
plot(common)
## images
image(common, type="conditional")
image(common, type="association")
The 'rstudiosurvey' data set contains 1838 rows of responses from the 2019 RStudio Community Survey, where columns are the 51 questions and a column for the timestamp. The variable names are the full questions. Multiple responses are separated by a comma and space. Non-ASCII characters have been converted with the "ASCII//TRANSLIT" option of iconv.
data("rstudiosurvey")
A data frame with 1838 observations on the following 52 variables.
Timestamp: a character vector
a character vector
a numeric vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a numeric vector
a character vector
a numeric vector
a character vector
a character vector
a character vector
a character vector
a character vector
a numeric vector
a numeric vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a numeric vector
a numeric vector
a character vector
a character vector
a character vector
a numeric vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a character vector
a numeric vector
a character vector
a character vector
https://github.com/rstudio/r-community-survey/tree/master/2019
data(rstudiosurvey)
names(rstudiosurvey)[40]
## Other software being used
other_software<- as.mr(rstudiosurvey[[40]])
mtable(other_software)
## top 20 responses
common<-mr_lump(other_software, n=20)
mtable(common)
## 'None' isn't really another package
common<-mr_drop(common, "None")
mtable(common)
## UpSet plot
plot(common)
## Excel users filled in the survey later
timestamp<-as.Date(rstudiosurvey[[1]],format="%m/%d/%y")
boxplot(timestamp~I(common %has% "Excel"))
## names in order of popularity
t<-mtable(common)
popular<-colnames(t)[order(t,decreasing=TRUE)]
## most popular package for each user
cuml_users <- mr_flatten(common, popular, sort=TRUE)
class(cuml_users)
table(cuml_users)
## two-way tables
## people who also use Stata or Julia are less happy with R than those who don't
names(rstudiosurvey)[18]
happy<-factor(rstudiosurvey[[18]])
mtable(happy, common)
round(prop.table(mtable(happy,common),2),2)
## mr objects can be dataframe columns, or expanded to individual levels
df<-data.frame(timestamp, happy, common)
dim(df)
head(df)
df_raw<-data.frame(timestamp, happy, as.matrix(common))
dim(df_raw)
head(df_raw)
This data set contains variables on race and ethnic identification from the 2017 Youth Risk Behaviour Survey, together with two variables on smoking behaviour. The YRBS is a multistage cluster-sampled survey, so valid inference about associations requires using survey design information. This subset is useful only for demonstration purposes.
data("usethnicity")
A data frame with 14765 observations on the following 4 variables.
Q4: 1 is "Hispanic or Latino"
Q5: Character string with zero or more of: A. American Indian or Alaska Native, B. Asian, C. Black or African American, D. Native Hawaiian or Other Pacific Islander, E. White
QN30: 1 is "smoked cigarettes on one or more of the past 30 days"
QN31: 1 is "smoked more than 10 cigarettes per day on the days they smoked during the past 30 days"; those who did not smoke at all are NA
https://www.cdc.gov/healthyyouth/data/yrbs/data.htm
data(usethnicity)
race<-as.mr(strsplit(as.character(usethnicity$Q5),""))
race<-mr_drop(race," ")
mtable(race)
hispanic<-as.mr(usethnicity$Q4==1,"Hispanic")
ethnicity<-mr_union(race,hispanic)
ethnicity[101:120]
Similar to mr_stack, but it works on whole data frames
rather than individual vectors, expanding a multiple-response variable
to a factor with a record for each observed response. The algorithm is
to convert the object to a logical matrix, pivot_longer, then
filter on the logical value.
vmr_stack(data, col, names_to = "level")
data |
a tibble |
col |
the name of a multiple-response column in |
names_to |
the desired output name of the variable with the stacked levels |
An expanded data frame
data(ethnicity)
if(require("tidyr",quietly=TRUE)){
  t<-tibble(a=LETTERS[1:6], e=as.vmr(ethnicity,na.rm=TRUE))
  t
  t |> vmr_stack(e,names_to="ethnicity")
}