selection_language {ipumsr} | R Documentation |
tidyselect selection language in ipumsr
Description
A slightly modified implementation of the tidyselect selection language in ipumsr.
Syntax
In general, the selection language in ipumsr operates the same as in tidyselect.
Where applicable, variables can be selected with:
- A character vector of variable names (c("var1", "var2"))
- A bare vector of variable names (c(var1, var2))
- A selection helper from tidyselect (starts_with("var")). See below for a list of helpers.
Primary differences
tidyselect selection is generally intended for use with column variables in data.frame-like objects. In contrast, ipumsr allows selection language syntax in other cases as well (for instance, when selecting files from within a .zip archive). ipumsr functions will indicate whether they support the selection language.
Selection with where() is not consistently supported.
Selection helpers (from tidyselect)
- var1:var10: variables lying between var1 on the left and var10 on the right.
- starts_with("a"): names that start with "a"
- ends_with("z"): names that end with "z"
- contains("b"): names that contain "b"
- matches("x.y"): names that match regular expression x.y
- num_range(x, 1:4): names following the pattern x1, x2, ..., x4
- all_of(vars)/any_of(vars): matches names stored in the character vector vars. all_of(vars) will error if the variables aren't present; any_of(vars) will match just the variables that exist.
- everything(): all variables
- last_col(): furthest column to the right
Operators for combining those selections:
- !selection: only variables that don't match selection
- selection1 & selection2: only variables included in both selection1 and selection2
- selection1 | selection2: all variables that match either selection1 or selection2
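As a rough illustration of how these helpers and operators combine, the sketch below evaluates a few selections against a small made-up data frame by calling tidyselect::eval_select() directly. The data frame, its column names, and the selected() wrapper are hypothetical and not part of ipumsr. ipumsr functions may apply additional handling of their own, so treat this only as a way to preview what a selection is likely to match.

```r
library(tidyselect)

# Hypothetical data frame; the column names are invented for illustration
df <- data.frame(
  YEAR = 2020, SERIAL = 1, STATEFIP = 27, INCTOT = 100, INCWAGE = 50
)

# Hypothetical wrapper: return the names of the columns that a
# selection expression matches in df
selected <- function(expr) {
  names(eval_select(expr, df))
}

# A helper on its own (matching ignores case by default)
selected(quote(starts_with("S")))
#> "SERIAL" "STATEFIP"

# Intersection: names that start with "S" and end with "P"
selected(quote(starts_with("S") & ends_with("P")))
#> "STATEFIP"

# Bare names and helpers combined with c()
selected(quote(c(YEAR, contains("INC"))))
#> "YEAR" "INCTOT" "INCWAGE"

# Negation: everything that does not contain "INC"
selected(quote(!contains("INC")))
#> "YEAR" "SERIAL" "STATEFIP"

# any_of() skips names that are not present instead of raising an error
selected(quote(any_of(c("YEAR", "MISSING"))))
#> "YEAR"
```

Replacing any_of() with all_of() in the last call would raise an error instead, because "MISSING" is not a column of df.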
Examples
cps_file <- ipums_example("cps_00157.xml")

# Load 3 variables by name
read_ipums_micro(
  cps_file,
  vars = c("YEAR", "MONTH", "PERNUM"),
  verbose = FALSE
)

# "Bare" variables are supported
read_ipums_micro(
  cps_file,
  vars = c(YEAR, MONTH, PERNUM),
  verbose = FALSE
)

# Standard tidyselect selectors are also supported
read_ipums_micro(cps_file, vars = starts_with("ASEC"), verbose = FALSE)

# Selection methods can be combined
read_ipums_micro(
  cps_file,
  vars = c(YEAR, MONTH, contains("INC")),
  verbose = FALSE
)

read_ipums_micro(
  cps_file,
  vars = starts_with("S") & ends_with("P"),
  verbose = FALSE
)

# Other selection arguments also support this syntax.
# For instance, load a particular file based on a tidyselect match:
read_nhgis(
  ipums_example("nhgis0731_csv.zip"),
  file_select = contains("nominal_state"),
  verbose = FALSE
)