name-repair {tibble}    R Documentation

Repair the names of a vector

Description

Maturing lifecycle

tibble deals with three nested levels of name repair: minimal, unique, and universal. universal implies unique, and unique implies minimal.

The .name_repair argument of tibble() and as_tibble() refers to these levels. Alternatively, the user can pass their own name repair function. It should anticipate minimal names as input and should, likewise, return names that are at least minimal.
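As a hedged illustration of a custom repair function, the sketch below fills empty names with a positional label. The name fill_blank_names and the col_ prefix are hypothetical, chosen only for this example. The function assumes its input is already minimal, i.e. a character vector with no NA.

```r
# Hypothetical helper, not part of tibble: replaces each "" with "col_<position>".
# Assumes minimal names as input, so there are no NA values to handle.
fill_blank_names <- function(nms) {
  blank <- nms == ""
  nms[blank] <- paste0("col_", which(blank))
  nms
}

fill_blank_names(c("x", "", "y", ""))
```

A function like this could be supplied as .name_repair = fill_blank_names in tibble() or as_tibble(). It neither introduces NA nor drops the names, so its output is still at least minimal.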

The existing functions tidy_names(), set_tidy_names(), and repair_names() are soft-deprecated.

minimal names

minimal names exist. The names attribute is not NULL. The name of an unnamed element is "" and never NA.

Examples:

Original names of a vector with length 3: NULL
                           minimal names: "" "" ""

                          Original names: "x" NA
                           minimal names: "x" ""

Request .name_repair = "minimal" to suppress almost all name munging. This is useful when the first row of a data source – allegedly variable names – actually contains data and the resulting tibble is destined for reshaping with, e.g., tidyr::gather().
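The definition of minimal names can be sketched in base R as follows. This is an illustration of the rule stated above, not tibble's implementation, and the helper name minimal_names_sketch is hypothetical.

```r
# Sketch only; minimal_names_sketch is a hypothetical name.
minimal_names_sketch <- function(x) {
  nms <- names(x)
  if (is.null(nms)) {
    # No names attribute at all: every element gets "".
    return(rep("", length(x)))
  }
  # A missing name becomes "" rather than NA.
  nms[is.na(nms)] <- ""
  nms
}

# Vector of length 3 without names
minimal_names_sketch(1:3)

# Vector whose second name is NA
x <- 1:2
names(x) <- c("x", NA)
minimal_names_sketch(x)
```

The two calls mirror the two examples in this section.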

unique names

unique names are minimal, have no duplicates, and can be used (possibly with backticks) in contexts where a variable is expected. Empty names, and ... or .. followed by a sequence of digits, are banned. If a data frame has unique names, you can index it by name, and also access the columns by name. In particular, df[["name"]] and df$`name` and also with(df, `name`) always work.

There are many ways to make names unique. We append a suffix of the form ...j to any name that is "" or a duplicate, where j is the position. We also change ..# and ... to ...#.

Example:

Original names:     ""     "x"     "" "y"     "x"  "..2"  "..."
  unique names: "...1" "x...2" "...3" "y" "x...5" "...6" "...7"

Pre-existing suffixes of the form ...j are always stripped, prior to making names unique, i.e. reconstructing the suffixes. If this interacts poorly with your names, you should take control of name repair.
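The suffixing rules described in this section can be approximated in base R. The sketch below is not tibble's own code, and unique_names_sketch is a hypothetical name. It strips existing ...j suffixes, treats ... and ..# as empty, and then appends ...j to every name that is empty or duplicated.

```r
# Sketch only; unique_names_sketch is a hypothetical name.
unique_names_sketch <- function(nms) {
  # Start from minimal names: NA becomes "".
  nms[is.na(nms)] <- ""
  # Strip pre-existing positional suffixes of the form ...j.
  nms <- sub("([.][.][.][0-9]+)+$", "", nms)
  # "..." and ".." followed by digits are banned, so treat them as empty.
  nms[nms == "..." | grepl("^[.][.][0-9]+$", nms)] <- ""
  # Empty names and every member of a duplicated group get a suffix.
  needs_suffix <- nms == "" | duplicated(nms) | duplicated(nms, fromLast = TRUE)
  idx <- which(needs_suffix)
  nms[idx] <- paste0(nms[idx], "...", idx)
  nms
}

unique_names_sketch(c("", "x", "", "y", "x", "..2", "..."))
# "...1" "x...2" "...3" "y" "x...5" "...6" "...7"
```

The call reproduces the example shown earlier in this section. Because suffixes are stripped first, an input such as c("a...1", "a") is treated as two copies of "a", and both receive a new positional suffix.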

universal names

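universal names are unique and syntactic, meaning they are never empty and have no duplicates (both inherited from uniqueness), consist only of letters, numbers, and the dot . or underscore _ characters, start with a letter or with a dot that is not followed by a number, and are not a reserved word such as if, function, or TRUE.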

If a data frame has universal names, variable names can be used "as is" in code. They work well with nonstandard evaluation, e.g., df$name works.

Tibble has a different method of making names syntactic than base::make.names(). In general, tibble prepends one or more dots . until the name is syntactic.

Examples:

 Original names:     ""     "x"    NA      "x"
universal names: "...1" "x...2" "...3" "x...4"

  Original names: "(y)"  "_z"  ".2fa"  "FALSE"
 universal names: ".y." "._z" "..2fa" ".FALSE"
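The dot-prepending idea can be sketched in base R. This is an approximation under stated assumptions, not tibble's implementation. The helper name make_syntactic_sketch is hypothetical, it covers only the syntactic step (not uniqueness), and it assumes a name is syntactic exactly when base::make.names() leaves it unchanged.

```r
# Sketch only; make_syntactic_sketch is a hypothetical name.
make_syntactic_sketch <- function(nms) {
  nms[is.na(nms)] <- ""
  # Replace characters other than letters, digits, dot, and underscore.
  nms <- gsub("[^[:alnum:]._]", ".", nms)
  # Assumption: a name is syntactic if make.names() returns it unchanged.
  is_syntactic <- function(x) x == make.names(x)
  bad <- !is_syntactic(nms)
  # Prepend dots until every name is syntactic.
  while (any(bad)) {
    nms[bad] <- paste0(".", nms[bad])
    bad <- !is_syntactic(nms)
  }
  nms
}

original <- c("(y)", "_z", ".2fa", "FALSE")

make_syntactic_sketch(original)
# ".y." "._z" "..2fa" ".FALSE"

# For comparison, base R prepends "X" or appends "."
make.names(original)
# "X.y." "X_z" "X.2fa" "FALSE."
```

The first result matches the universal names in the example above. The second shows how base::make.names() handles the same input.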

See Also

rlang::names2() returns the names of an object, after making them minimal.

The Names attribute section in the "tidyverse package development principles".

Examples

library(tibble)

## Not run: 
## by default, duplicate names are not allowed
tibble(x = 1, x = 2)

## End(Not run)
## you can authorize duplicate names
tibble(x = 1, x = 2, .name_repair = "minimal")
## or request that the names be made unique
tibble(x = 1, x = 2, .name_repair = "unique")

## by default, non-syntactic names are allowed
df <- tibble(`a 1` = 1, `a 2` = 2)
## because you can still index by name
df[["a 1"]]
df$`a 1`

## syntactic names are easier to work with, though, and you can request them
df <- tibble(`a 1` = 1, `a 2` = 2, .name_repair = "universal")
df$a.1

## you can specify your own name repair function
tibble(x = 1, x = 2, .name_repair = make.unique)

fix_names <- function(x) gsub("%", " percent", x)
tibble(`25%` = 1, `75%` = 2, .name_repair = fix_names)

fix_names <- function(x) gsub("\\s+", "_", x)
tibble(`year 1` = 1, `year 2` = 2, .name_repair = fix_names)

## purrr-style anonymous functions and constants
## are also supported
tibble(x = 1, x = 2, .name_repair = ~ make.names(., unique = TRUE))

tibble(x = 1, x = 2, .name_repair = ~ c("a", "b"))

## the names attribute will be non-NULL, with "" as the default element
df <- as_tibble(list(1:3, letters[1:3]), .name_repair = "minimal")
names(df)

[Package tibble version 2.1.3 Index]