Question

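I have imported a CSV file into R using readr::read_csv. The CSV file contains variable names, many of which contain spaces. Some of the variable names are also numbers, e.g. 17, 18, and so on. I want to rename these variables to something more meaningful.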

Snapshot of my data

For example, I tried the following code:

rename(burkina, enum = Enumerator) 
rename(burkina, enum = `Enumerator`) 
rename(burkina, enum = "Enumerator") 
rename(burkina,test = `17`)

None of them seem to work. Instead, I get the following error:

Error in make.names(x) : invalid multibyte string 1

Answer 1

For cases like this, the clean_names() function from the janitor package comes in handy. For example:

> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

> head(iris %>% janitor::clean_names())
  sepal_length sepal_width petal_length petal_width species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

> head(iris %>% janitor::clean_names(case = "all_caps"))
  SEPAL_LENGTH SEPAL_WIDTH PETAL_LENGTH PETAL_WIDTH SPECIES
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

You can choose from a range of target cases (snake case, all caps, and others); see ?janitor::clean_names.
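To connect this to the data in the question, here is a minimal sketch on a made-up data frame. The object `burkina_demo` and its values are hypothetical stand-ins for the imported file, and janitor is assumed to be installed. Names with spaces should come out in snake case, and a name that starts with a digit should get an `x` prefix so that it becomes a valid R name.

```r
library(janitor)  # assumed to be installed

# Hypothetical stand-in for the imported data; check.names = FALSE keeps
# the raw headers (spaces, leading digits) exactly as a CSV would have them.
burkina_demo <- data.frame(
  "Enumerator" = c("A", "B"),
  "Draft Date" = c("2019-01-01", "2019-01-02"),
  "17"         = c(1, 2),
  check.names  = FALSE
)

# Convert every column name to a syntactic snake_case name.
cleaned <- clean_names(burkina_demo)
names(cleaned)
```

Once the names are syntactic, they can be used in rename() without backticks or quotes.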

Answer 2

The simplest way is to replace the column names with a character vector, like this:

names(burkina) <- c("de", "draft_date", "submit_date", ...)
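If only a few columns need new names, the same assignment can target them individually instead of listing every name. The following is a base R sketch on a hypothetical data frame (`burkina_demo` and the lookup vector `old_to_new` are illustrative names, not part of the original data).

```r
# Hypothetical stand-in for the imported data, with awkward headers preserved.
burkina_demo <- data.frame(
  "Enumerator" = c("A", "B"),
  "Draft Date" = c("2019-01-01", "2019-01-02"),
  "17"         = c(1, 2),
  check.names  = FALSE
)

# Lookup vector: element names are the old column names, values are the new ones.
old_to_new <- c("Enumerator" = "enum", "17" = "test")

# Find the position of each old name and fail early if one is missing.
idx <- match(names(old_to_new), names(burkina_demo))
stopifnot(!anyNA(idx))

# Overwrite only the matched positions; other columns keep their names.
names(burkina_demo)[idx] <- unname(old_to_new)
names(burkina_demo)
```

Because the old names are handled as plain strings, spaces and leading digits need no special quoting here.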

Alternatively, you can use a function to convert the names into friendlier ones. I use this function:

library(readr)   # read_csv()
library(stringr) # str_*() helpers; also re-exports %>%

# function to simplify vector of names
ensnakeify <- function(x) {
  x %>%
    iconv(to="ASCII//TRANSLIT") %>% # remove accents
    str_replace_na() %>% # convert NA to string
    str_to_lower() %>% # convert to lower case
    str_replace_all(pattern="%", replacement="pc") %>% # convert % to pc
    str_replace_all(pattern="[^[:alnum:]]", replacement=" ") %>% # convert remaining non-alphanumeric to space
    str_trim() %>% # trim leading and trailing spaces
    str_replace_all(pattern="\\s+", replacement="_") # convert remaining spaces to underscore
}

# function to simplify df column names
autosnake <- function(df){ # to use in pipe
  names(df) <- ensnakeify(names(df))
  df
}

burkina <- read_csv("Filename") %>% autosnake
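To see what this kind of conversion does to typical headers, here is a base R approximation with no package dependencies. It is a sketch, not the code above: `to_snake` is a hypothetical name, and the accent-stripping and NA-handling steps are left out.

```r
# Base R approximation of the snake-case helper (illustrative only).
to_snake <- function(x) {
  x <- tolower(x)                        # convert to lower case
  x <- gsub("%", "pc", x, fixed = TRUE)  # convert % to pc
  x <- gsub("[^[:alnum:]]+", " ", x)     # other non-alphanumerics to space
  x <- trimws(x)                         # trim leading and trailing spaces
  gsub("\\s+", "_", x)                   # remaining spaces to underscore
}

result <- to_snake(c("Enumerator", "Draft Date", " % Complete ", "17"))
result
```

Note that a purely numeric name such as "17" passes through unchanged, so it still needs backticks (or a further rename) afterwards.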

How do I rename variables imported from a CSV file?
