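Naming a column after the data frame's name in a list of data frames

Date: 2018-06-06 10:11:05

Tags: r list dataframe rename

Goal: in a list of data frames, change a column name of each data frame to the name of that data frame.

I have run into some problems with names when working with lists and data frames. I prepared this example to clarify. I hope it is not a mess.

Data: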

df1 <- data.frame(A = 1, B = 2, C = 3)
df2 <- data.frame(A = 3, B = 3, C = 2)
dfList <- list(df1,df2)

Output:

> str(dfList)
List of 2
 $ :'data.frame':   1 obs. of  3 variables:
  ..$ A: num 1
  ..$ B: num 2
  ..$ C: num 3
 $ :'data.frame':   1 obs. of  3 variables:
  ..$ A: num 3
  ..$ B: num 3
  ..$ C: num 2
> names(dfList)
NULL
> names(dfList$df1)
NULL
> names(dfList$df2)
NULL

Setting the names manually:

names(dfList) <- c("df1", "df2") 
dfList <- lapply(dfList, setNames, c("A", "B", "C")) 

This yields:

> str(dfList)
List of 2
 $ df1:'data.frame':    1 obs. of  3 variables:
  ..$ A: num 1
  ..$ B: num 2
  ..$ C: num 3
 $ df2:'data.frame':    1 obs. of  3 variables:
  ..$ A: num 3
  ..$ B: num 3
  ..$ C: num 2
> names(dfList)
[1] "df1" "df2"
> names(dfList$df1)
[1] "A" "B" "C"
> names(dfList$df2)
[1] "A" "B" "C"

Desired solution:

WishedList <- dfList
WishedList[[1]] <- setNames(WishedList[[1]], c("A", "B", "df1"))
WishedList[[2]] <- setNames(WishedList[[2]], c("A", "B", "df2"))

Output of the desired solution:

> str(WishedList)
List of 2
 $ df1:'data.frame':    1 obs. of  3 variables:
  ..$ A  : num 1
  ..$ B  : num 2
  ..$ df1: num 3
 $ df2:'data.frame':    1 obs. of  3 variables:
  ..$ A  : num 3
  ..$ B  : num 3
  ..$ df2: num 2
> names(WishedList)
[1] "df1" "df2"
> names(WishedList$df1)
[1] "A"   "B"   "df1"
> names(WishedList$df2)
[1] "A"   "B"   "df2"

My attempt:

TryList1 <- lapply(dfList, function(x) setNames(x, c("A", "B", quote(x))))
str(TryList1)
List of 2
 $ df1:'data.frame':    1 obs. of  3 variables:
  ..$ A: num 1
  ..$ B: num 2
  ..$ x: num 3
 $ df2:'data.frame':    1 obs. of  3 variables:
  ..$ A: num 3
  ..$ B: num 3
  ..$ x: num 2

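Questions:

1) Why doesn't the list contain the names of the data frames and of their columns when it is created?

2) quote(x) works with a single data frame. Why not inside a list?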

> df1 <- data.frame(A = 1, B = 2, C = 3)
> df1 <- setNames(df1, c("A", "B", quote(df1)))
> names(df1)
[1] "A"   "B"   "df1"

Thanks a lot!

2 answers:

Answer 0 (score: 4)

Here is a slightly different approach:

df1 <- data.frame(A = 1, B = 2, C = 3)
df2 <- data.frame(A = 3, B = 3, C = 2)
dfList <- list(df1,df2)
names(dfList) <- c("df1", "df2") 

Map(function(df, dfn) {names(df)[3] <- dfn; df}, dfList, names(dfList))

#$df1
#  A B df1
#1 1 2   3
#
#$df2
#  A B df2
#1 3 3   2

You could also use setNames(df, c("A", "B", dfn)) as the function body, or call mapply (with SIMPLIFY = FALSE) instead of Map.
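As a minimal sketch of that variant: the result names renamed and viaMapply are only illustrative, and hard-coding "A" and "B" assumes the example's column layout. The list elements are named directly inside list() here, since list(df1, df2) stores only the values and not the variable names, which is why names(dfList) was NULL in the question.

```r
df1 <- data.frame(A = 1, B = 2, C = 3)
df2 <- data.frame(A = 3, B = 3, C = 2)

# Name the elements when the list is created; list(df1, df2) alone keeps no names
dfList <- list(df1 = df1, df2 = df2)

# Map passes each data frame together with its name to the function
renamed <- Map(function(df, dfn) setNames(df, c("A", "B", dfn)),
               dfList, names(dfList))

# Same idea with mapply; SIMPLIFY = FALSE keeps the result a plain list
viaMapply <- mapply(function(df, dfn) setNames(df, c("A", "B", dfn)),
                    dfList, names(dfList), SIMPLIFY = FALSE)

names(renamed$df1)
names(viaMapply$df2)
```

Without SIMPLIFY = FALSE, mapply would try to simplify the result into a matrix instead of returning a list of data frames.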

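A note on the OP's attempt: the documentation for quote states:

  quote simply returns its argument.

That is why, when you use quote(x) inside lapply, it just returns the unevaluated symbol x, which ends up as the column name "x". In the single data frame example, quote(df1) only appeared to work because the symbol typed there was literally df1.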
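A small sketch may make this concrete. The function name f is purely illustrative; the point is that quote() never looks at the value bound to x.

```r
# Illustrative function: returns the quoted argument name, not its value
f <- function(x) quote(x)

df1 <- data.frame(A = 1, B = 2, C = 3)
res <- f(df1)

class(res)                 # a symbol ("name"), not a data frame
as.character(res)          # always "x", whatever was passed in
as.character(quote(df1))   # "df1", only because df1 was typed literally
```

Inside lapply the anonymous function's argument is likewise called x, so every data frame gets the same column name "x".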

Answer 1 (score: 1)

We can lapply() over names(dfList) instead of over dfList:

lapply(names(dfList), function(dfn) {
  df <- dfList[[dfn]]
  names(df)[3] <- dfn
  df
})

# [[1]]
#   A B df1
# 1 1 2   3
# 
# [[2]]
#   A B df2
# 1 3 3   2
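Note that the result above is an unnamed list ([[1]], [[2]]), whereas the desired output keeps the element names df1 and df2. A possible way to restore them, sketched here with a hypothetical helper called rename_last and assuming the column to rename is always the last one:

```r
df1 <- data.frame(A = 1, B = 2, C = 3)
df2 <- data.frame(A = 3, B = 3, C = 2)
dfList <- list(df1 = df1, df2 = df2)

# Hypothetical helper: rename the last column of dfList[[dfn]] to dfn
rename_last <- function(dfn) {
  df <- dfList[[dfn]]
  names(df)[ncol(df)] <- dfn
  df
}

# Option 1: put the names back on the lapply() result
named1 <- setNames(lapply(names(dfList), rename_last), names(dfList))

# Option 2: sapply() over a character vector uses its values as names
named2 <- sapply(names(dfList), rename_last, simplify = FALSE, USE.NAMES = TRUE)

names(named1)
names(named2$df2)
```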

purrr has a convenience function that maps over a list and its names at the same time:

library(purrr)

imap(dfList, ~ {
  names(.x)[3] <- .y
  .x
})

# $df1
#   A B df1
# 1 1 2   3
# 
# $df2
#   A B df2
# 1 3 3   2

Or, if you just want a short one-liner and don't mind hard-coding "A" and "B":

imap(dfList, ~ setNames(.x, c("A", "B", .y)))

(Note: these are essentially just variations on docendo discimus's answer.)

Also, not your expected output, but possibly of interest:

dplyr::bind_rows(dfList, .id = "origin")

#   origin A B C
# 1    df1 1 2 3
# 2    df2 3 3 2

Or:

dplyr::bind_rows(map(dfList, dplyr::select, -C), .id = "C")

#     C A B
# 1 df1 1 2
# 2 df2 3 3