Reorganizing a data frame with multiple header types using a "tidy" approach in R

Time: 2018-02-10 04:42:40

Tags: r tidyr tidyverse

My data frame looks something like this:

Age  A1U_sweet  A2F_dip  A3U_bbq  C1U_sweet  C2F_dip  C3U_bbq  Comments
23   1          2        1        NA         NA       NA       Good
54   NA         NA       NA       4          1        2        ABCD
43   2          4        7        NA         NA       NA       HiHi
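
For a reproducible example, the table above can be recreated roughly as follows. The column types (numeric scores, character comments) are an assumption, since only the printed values are shown.

```r
# Recreate the sample data from the table above.
# Assumption: scores are numeric and Comments is a character column.
df <- data.frame(
  Age       = c(23, 54, 43),
  A1U_sweet = c(1, NA, 2),
  A2F_dip   = c(2, NA, 4),
  A3U_bbq   = c(1, NA, 7),
  C1U_sweet = c(NA, 4, NA),
  C2F_dip   = c(NA, 1, NA),
  C3U_bbq   = c(NA, 2, NA),
  Comments  = c("Good", "ABCD", "HiHi"),
  stringsAsFactors = FALSE
)

# Inspect the structure: 3 rows, 8 columns
str(df)
```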

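I am trying to reorganize it as shown below to make it more "tidy". Is there a way to do this while also including the Age and Comments columns, in the same style as the other variables shown below? How would you suggest incorporating them? One idea is shown below, but I am open to other suggestions. How can I modify the following code so that it handles several different styles of column names?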

library(tidyr)

df <- data.frame(id = 1:nrow(df), df)
dfl <- gather(df, key = "key", value = "value", -id)
dfl <- separate(dfl, key, into = c("key", "kind", "type"), sep = c(1, 4))
df2 <- spread(dfl, key, value)
df2
##   id kind  type     A    C
## 1  1  Age   Age    23   23
## 2  1  1U_ sweet     1   NA
## 3  1  2F_   dip     2   NA
## 4  1  3U_   bbq     1   NA
## 5  1  Com   Com  Good Good
## 6  2  Age   Age    54   54
## 7  2  1U_ sweet    NA    4
## 8  2  2F_   dip    NA    1
## 9  2  3U_   bbq    NA    2
##10  2  Com   Com  ABCD ABCD
##11  3  Age   Age    43   43
##12  3  1U_ sweet     2   NA
##13  3  2F_   dip     4   NA
##14  3  3U_   bbq     7   NA
##15  3  Com   Com  HiHi HiHi

How can I modify the following code to restore the data to its original form?

df <- gather(df2, key = "key", value = "value", A, B, C)
df <- unite(df, "key", key, kind, type, sep = "")
df <- spread(df, key, value)

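For context, this question was prompted by a comment from Ista under this question: Combining columns in R based on matching beginnings of column title names

1 Answer:

Answer 0: (score: 0)

Since Age and Comments are presumably measured at the level of each row in the original data, just carry them along for the ride: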

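library(tidyr)

# Add a row identifier so each original row can be tracked after reshaping
df <- data.frame(id = 1:nrow(df), df)

# Go long, leaving id, Age and Comments as ordinary columns
dfl <- gather(df, key = "key", value = "value", -id, -Age, -Comments)
# Split each name by position, e.g. "A1U_sweet" into "A", "1U_", "sweet"
dfl <- separate(dfl, key, into = c("key", "kind", "type"), sep = c(1, 4))
df2 <- spread(dfl, key, value)
df2

# B holds whichever of A or C is not missing
df2 <- transform(df2, B = ifelse(is.na(A), C, A))
df2

# Reverse the steps to return to the wide layout (now with extra B columns)
df <- gather(df2, key = "key", value = "value", A, B, C)
df <- unite(df, "key", key, kind, type, sep = "")
df <- spread(df, key, value)
df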
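
As an alternative sketch (not part of the original answer), the same reshape can be written with pivot_longer() and pivot_wider(), which tidyr 1.0.0 and later recommend over gather() and spread(). The sample data, the regular expression passed to names_pattern, and the names_glue template are assumptions based on the column names shown in the question. Note that kind comes out as "1U" here, without the trailing underscore that the positional split produces.

```r
library(tidyr)

# Sample data recreated from the question (column types are an assumption)
df <- data.frame(
  Age       = c(23, 54, 43),
  A1U_sweet = c(1, NA, 2),
  A2F_dip   = c(2, NA, 4),
  A3U_bbq   = c(1, NA, 7),
  C1U_sweet = c(NA, 4, NA),
  C2F_dip   = c(NA, 1, NA),
  C3U_bbq   = c(NA, 2, NA),
  Comments  = c("Good", "ABCD", "HiHi"),
  stringsAsFactors = FALSE
)
df$id <- seq_len(nrow(df))

# Go long, splitting each name with a regular expression instead of by
# position: "A1U_sweet" becomes key = "A", kind = "1U", type = "sweet".
# id, Age and Comments are excluded, so they are simply carried along.
long <- pivot_longer(
  df,
  cols          = -c(id, Age, Comments),
  names_to      = c("key", "kind", "type"),
  names_pattern = "^([A-Z])(\\d[A-Z])_(.*)$",
  values_to     = "value"
)

# One row per id/kind/type, with A and C as separate columns
df2 <- pivot_wider(long, names_from = key, values_from = value)

# Reverse the reshape: rebuild the original column names with names_glue
back <- pivot_wider(
  pivot_longer(df2, cols = c(A, C), names_to = "key", values_to = "value"),
  names_from  = c(key, kind, type),
  values_from = value,
  names_glue  = "{key}{kind}_{type}"
)
```

Matching on a pattern rather than on fixed character positions means that a name with a different prefix length would fail to match instead of being split in the wrong place, at the cost of having to keep the pattern in step with the naming scheme.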