Transform a data column into many columns in R

Date: 2018-08-16 16:08:00

Tags: r dataframe transform reduce transpose

I have data like the following:

a <- data.frame("Type" = replicate(5, "A"),
                "Day" = replicate(5, "Monday"),
                "Zone" = c(1:5),
                "Class" = c(0, 0, 1, 2, 3))

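I am trying to transform the Zone column so that each entry becomes a new column, with the corresponding value from the Class column beneath each Zone column.
So far, this is what I have: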

library(reshape2)
library(plyr)
b <- dcast(a, Type+Day+Class~Zone)

b <- plyr::rename(b, c("1" = "Zone_1",
                       "2" = "Zone_2",
                       "3" = "Zone_3",
                       "4" = "Zone_4",
                       "5" = "Zone_5"))

The result is:

  Type    Day Class Zone_1 Zone_2 Zone_3 Zone_4 Zone_5
1    A Monday     0      0      0     NA     NA     NA
2    A Monday     1     NA     NA      1     NA     NA
3    A Monday     2     NA     NA     NA      2     NA
4    A Monday     3     NA     NA     NA     NA      3

However, I am trying to get this:

  Type    Day  Zone_1 Zone_2 Zone_3 Zone_4 Zone_5
1    A Monday       0      0      1      2      3

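Any suggestions on how to reduce the table like this?
Also, if anyone has a better way to rename the columns (if that is needed), I would like to see it, since my approach looks quite repetitive.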

2 answers:

Answer 0: (score: 4)

a <- data.frame("Type" = replicate(5, "A"),
                "Day" = replicate(5, "Monday"),
                "Zone" = c(1:5),
                "Class" = c(0, 0, 1, 2, 3))

library(tidyverse)

a %>%
  mutate(Zone = paste0("Zone_", Zone)) %>%  # update Zone column
  spread(Zone, Class)                       # reshape data

#   Type    Day Zone_1 Zone_2 Zone_3 Zone_4 Zone_5
# 1    A Monday      0      0      1      2      3

As @zack suggested in the comments below, if we use the sep argument of spread like this, there is no need to update the variable beforehand:

a %>% spread(Zone, Class, sep = "_")
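
In tidyr 1.0.0 and later, spread() is marked as superseded in favour of pivot_wider(). A minimal sketch of the same reshape with that function, assuming a recent tidyr is installed (the name wide is only illustrative):

```r
library(tidyr)

a <- data.frame("Type" = replicate(5, "A"),
                "Day" = replicate(5, "Monday"),
                "Zone" = c(1:5),
                "Class" = c(0, 0, 1, 2, 3))

# names_from supplies the new column names, values_from fills the cells,
# and names_prefix puts "Zone_" in front of each new name
wide <- pivot_wider(a,
                    names_from = Zone,
                    values_from = Class,
                    names_prefix = "Zone_")
wide
```

Here names_prefix does the job of the mutate/paste0 step, so no separate renaming is needed. The result is a tibble rather than a plain data.frame.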

Answer 1: (score: 2)

Using data.table, you could try the following:

    library(data.table)

    a <- data.frame("Type" = replicate(5, "A"),
                    "Day" = replicate(5, "Monday"),
                    "Zone" = c(1:5),
                    "Class" = c(0, 0, 1, 2, 3))
    setDT(a)
    dcast(a, Type + Day ~ paste0("Zone_", Zone), value.var = "Class")

     Type    Day Zone_1 Zone_2 Zone_3 Zone_4 Zone_5
        A Monday      0      0      1      2      3
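
The reshape2 attempt from the question can also be adapted. It most likely produced four rows because Class sits on the left-hand side of the formula, so every distinct Class value gets its own row. A sketch, assuming reshape2 is installed, that passes Class as value.var instead and builds the new names with paste0 rather than listing them by hand (id_cols and is_zone are illustrative names):

```r
library(reshape2)

a <- data.frame("Type" = replicate(5, "A"),
                "Day" = replicate(5, "Monday"),
                "Zone" = c(1:5),
                "Class" = c(0, 0, 1, 2, 3))

# Keep only the identifier columns on the left of the formula;
# Class supplies the cell values instead of defining rows
b <- dcast(a, Type + Day ~ Zone, value.var = "Class")

# Prefix every column that is not an identifier with "Zone_"
id_cols <- c("Type", "Day")
is_zone <- !(names(b) %in% id_cols)
names(b)[is_zone] <- paste0("Zone_", names(b)[is_zone])
b
```

The renaming step works for any number of zones, which avoids the repetitive plyr::rename call.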