Question

The sample data is as follows:

x <- read.table(header=T, text="
ID CostType1 Cost1 CostType2 Cost2
                1 a 10 c 1
                2 b 2  c 20
                3 a 1  b 50
                4 a 40 c 1
                5 c 2  b 30
                6 a 60 c 3
                7 c 10 d 1 
                8 a 20 d 2")

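I want the values in the two cost-type columns (CostType1 and CostType2) to become the names of new columns, with each corresponding cost filled in under its cost type. Where an ID has no cost of a given type, the cell should be NA. The desired format is: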

           a  b  c  d
        1 10 NA  1 NA
        2 NA  2 20 NA
        3  1 50 NA NA
        4 40 NA  1 NA
        5 NA 30  2 NA
        6 60 NA  3 NA
        7 NA NA 10  1
        8 20 NA NA  2

Answer 1

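A solution using the tidyverse. First we work out how many groups there are; in this example there are two (CostType1/Cost1 and CostType2/Cost2). We then transform each group separately, combine the results, and summarise the data frame by taking the first non-NA value in each column.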

library(tidyverse)

# Number of groups, i.e. CostType/Cost column pairs (every column except ID)
g <- (ncol(x) - 1)/2

x2 <- map_dfr(1:g, function(i){
  # Transform the data frame one group at a time
  x <- x %>%
    select(ID, ends_with(as.character(i))) %>%
    spread(paste0("CostType", i), paste0("Cost", i))
  return(x)
  }) %>% 
  group_by(ID) %>%
  # Select the first non-NA value if there are multiple values
  # (a formula lambda is used here because funs() is deprecated since dplyr 0.8.0)
  summarise_all(~ first(.[!is.na(.)]))
x2
# # A tibble: 8 x 5
#      ID     a     b     c     d
#   <int> <int> <int> <int> <int>
# 1     1    10    NA     1    NA
# 2     2    NA     2    20    NA
# 3     3     1    50    NA    NA
# 4     4    40    NA     1    NA
# 5     5    NA    30     2    NA
# 6     6    60    NA     3    NA
# 7     7    NA    NA    10     1
# 8     8    20    NA    NA     2
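
The final summarise step may be the least obvious part, so here is a minimal base R sketch of the same idea. After the per-group results are stacked, each ID appears once per group, with NA in the columns that group did not produce; collapsing by ID and keeping the first non-NA value per column merges those rows. The data frame `stacked` and the helper `first_non_na` are made up for illustration and are not part of the answer above.

```r
# Hypothetical stacked result for two IDs and two groups:
# rows 1-2 come from group 1, rows 3-4 from group 2.
stacked <- data.frame(
  ID = c(1, 2, 1, 2),
  a  = c(10, NA, NA, NA),
  b  = c(NA, 2, NA, NA),
  c  = c(NA, NA, 1, 20)
)

# Illustrative helper: first non-NA element, or NA if there is none
first_non_na <- function(v) v[!is.na(v)][1]

# Collapse to one row per ID; na.pass keeps rows that contain NA
collapsed <- aggregate(. ~ ID, data = stacked, FUN = first_non_na,
                       na.action = na.pass)
collapsed
```

Without `na.action = na.pass`, the formula interface of `aggregate` would drop every row containing an NA before summarising, which here would discard all of the data.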

Answer 2

A base R solution using reshape:

x1 <- setNames(x[,c("ID", "CostType1", "Cost1")], c("ID", "CostType", "Cost"))
x2 <- setNames(x[,c("ID", "CostType2", "Cost2")], c("ID", "CostType", "Cost"))

reshape(data=rbind(x1, x2), idvar="ID", timevar="CostType", v.names="Cost", direction="wide")
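
Note that reshape prefixes the new columns with the v.names value, so the result has columns named Cost.a, Cost.b, and so on rather than a, b, c, d. The following is a rough sketch of one way to tidy that up so the output matches the format asked for in the question; the variable names `long` and `wide` are illustrative only.

```r
x <- read.table(header = TRUE, text = "
ID CostType1 Cost1 CostType2 Cost2
1 a 10 c 1
2 b 2  c 20
3 a 1  b 50
4 a 40 c 1
5 c 2  b 30
6 a 60 c 3
7 c 10 d 1
8 a 20 d 2")

# Stack the two (CostType, Cost) pairs into one long data frame
long <- rbind(
  setNames(x[, c("ID", "CostType1", "Cost1")], c("ID", "CostType", "Cost")),
  setNames(x[, c("ID", "CostType2", "Cost2")], c("ID", "CostType", "Cost"))
)

# Long to wide: one column per cost type
wide <- reshape(long, idvar = "ID", timevar = "CostType",
                v.names = "Cost", direction = "wide")

# Strip the "Cost." prefix, then sort the rows by ID and the type columns by name
names(wide) <- sub("^Cost\\.", "", names(wide))
wide <- wide[order(wide$ID), c("ID", sort(setdiff(names(wide), "ID")))]
rownames(wide) <- NULL
wide
```

This assumes each ID has at most one cost per cost type, as in the sample data; if a type could repeat for the same ID, reshape would keep only one of the values.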
