In R

Time: 2017-06-27 18:44:49

Tags: r dplyr

I have a data frame like this:

ID <- c("ID001","ID001","ID003","ID003","ID003",
        "ID006","ID007","ID007","ID009","ID010")
Type <- c("Length","Breadth","Length","Breadth","Height",
          "Length","Length","Height","Breadth","Length")
FailCount <- c(3,7,2,3,9,7,3,2,3,9)

df <- data.frame(ID,Type,FailCount)

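I am trying to subset this data frame according to these conditions:

  1. Remove any ID that has only 1 Type
  2. Sum the FailCount for each remaining ID
  3. Collapse the Type column into a single comma-separated row per ID
  4. Desired output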

         ID                    Type FailCount
      ID001         Length, Breadth        10
      ID003 Length, Breadth, Height        14
      ID007          Length, Height         5
    

I can remove the rows whose ID has only one Type this way:

    library(dplyr)
    df <- df %>% group_by(ID) %>% filter(n_distinct(Type) > 1)
    

How do I accomplish the other tasks? Can someone point me in the right direction?

2 Answers:

Answer 0: (score: 2)

Try this:

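library(dplyr)

# Keep IDs with more than one distinct Type, then collapse each group to one row.
# Note: this overwrites df with the summarised result.
df <- df %>%
  group_by(ID) %>%
  filter(n_distinct(Type) > 1) %>%
  dplyr::summarise(Type = paste(Type, collapse = ","), FailCount = sum(FailCount))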

# A tibble: 3 × 3
      ID                  Type FailCount
  <fctr>                 <chr>     <dbl>
1  ID001        Length,Breadth        10
2  ID003 Length,Breadth,Height        14
3  ID007         Length,Height         5
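
The output here joins the types with a bare comma, while the desired output in the question shows a comma followed by a space. If that spacing matters, one possible tweak is to pass `collapse = ", "` instead. The sketch below rebuilds the question's data frame so it runs on its own; the name `result` is only an illustrative choice, not something from the original post.

```r
library(dplyr)

# Rebuild the data frame from the question so this snippet is self-contained
df <- data.frame(
  ID = c("ID001", "ID001", "ID003", "ID003", "ID003",
         "ID006", "ID007", "ID007", "ID009", "ID010"),
  Type = c("Length", "Breadth", "Length", "Breadth", "Height",
           "Length", "Length", "Height", "Breadth", "Length"),
  FailCount = c(3, 7, 2, 3, 9, 7, 3, 2, 3, 9)
)

# `result` is a hypothetical name; using it leaves the original df untouched
result <- df %>%
  group_by(ID) %>%
  filter(n_distinct(Type) > 1) %>%   # drop IDs that have a single Type
  summarise(
    Type = paste(Type, collapse = ", "),  # comma plus space, as in the desired output
    FailCount = sum(FailCount)
  )

print(result)
```

Assigning to a new name rather than back to `df` also makes it easier to rerun the pipeline while experimenting.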

Answer 1: (score: 2)

You can use summarise to get what you need:

df %>% group_by(ID) %>%
    dplyr::filter(n_distinct(Type) > 1) %>%
    summarise(Type=toString(Type), FailCount = sum(FailCount))
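
For anyone wondering why `toString` works here: on a plain character vector it should behave like `paste` with `collapse = ", "`, so it produces the comma-plus-space format shown in the question. A quick check, using a made-up vector `types` purely for illustration:

```r
# `types` is an illustrative vector, not part of the original data
types <- c("Length", "Breadth", "Height")

a <- toString(types)                 # base R helper
b <- paste(types, collapse = ", ")   # explicit equivalent

print(a)
print(identical(a, b))
```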

I hope this helps.