Using dplyr and group_by in R does not give the expected result

Time: 2017-12-06 19:07:25

Tags: r group-by dplyr

I am trying to summarise a data set, but the rows are not being grouped by the variables I specify.

A sample of the data set test2:

  newClientID   Month      newApp          count    app
  100           November    R              51       Other
  100           November    Tableau        58       Other
  100           October     R              12       Other
  100           October     Tableau        212      Other
  100           September   R              72       Other
  100           September   Tableau        74       Other
  100           October     SQL Assistant  11       Other
  100           September   SQL Assistant  396      Other

This should summarise the data:

test3 <- test2 %>%
   group_by(newClientID, Month, app) %>%
   summarise(total = sum(count)) 

The result should look like this:

newClientID Month        app    total
100         November     Other  109
100         October      Other  235
100         September    Other  542

But I get:

newClientID Month        app    total
100         November     Other  109
100         October      Other  224
100         September    Other  146
100         October      Other  11
100         September    Other  396

Why are October and September each split into two rows instead of being combined into one row per Month?

1 answer:

Answer 0 (score: 0)

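Thanks. newClientID contained stray whitespace, so values that looked identical were treated as different groups. I did the following to trim every character column in the data set: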

test2 <- data.frame(
  lapply(test2, function(x) if (is.character(x)) trimws(x) else x),
  stringsAsFactors = FALSE
)
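
For anyone who wants to reproduce the problem, here is a minimal sketch. It rebuilds the sample from the question and assumes the two SQL Assistant rows were the ones carrying a trailing space in newClientID; the post does not say which rows were affected, so that placement is a guess. It also uses `across()`, `where()` and the `.groups` argument, which assumes dplyr 1.0.0 or later, newer than what was available when this was posted. The names `before`, `cleaned` and `after` are made up for the example.

```r
library(dplyr)

# Hypothetical reconstruction of test2. The trailing space in the last two
# newClientID values is an assumption about where the whitespace was.
test2 <- data.frame(
  newClientID = c("100", "100", "100", "100", "100", "100", "100 ", "100 "),
  Month = c("November", "November", "October", "October",
            "September", "September", "October", "September"),
  newApp = c("R", "Tableau", "R", "Tableau", "R", "Tableau",
             "SQL Assistant", "SQL Assistant"),
  count = c(51, 58, 12, 212, 72, 74, 11, 396),
  app = "Other",
  stringsAsFactors = FALSE
)

# Before cleaning: "100" and "100 " are distinct keys, so October and
# September each end up in two groups (five rows in total).
before <- test2 %>%
  group_by(newClientID, Month, app) %>%
  summarise(total = sum(count), .groups = "drop")

# Wrapping the values in brackets makes the stray space visible.
print(unique(paste0("[", test2$newClientID, "]")))

# Trim every character column, then summarise again.
cleaned <- test2 %>%
  mutate(across(where(is.character), trimws))

after <- cleaned %>%
  group_by(newClientID, Month, app) %>%
  summarise(total = sum(count), .groups = "drop")

print(after)
```

With this reconstructed data, `before` has the five rows shown in the question and `after` has the three expected rows. If the columns had been read in as factors rather than character vectors, neither `is.character()` check would match them, so they would need converting with `as.character()` first.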