dplyr summarise with logical conditions

Asked: 2016-04-12 19:04:21

Tags: r conditional dplyr summary

I have the following data frame:

df <- data.frame(Gender = c(rep(c("M","F"),each=4)),
             DiffA=c(1,1,-1,-1,1,1,1,-1),
             DiffB=c(1,-1,1,-1,1,1,1,-1))

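I want to create two new variables that summarise, for each gender, (i) the number of rows where DiffA and DiffB are both positive and (ii) the number of rows where DiffA and DiffB are both negative, so as to obtain: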

df2 <- data.frame(Gender = c("M","F"),
             Diff_Pos=c(1,3),
             Diff_Neg=c(1,1))

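I can't work out how to combine n(), which returns the row count inside dplyr's summarise, with the required logical conditions. Thanks in advance.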

3 answers:

Answer 0 (score: 3)

I would consider doing

library(dplyr)
library(tidyr)
df %>% filter(DiffA == DiffB) %>% count(Gender, DiffA) %>% spread(DiffA, n)

#   Gender    -1     1
#   (fctr) (int) (int)
# 1      F     1     3
# 2      M     1     1

The analogous data.table code is

library(data.table)
dcast(as.data.table(df)[DiffA == DiffB, .N, by = .(Gender, DiffA)], Gender ~ DiffA, value.var = "N")

#    Gender -1 1
# 1:      F  1 3
# 2:      M  1 1

If your real data takes values other than -1 and 1, wrap the relevant columns in sign().
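
As a rough sketch of that sign() suggestion: the data frame df_wide below is made up for illustration (it is not from the question), and the SignA/SignB helper columns are hypothetical names. Rows where either value is exactly zero are dropped, which is an assumption about how zeros should be treated.

```r
library(dplyr)

# Hypothetical data with magnitudes other than 1 (not from the question)
df_wide <- data.frame(Gender = rep(c("M", "F"), each = 4),
                      DiffA = c(2.5, 1, -3, -1, 4, 1, 0.5, -2),
                      DiffB = c(0.1, -1, 7, -6, 1, 2, 3, -9))

res <- df_wide %>%
  # Reduce each column to -1, 0 or 1 so that equality compares signs only
  mutate(SignA = sign(DiffA), SignB = sign(DiffB)) %>%
  # Keep rows where both signs agree; zeros are excluded (an assumption)
  filter(SignA == SignB, SignA != 0) %>%
  count(Gender, SignA)

print(res)
```

From here the counts can be reshaped to wide form with spread(SignA, n) as in the code above.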

Answer 1 (score: 1)

Here is a base R option

 with(subset(df, DiffA==DiffB), table(Gender, DiffA))
 #      DiffA
 #Gender -1 1
 #     F  1 3
 #     M  1 1
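
If the Diff_Pos and Diff_Neg column names from the question are wanted, the table can be unpacked into a data frame. This is only a sketch: it assumes the values are exactly -1 and 1, so that the table columns are named "-1" and "1", and the names tab and out are placeholders.

```r
df <- data.frame(Gender = rep(c("M", "F"), each = 4),
                 DiffA = c(1, 1, -1, -1, 1, 1, 1, -1),
                 DiffB = c(1, -1, 1, -1, 1, 1, 1, -1))

# Same cross-tabulation as above: rows are genders, columns are the shared sign
tab <- with(subset(df, DiffA == DiffB), table(Gender, DiffA))

# Pull each column out by name and drop the table attributes
out <- data.frame(Gender   = rownames(tab),
                  Diff_Pos = as.vector(tab[, "1"]),
                  Diff_Neg = as.vector(tab[, "-1"]))

print(out)
```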

Answer 2 (score: 0)

This should work:

library(dplyr)  # provides the %>% pipe used below

df %>% 
  dplyr::mutate(
    Diff_Pos = DiffA > 0 & DiffB > 0,
    Diff_Neg = DiffA < 0 & DiffB < 0) %>% 
  dplyr::group_by(Gender) %>% 
  dplyr::summarise(
    Diff_Pos = sum(Diff_Pos),
    Diff_Neg = sum(Diff_Neg))
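
The intermediate mutate() step is probably not needed: summing a logical vector counts its TRUE values, so the conditions can go straight into summarise(). A minimal sketch of that variant, using the question's data (res is just a placeholder name):

```r
library(dplyr)

df <- data.frame(Gender = rep(c("M", "F"), each = 4),
                 DiffA = c(1, 1, -1, -1, 1, 1, 1, -1),
                 DiffB = c(1, -1, 1, -1, 1, 1, 1, -1))

res <- df %>%
  group_by(Gender) %>%
  summarise(
    # sum() of a logical vector counts the rows where the condition holds
    Diff_Pos = sum(DiffA > 0 & DiffB > 0),
    Diff_Neg = sum(DiffA < 0 & DiffB < 0))

print(res)
```

This plays the role the question hoped n() would: n() always counts every row in the group, whereas sum() over a condition counts only the rows that satisfy it.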