Subset rows based on a column's values

Date: 2019-05-24 01:13:11

Tags: r dplyr

I need to subset rows based on filtered values of one column and group them by another column.

    Bowler               dismissal_kind      
    F du Plessis          stumped   
    MJ McClenaghan        run out   
    F du Plessis          bowled
    HH pandya              lbw
    HH pandya             bowled
    F du Plessis          caught
    F du Plessis          run out
    JJ Bumrah             caught 
    DL Chahar

I tried using max and count, but that did not work. Here dismissal_kind is a character variable.

innings%>%
summarise(wickets =  max(count(dismissal_kind %in% c("stumped", 
"bowled", "lbw","caught"))))%>%
group_by(bowler)%>%
arrange(desc(wickets))%>%
top_n(10)

I want to group by bowler and count only the filtered rows. I want something like this:

bowler              dismissal_kind
F du Plessis         3
HH pandya            2
JJ Bumrah            1

How can I achieve this result? I am not able to sum this character variable. Is there a workaround to get the expected result?

1 Answer:

Answer 0 (score: 0)

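You can sum the occurrences of TRUE in the expression dismissal_kind %in% c("stumped", "bowled", "lbw","caught"). The %in% test returns a logical vector, and sum() counts each TRUE as 1, so there is no need to sum the character column itself: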

tb %>% 
  group_by(Bowler) %>% 
  summarise(Count_Wickets = sum(dismissal_kind %in% c("stumped", 
                                      "bowled", "lbw","caught"))) %>% 
  arrange(desc(Count_Wickets))

# A tibble: 5 x 2
  Bowler         Count_Wickets
  <chr>                  <int>
1 F du Plessis               3
2 HH pandya                  2
3 JJ Bumrah                  1
4 DL Chahar                  0
5 MJ McClenaghan             0
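
To see why summing works here, a minimal base R sketch may help. The vector `kinds` is made up for illustration and is not part of the original data; it only shows that `%in%` yields TRUE/FALSE (with NA treated as no match) and that `sum()` counts the TRUE values.

```r
# Hypothetical sample values, including a missing one
kinds <- c("stumped", "run out", "bowled", NA)

# %in% returns a logical vector; NA does not match, so it becomes FALSE
is_wicket <- kinds %in% c("stumped", "bowled", "lbw", "caught")
print(is_wicket)

# sum() treats TRUE as 1 and FALSE as 0, so this counts the matches
n_wickets <- sum(is_wicket)
print(n_wickets)
```

This is also why DL Chahar, whose dismissal_kind is NA, ends up with a count of 0 rather than NA.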

Data:

tibble::tribble(
~Bowler, ~dismissal_kind,      
"F du Plessis", "stumped",   
"MJ McClenaghan", "run out",   
"F du Plessis", "bowled",
"HH pandya", "lbw",
"HH pandya", "bowled",
"F du Plessis", "caught",
"F du Plessis", "run out",
"JJ Bumrah", "caught",
"DL Chahar", NA    
) -> tb
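
If bowlers with zero wickets should be dropped, as in the expected output in the question, one possible alternative is to filter first and then count. This is a sketch rather than part of the answer above: the names `wicket_kinds` and `wickets` are introduced here for illustration, and the `name` argument of `count()` assumes dplyr 0.8.0 or later.

```r
library(dplyr)

# Same data as above, repeated so the example runs on its own
tb <- tibble::tribble(
  ~Bowler,          ~dismissal_kind,
  "F du Plessis",   "stumped",
  "MJ McClenaghan", "run out",
  "F du Plessis",   "bowled",
  "HH pandya",      "lbw",
  "HH pandya",      "bowled",
  "F du Plessis",   "caught",
  "F du Plessis",   "run out",
  "JJ Bumrah",      "caught",
  "DL Chahar",      NA
)

# Dismissal kinds that count as a wicket (taken from the question)
wicket_kinds <- c("stumped", "bowled", "lbw", "caught")

wickets <- tb %>%
  filter(dismissal_kind %in% wicket_kinds) %>%       # keep only qualifying rows
  count(Bowler, name = "Count_Wickets", sort = TRUE) # rows per bowler, descending

print(wickets)
```

Because the rows are filtered before counting, bowlers with no qualifying dismissals (here DL Chahar and MJ McClenaghan) do not appear in the result at all.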