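R: running count based on a different column

Time: 2018-03-12 02:08:31

Tags: r cumsum

I want a running count of how many times a value in ColumnA has previously appeared in ColumnB. Ideally, the count could also be computed within subsets defined by ColumnC.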

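For example, given a data frame of match results with year, winner, and loser columns, I want a running total of the winner's previous losses, or of the loser's previous wins.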

I want the output to be my original data frame with four new columns: winner_cum_wins, winner_cum_losses, loser_cum_wins, and loser_cum_losses.

2 answers:

Answer 0: (score: 2)

This should give you all the data you need:

library(tidyverse)
df %>% 
    group_by(year) %>% 
    mutate(match_id_year = row_number()) %>% 
    gather(outcome, name, -year, -match_id_year) %>% 
    arrange(year, match_id_year) %>% 
    group_by(year, name) %>% 
    mutate(cum_wins_year = cumsum(outcome == "winner"),
           cum_losses_year = cumsum(outcome == "loser"))
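The pipeline above returns a long data frame with one row per player per match, whereas the question asks for four columns on the original rows. The following is a minimal sketch of one way to reshape it, assuming tidyr 1.0.0 or later for pivot_longer() and pivot_wider(). The names long and wide are illustrative, and the example data is the df from the other answer. The cumulative sums include the current match, so on a winner's row the loss count is the number of earlier losses, and on a loser's row the win count is the number of earlier wins.

```r
library(dplyr)
library(tidyr)

# Example data (same values as the df defined in the other answer)
df <- data.frame(
  year   = c(2017, 2017, 2017, 2017, 2017, 2016, 2016, 2016, 2016, 2016),
  winner = c("sam", "ryan", "sally", "sally", "ryan", "sally", "mike", "ryan", "mike", "sam"),
  loser  = c("mike", "mike", "ryan", "sam", "sam", "mike", "sally", "mike", "ryan", "sally"),
  stringsAsFactors = FALSE
)

# Long format: one row per player per match, with running counts per year and player
long <- df %>%
  group_by(year) %>%
  mutate(match_id_year = row_number()) %>%
  ungroup() %>%
  pivot_longer(c(winner, loser), names_to = "outcome", values_to = "name") %>%
  arrange(year, match_id_year) %>%
  group_by(year, name) %>%
  mutate(
    cum_wins   = cumsum(outcome == "winner"),
    cum_losses = cumsum(outcome == "loser")
  ) %>%
  ungroup()

# Back to one row per match, with winner_* and loser_* columns
wide <- long %>%
  pivot_wider(
    id_cols     = c(year, match_id_year),
    names_from  = outcome,
    values_from = c(name, cum_wins, cum_losses),
    names_glue  = "{outcome}_{.value}"
  ) %>%
  rename(winner = winner_name, loser = loser_name)

wide
```

The rows of wide are ordered by year and then by match order within the year, so 2016 comes before 2017.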

Answer 1: (score: 0)

library(dplyr)

year <- c(2017, 2017, 2017, 2017, 2017, 2016, 2016, 2016, 2016, 2016)
winner <- c('sam', 'ryan', 'sally', 'sally', 'ryan', 'sally', 'mike', 'ryan', 'mike', 'sam')
loser <- c('mike', 'mike', 'ryan', 'sam', 'sam', 'mike', 'sally', 'mike', 'ryan', 'sally')
# keep names as character so winner and loser can be compared directly
df <- data.frame(year, winner, loser, stringsAsFactors = FALSE)

# successful methods for getting the winner's cumulative wins or the loser's cumulative losses by year
df <- df %>% group_by(year, winner) %>% mutate(winner_wins = row_number())
df <- df %>% group_by(year, loser) %>% mutate(loser_losses = row_number())

I created the following function, which counts how many times each element of x has appeared in y up to that position.

# For each position i, count how many of y[1], ..., y[i] equal x[i]
count_wins_losses <- function(x, y) {
  n <- length(x)
  counts <- numeric(n)
  for (i in seq_len(n)) {
    counts2 <- numeric(i)
    for (j in seq_len(i)) {
      counts2[j] <- sum(x[i] == y[j])
    }
    counts[i] <- sum(counts2)
  }
  return(counts)
}
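As a quick check of what the function computes, here is a small sketch of the same count written with vapply() instead of a nested loop. The name count_prior_matches and the toy vectors are hypothetical and not part of the original answer. For position i it counts how many of the first i elements of y equal x[i]; since a player cannot be both winner and loser of the same match, this equals the number of earlier appearances.

```r
# Hypothetical helper (not from the original answer): same counts as the
# nested loop, computed one position at a time with vapply().
count_prior_matches <- function(x, y) {
  vapply(
    seq_along(x),
    function(i) sum(y[seq_len(i)] == x[i]),
    numeric(1)
  )
}

# Toy input: "a" and "b" alternate between the two vectors
x <- c("a", "b", "a")
y <- c("b", "a", "b")
count_prior_matches(x, y)
# [1] 0 1 1
```

Unlike a loop over 1:n, seq_along() also handles zero-length input, returning an empty numeric vector.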

I use split to apply the function to each year.

# count the cumulative wins of the losers
loser_cum_wins <- df %>%
  split(.$year) %>%
  lapply(., function(x) count_wins_losses(x$loser, x$winner)) %>%
  unlist()

# count the cumulative losses of the winners
winner_cum_losses <- df %>%
  split(.$year) %>%
  lapply(., function(x) count_wins_losses(x$winner, x$loser)) %>%
  unlist()

The arrange is done here so that the order of years in df matches the order in loser_cum_wins and winner_cum_losses, which split returns sorted by year.

df <- arrange(df, year)
df$loser_cum_wins <- loser_cum_wins
df$winner_cum_losses <- winner_cum_losses
df

## A tibble: 10 x 7
## Groups:   year, loser [6]
#    year winner loser winner_wins loser_losses loser_cum_wins winner_cum_losses
#   <dbl> <chr>  <chr>       <int>        <int>          <dbl>             <dbl>
# 1 2016. sally  mike            1            1             0.                0.
# 2 2016. mike   sally           1            1             1.                1.
# 3 2016. ryan   mike            1            2             1.                0.
# 4 2016. mike   ryan            2            1             1.                2.
# 5 2016. sam    sally           1            2             1.                0.
# 6 2017. sam    mike            1            1             0.                0.
# 7 2017. ryan   mike            1            2             0.                0.
# 8 2017. sally  ryan            1            1             1.                0.
# 9 2017. sally  sam             2            1             1.                0.
#10 2017. ryan   sam             2            2             1.                1.

Another way to use the count_wins_losses() function is to filter df by year, apply the function to each subset, and then combine the results.

df2016 <- df %>%
  filter(year == 2016)
df2017 <- df %>%
  filter(year == 2017)

df2016$loser_cum_wins <- with(df2016, count_wins_losses(loser, winner))
df2016$winner_cum_losses <- with(df2016, count_wins_losses(winner, loser))
df2017$loser_cum_wins <- with(df2017, count_wins_losses(loser, winner))
df2017$winner_cum_losses <- with(df2017, count_wins_losses(winner, loser))
rbind(df2016,df2017)
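
Filtering one year at a time needs a new block of code for every year in the data. A possible alternative, sketched here under the assumption that dplyr is available, is to let group_by(year) do the splitting and call the function inside mutate(). The function is redefined in compact form so the example is self-contained, and the name result is illustrative.

```r
library(dplyr)

# Compact restatement of the counting function from this answer
count_wins_losses <- function(x, y) {
  vapply(seq_along(x), function(i) sum(y[seq_len(i)] == x[i]), numeric(1))
}

df <- data.frame(
  year   = c(2017, 2017, 2017, 2017, 2017, 2016, 2016, 2016, 2016, 2016),
  winner = c("sam", "ryan", "sally", "sally", "ryan", "sally", "mike", "ryan", "mike", "sam"),
  loser  = c("mike", "mike", "ryan", "sam", "sam", "mike", "sally", "mike", "ryan", "sally"),
  stringsAsFactors = FALSE
)

# The function runs once per year; rows keep their original order
result <- df %>%
  group_by(year) %>%
  mutate(
    loser_cum_wins    = count_wins_losses(loser, winner),
    winner_cum_losses = count_wins_losses(winner, loser)
  ) %>%
  ungroup()

result
```

Because mutate() keeps the row order of the input, no arrange step is needed to line the counts up with the original rows.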