Winner within pairs; or a vector-valued group_by mutate?

Time: 2018-09-26 19:59:14

Tags: r dplyr

I'm trying to work out which unit within each pair is the "winner". group_by() %>% mutate() is close to the right thing, but not quite there. In particular,

dat %>% group_by(pair) %>% mutate(winner = ifelse(score[1] > score[2], c(1, 0), c(0, 1))) does not work.
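A likely reason, sketched here rather than taken from the original post: ifelse() returns a result shaped like its test, and the test score[1] > score[2] has length 1, so only the first element of c(1, 0) or c(0, 1) comes back and both units in the pair end up with the same value. A plain if/else returns the whole vector. The scores below are copied from the printed output further down so the example does not depend on the random seed.

```r
library(dplyr)

# ifelse() is vectorised over `test`: a length-1 test gives a length-1 result
ifelse(TRUE, c(1, 0), c(0, 1))
#> [1] 1

# scores copied from the question's printed output (illustrative data)
dat <- tibble(pair  = rep(1:3, each = 2),
              unit  = rep(1:2, 3),
              score = c(-1.40, 0.523, 0.142, -0.847, -0.412, -1.47))

# if/else is not vectorised, so it hands back the full length-2 vector
res <- dat %>%
  group_by(pair) %>%
  mutate(winner = if (score[1] > score[2]) c(1, 0) else c(0, 1)) %>%
  ungroup()
res$winner
#> [1] 0 1 1 0 1 0
```

This assumes exactly two rows per pair, in unit order.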

The following does work, but the intermediate summary data frame is clunky. Can we improve on this?

library(tidyverse)
set.seed(343)
# units within pairs get scores
dat <-
  data_frame(pair = rep(1:3, each = 2),
             unit = rep(1:2, 3),
             score = rnorm(6))

# figure out who won in each pair
summary_df <- 
  dat %>%
  group_by(pair) %>%
  summarize(winner = which.max(score))

# merge back and determine whether each unit won
dat <- 
  left_join(dat, summary_df, "pair") %>%
  mutate(won = as.numeric(winner == unit))
dat
#> # A tibble: 6 x 5
#>    pair  unit  score winner   won
#>   <int> <int>  <dbl>  <int> <dbl>
#> 1     1     1 -1.40       2     0
#> 2     1     2  0.523      2     1
#> 3     2     1  0.142      1     1
#> 4     2     2 -0.847      1     0
#> 5     3     1 -0.412      1     1
#> 6     3     2 -1.47       1     0

Created on 2018-09-26 by the reprex package (v0.2.0).

Possibly related to "Weird group_by + mutate + which.max behavior".

2 answers:

Answer 0: (score: 2)

You can do this:

dat %>% 
  group_by(pair) %>% 
  # won: TRUE for the unit holding the pair's highest score
  # winner: id of that unit, recycled across the pair
  mutate(won = score == max(score),
         winner = unit[won])
#> # A tibble: 6 x 5
#> # Groups:   pair [3]
#>    pair  unit  score won   winner
#>   <int> <int>  <dbl> <lgl>  <int>
#> 1     1     1 -1.40  FALSE      2
#> 2     1     2  0.523 TRUE       2
#> 3     2     1  0.142 TRUE       1
#> 4     2     2 -0.847 FALSE      1
#> 5     3     1 -0.412 TRUE       1
#> 6     3     2 -1.47  FALSE      1
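If the numeric 0/1 won column from the question is wanted, a variant along the same lines might look like the sketch below. It is not part of the original answer; it uses which.max(), which returns the first maximum when scores tie, and the scores are again copied from the printed output rather than regenerated.

```r
library(dplyr)

# scores copied from the printed output above (illustrative data)
dat <- tibble(pair  = rep(1:3, each = 2),
              unit  = rep(1:2, 3),
              score = c(-1.40, 0.523, 0.142, -0.847, -0.412, -1.47))

res <- dat %>%
  group_by(pair) %>%
  mutate(winner = unit[which.max(score)],     # length 1, recycled within the pair
         won    = as.numeric(unit == winner)) %>%
  ungroup()

res$winner
#> [1] 2 2 1 1 1 1
res$won
#> [1] 0 1 1 0 1 0
```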

Answer 1: (score: 1)

Using rank:

dat %>% group_by(pair) %>% mutate(won = rank(score) - 1)
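To see why this works, here is a small base R check on two of the pairs from the question's output. Within a pair of distinct scores, rank() gives 1 to the lower score and 2 to the higher, so subtracting 1 yields 0 and 1. With the default ties.method = "average", a tied pair would get 0.5 for both units, which may or may not be what you want.

```r
# pair 1: the second unit has the higher score
r1 <- rank(c(-1.40, 0.523)) - 1
r1
#> [1] 0 1

# pair 2: the first unit has the higher score
r2 <- rank(c(0.142, -0.847)) - 1
r2
#> [1] 1 0

# a tie: both units share the average rank 1.5
r_tie <- rank(c(1, 1)) - 1
r_tie
#> [1] 0.5 0.5
```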

For more fun (and slightly more speed), use the result of the comparison (score[1] > score[2]) to index into a vector of "winning options":

dat %>% group_by(pair) %>%
  mutate(won = c(0, 1, 0)[1:2 + (score[1] > score[2])])
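The indexing trick can be checked in isolation. The comparison is coerced to 0 or 1 when added to 1:2, which shifts the window over c(0, 1, 0) by one position. The name lookup below is only for illustration, and the trick assumes exactly two units per pair.

```r
lookup <- c(0, 1, 0)

# first unit loses: 1:2 + FALSE is 1 2, selecting the first two elements
lose_first <- lookup[1:2 + FALSE]
lose_first
#> [1] 0 1

# first unit wins: 1:2 + TRUE is 2 3, selecting the last two elements
win_first <- lookup[1:2 + TRUE]
win_first
#> [1] 1 0
```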