data.frame: conditionally add rows

Time: 2018-12-05 22:59:48

Tags: r dataframe

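I have a data frame punkt_tabelle that contains the points scored in a game. Each game has 2 or 3 sets (3 in the MRE). The data frame records how these points were made. I also have the final score of each set, which is stored in scores. I calculate the sum of points for each team in each set (I did this in total_pts).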

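What I want to achieve is to compare the sum of points in the data table (per team and per set) with the points that team reached according to scores. If the value in scores for a set is greater than the sum computed as total, I want to add an additional row to the data table. This new row should contain the team name and the set, its Skill should be "Other Error", and its Pkt value should be the difference between total and scores. It may be (as is the case in the MRE) that a new row has to be added for every team and every set.

If you re-run the total_pts calculation after adding the new rows, the result should equal the values in scores.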

Based on these questions and articles (R Conditional evaluation when using the pipe operator %>%; Inserting a new row to data frame for each group id), I tried variations of the following code, but could not solve my problem.

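Can this be done this way? Or do I need to use a loop and handle each set and each team manually? Please help!

Edit: the expected output for this example is the table given in the further explanation after the code; rows 25 to 30 are the added ones.

Here is the latest version of my code: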

library(dplyr)
library (devtools)

punkt_tabelle <- structure(list(Team = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 
                 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 
                 1L), .Label = c("Miller/Myer", "Winter/Summer"), class = "factor"), 
                 Skill = structure(c(1L, 1L, 3L, 2L, 2L, 2L, 1L, 1L, 3L, 2L, 
                 2L, 2L, 4L, 4L, 5L, 6L, 6L, 6L, 4L, 4L, 5L, 6L, 6L, 6L), .Label = c("Attack", 
                 "Service", "Block", "Opp. Attack Error", "Opp. Block Error", 
                 "Opp. Serve Error"), class = "factor"), Set = c(2L, 3L, 2L, 
                 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 2L, 3L, 2L, 1L, 2L, 3L, 
                 1L, 2L, 3L, 1L, 2L, 3L), Pkt = c(2L, 1L, 1L, 0L, 0L, 0L, 
                 3L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 
                 0L, 1L, 0L)), row.names = c(NA, -24L), vars = c("Team", "Skill"
                 ), indices = list(0:1, 2L, 18:19, 20L, 21:23, 3:5, 6:7, 8L, 12:13, 
                 14L, 15:17, 9:11), group_sizes = c(2L, 1L, 2L, 1L, 3L, 3L, 
                 2L, 1L, 2L, 1L, 3L, 3L), biggest_group_size = 3L, labels = structure(list(
                 Team = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
                 2L, 2L), .Label = c("Miller/Myer", "Winter/Summer"), class = "factor"), 
                 Skill = c("Attack", "Block", "Opp. Attack Error", "Opp. Block Error", 
                 "Opp. Serve Error", "Service", "Attack", "Block", "Opp. Attack Error", 
                 "Opp. Block Error", "Opp. Serve Error", "Service")), row.names = c(NA, 
                 -12L), class = "data.frame", vars = c("Team", "Skill")), class = c("grouped_df", 
                  "tbl_df", "tbl", "data.frame"))



score_miller_myer <- c(3,6,3) #total points in sets 1, 2, 3
score_winter_summer <- c(5,4,5)
scores <- c(score_miller_myer, score_winter_summer)

#calculate the sum of the points per team and per set
total_pts <- punkt_tabelle %>% group_by(Team, Set) %>% summarize(total = sum(Pkt))
total_pts

#try to compare with the score and add an entry in the dataframe
punkt_tabelle %>% 
  group_by (Team, Set) %>% 
  mutate(total = sum(Pkt)) %>% 
  {if (total<scores) dplyr::bind_rows(Team=Team, Set=Set, Skill="Opp. Other Error", Pkt=(total-scores))}

punkt_tabelle

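Further explanation of the problem: a team scores points in various ways. Either they score themselves (attack, service, block) or the opponent makes an error (attack error, serve error, block error). This still leaves a difference to the total points they reached, because some of the opponent's errors are not specified. To account for it, I want to add an "Other Error" row after calculating the difference.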

Expected output:

   Team          Skill               Set   Pkt
   <fct>         <fct>             <int> <int>
 1 Miller/Myer   Attack                2     2
 2 Miller/Myer   Attack                3     1
 3 Miller/Myer   Block                 2     1
 4 Miller/Myer   Service               1     0
 5 Miller/Myer   Service               2     0
 6 Miller/Myer   Service               3     0
 7 Winter/Summer Attack                1     3
 8 Winter/Summer Attack                2     1
 9 Winter/Summer Block                 3     1
10 Winter/Summer Service               1     0
11 Winter/Summer Service               2     1
12 Winter/Summer Service               3     1
13 Winter/Summer Opp. Attack Error     2     0
14 Winter/Summer Opp. Attack Error     3     0
15 Winter/Summer Opp. Block Error      2     0
16 Winter/Summer Opp. Serve Error      1     0
17 Winter/Summer Opp. Serve Error      2     1
18 Winter/Summer Opp. Serve Error      3     1
19 Miller/Myer   Opp. Attack Error     1     1
20 Miller/Myer   Opp. Attack Error     2     0
21 Miller/Myer   Opp. Block Error      3     0
22 Miller/Myer   Opp. Serve Error      1     0
23 Miller/Myer   Opp. Serve Error      2     1
24 Miller/Myer   Opp. Serve Error      3     0
25 Winter/Summer Opp. Other Error      1     2 #here start the added rows
26 Winter/Summer Opp. Other Error      2     1
27 Winter/Summer Opp. Other Error      3     2
28 Miller/Myer   Opp. Other Error      1     2
29 Miller/Myer   Opp. Other Error      2     2
30 Miller/Myer   Opp. Other Error      3     2

Example: in row 26, Pkt is 1 because, according to total_pts, Winter/Summer has 3 points in set 2, but their score in set 2 according to scores is 4 points. The difference of 1 point is therefore added in the new row.
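The arithmetic behind the six added rows can be checked by hand. The sketch below is only an illustration: the recorded totals are summed manually from the data above, and the names recorded, official and other_error are made up for this example.

```r
# Points recorded in punkt_tabelle, summed by hand per team (rows) and set (columns).
recorded <- matrix(c(1, 4, 1,
                     3, 3, 3),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("Miller/Myer", "Winter/Summer"),
                                   paste("Set", 1:3)))

# Official set scores, i.e. score_miller_myer and score_winter_summer.
official <- matrix(c(3, 6, 3,
                     5, 4, 5),
                   nrow = 2, byrow = TRUE,
                   dimnames = dimnames(recorded))

# The unexplained remainder is what goes into the "Opp. Other Error" rows.
other_error <- official - recorded
other_error["Winter/Summer", "Set 2"]
```

The Winter/Summer entry for set 2 is 4 - 3 = 1, which is the Pkt value in row 26.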

1 Answer:

Answer 0: (score: 0)

Here is one possibility.

  1. First, we need to store scores in a data.frame that also contains the information on Team and Set

    df.scores <- data.frame(
        Team = c(rep("Miller/Myer", 3), rep("Winter/Summer", 3)),
        Set = 1:3,
        scores = scores)
    

    Let's inspect df.scores

    df.scores
    #           Team Set scores
    #1   Miller/Myer   1      3
    #2   Miller/Myer   2      6
    #3   Miller/Myer   3      3
    #4 Winter/Summer   1      5
    #5 Winter/Summer   2      4
    #6 Winter/Summer   3      5
    
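  2. Next, we do a left join of punkt_tabelle and df.scores by Team and Set, and calculate Total = sum(Pkt) by Team and Set; Opp. Other Error is then given by the difference between scores and Total. A long-to-wide-to-long conversion gives the final expected output.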
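The two steps can be sketched roughly as follows. This is not necessarily the code the answer used: it swaps the long-to-wide-to-long conversion for a plain bind_rows, rebuilds the question's data in compact form with character instead of factor columns, and the names other_rows and result are made up for illustration.

```r
library(dplyr)

# Compact rebuild of the question's data (same values as the dput output).
punkt_tabelle <- data.frame(
  Team  = rep(c("Miller/Myer", "Winter/Summer", "Miller/Myer"), times = c(6, 12, 6)),
  Skill = c("Attack", "Attack", "Block", "Service", "Service", "Service",
            "Attack", "Attack", "Block", "Service", "Service", "Service",
            "Opp. Attack Error", "Opp. Attack Error", "Opp. Block Error",
            "Opp. Serve Error", "Opp. Serve Error", "Opp. Serve Error",
            "Opp. Attack Error", "Opp. Attack Error", "Opp. Block Error",
            "Opp. Serve Error", "Opp. Serve Error", "Opp. Serve Error"),
  Set   = c(2L, 3L, 2L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
            2L, 3L, 2L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L),
  Pkt   = c(2L, 1L, 1L, 0L, 0L, 0L, 3L, 1L, 1L, 0L, 1L, 1L,
            0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L),
  stringsAsFactors = FALSE
)

# Step 1: official scores with their Team and Set.
df.scores <- data.frame(
  Team   = rep(c("Miller/Myer", "Winter/Summer"), each = 3),
  Set    = rep(1:3, times = 2),
  scores = c(3, 6, 3, 5, 4, 5),
  stringsAsFactors = FALSE
)

# Step 2: total per team and set, joined to the scores; keep only the
# combinations where the official score exceeds the recorded total.
other_rows <- punkt_tabelle %>%
  group_by(Team, Set) %>%
  summarise(Total = sum(Pkt)) %>%
  ungroup() %>%
  left_join(df.scores, by = c("Team", "Set")) %>%
  filter(scores > Total) %>%
  transmute(Team, Skill = "Opp. Other Error", Set,
            Pkt = as.integer(scores - Total))

# Append the new rows to the original table.
result <- bind_rows(punkt_tabelle, other_rows)
```

With this data every team and set has a positive difference, so six rows are appended. Their order may differ from the expected output in the question, since the summarised rows come out sorted by Team and Set.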
