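This is my first time posting a question here, so please be gentle :)
I have a data frame containing goal and corner statistics from English football (the Premier League), one row per match.
Calling head() on it would give you something like this (made-up data):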
| Home | Home_goals | Away | Away_goals | Home_Corners | Away_Corners |
|------------ |------------ |----------- |------------ |-------------- |-------------- |
| Tottenham | 1 | Arsenal | 0 | 5 | 2 |
| Man United | 2 | Watford | 1 | 7 | 4 |
| Man City | 3 | West Ham | 0 | 10 | 2 |
| Chelsea | 2 | Arsenal | 1 | 7 | 6 |
| Tottenham | 4 | Norwich | 1 | 6 | 0 |
| Man United | 2 | Liverpool | 2 | 4 | 7 |
| Tottenham | 0 | Man City | 2 | 3 | 8 |
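For each entry in the Home column (Tottenham in this example), I want to find the next two rows with the same Home team (rows 5 and 7) and paste their values into new columns on row 1.
I want to do this for every row in the data frame and keep all rows. I only want to add the statistics of the next two matches as new columns:
Home_2
Home_goals_2
Away_2, and so on.
Honestly, I don't even know how to search for this on Google, and from my experience with Stack Overflow I'm sure some of you will solve it within minutes :)
Any help is much appreciated.
Thanks in advance,
Philip
Edit:
I'm not sure whether I can attach things here, but the data frame looks like this: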
premierleague <- data.frame("Home" = c("Tottenham", "ManUnited", "ManCity", "Chelsea", "Tottenham", "ManUnited", "Tottenham"),
"Home_goals" = c(1,2,3,2,4,2,0),
"Away" = c ("Arsenal", "Watford", "Westham", "Arsenal", "Norwich", "Liverpool", "ManCity"),
"Away_goals" = c(0,1,0,1,1,2,2),
"Home_corners" = c(5,7,10,7,6,4,3),
"Away_corners" = c(2,4,2,6,0,7,8))
### The desired result looks like this
premierleague_new <- data.frame(
"Home" = c("Tottenham", "ManUnited", "ManCity", "Chelsea", "Tottenham", "ManUnited", "Tottenham"),
"Home_goals" = c(1,2,3,2,4,2,0),
"Away" = c("Arsenal", "Watford", "Westham", "Arsenal", "Norwich", "Liverpool", "ManCity"),
"Away_goals" = c(0,1,0,1,1,2,2),
"Home_corners" = c(5,7,10,7,6,4,3),
"Away_corners" = c(2,4,2,6,0,7,8),
"Home_goals_2" = c(4,2,NA,NA,0,NA,NA),
"Away_2" = c("Norwich", "Liverpool",NA,NA,"ManCity",NA,NA),
"Away_goal_2" = c(1,2,NA,NA,2,NA,NA),
"Home_corn_2" = c(6,4,NA,NA,3,NA,NA),
"Away_corn_2" = c(0,7,NA,NA,8,NA,NA),
"Home_goal_3" = c(0,NA,NA,NA,NA,NA,NA),
"Away_3" = c("ManCity",NA,NA,NA,NA,NA,NA),
"Away_goal_3" = c(2,NA,NA,NA,NA,NA,NA),
"Home_corners_3" = c(3,NA,NA,NA,NA,NA,NA),
"Away_corners_3" = c(8,NA,NA,NA,NA,NA,NA)
)
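Tottenham is the only team that appears as the home side in three matches, so on the first row all of the new columns are filled in for Tottenham.
Row 5, Tottenham's second entry, only has values in the _2 columns, because in this example only one more Tottenham home match follows it.
I hope this is clearer now. It should at least be reproducible.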
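The row positions described above can be checked with a small base R sketch. It rebuilds only the Home column from the example data; the names `home`, `home_rows` and `later_matches` are made up for illustration.

```r
# Home column copied from the example data frame in the question.
home <- c("Tottenham", "ManUnited", "ManCity", "Chelsea",
          "Tottenham", "ManUnited", "Tottenham")

# Row positions of each team's home matches, in order of appearance.
home_rows <- split(seq_along(home), home)
home_rows$Tottenham   # 1 5 7
home_rows$ManUnited   # 2 6

# For every row, the number of later home matches of the same team.
# This is how many of the "_2" / "_3" column sets can be filled on that row.
later_matches <- ave(seq_along(home), home,
                     FUN = function(i) rev(seq_along(i)) - 1)
later_matches         # 2 1 0 0 1 0 0
```

Row 1 has two later Tottenham home matches, rows 2 and 5 have one each, and every other row has none, which matches the NA pattern in the desired result.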
Answer 0: (score: 1)
We can `group_by` `Home` and use `lead` to get the values from the following rows within each group.
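As a quick illustration of what `lead` does on a single vector, here is a minimal sketch using Tottenham's home goals from the example data. The variable names are only for illustration.

```r
library(dplyr)

# Tottenham's home goals in match order (rows 1, 5 and 7 of the data).
tottenham_goals <- c(1, 4, 0)

# Shift the values up by one position: the next home match.
next_match <- lead(tottenham_goals)
next_match         # 4 0 NA

# Shift the values up by two positions: the home match after next.
match_after_next <- lead(tottenham_goals, 2)
match_after_next   # 0 NA NA
```

Positions with no later value are padded with NA, which is why teams with fewer home matches end up with NA in the new columns.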
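library(dplyr)
premierleague %>%
  # treat each home team's matches as a separate group
  group_by(Home) %>%
  # for every stats column, add the value from the next (_2) and the one after next (_3) home match
  mutate_at(vars(Home_goals:Away_corners), list(`2` = ~lead(.), `3` = ~lead(., 2)))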
# Home Home_goals Away Away_goals Home_corners Away_corners Home_goals_2 Away_2
# <fct> <dbl> <fct> <dbl> <dbl> <dbl> <dbl> <fct>
#1 Tott… 1 Arse… 0 5 2 4 Norwi…
#2 ManU… 2 Watf… 1 7 4 2 Liver…
#3 ManC… 3 West… 0 10 2 NA NA
#4 Chel… 2 Arse… 1 7 6 NA NA
#5 Tott… 4 Norw… 1 6 0 0 ManCi…
#6 ManU… 2 Live… 2 4 7 NA NA
#7 Tott… 0 ManC… 2 3 8 NA NA
# … with 8 more variables: Away_goals_2 <dbl>, Home_corners_2 <dbl>,
# Away_corners_2 <dbl>, Home_goals_3 <dbl>, Away_3 <fct>, Away_goals_3 <dbl>,
# Home_corners_3 <dbl>, Away_corners_3 <dbl>
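In dplyr 1.0.0 and later, `mutate_at` is superseded by `across`. The following is a sketch of the same idea under that assumption, not a replacement for the answer's code. The data frame is repeated so that the block runs on its own, `stringsAsFactors = FALSE` is added so the team names stay character, and `result` is a name chosen here for illustration.

```r
library(dplyr)

# Example data from the question, kept as character rather than factor.
premierleague <- data.frame(
  Home = c("Tottenham", "ManUnited", "ManCity", "Chelsea",
           "Tottenham", "ManUnited", "Tottenham"),
  Home_goals = c(1, 2, 3, 2, 4, 2, 0),
  Away = c("Arsenal", "Watford", "Westham", "Arsenal",
           "Norwich", "Liverpool", "ManCity"),
  Away_goals = c(0, 1, 0, 1, 1, 2, 2),
  Home_corners = c(5, 7, 10, 7, 6, 4, 3),
  Away_corners = c(2, 4, 2, 6, 0, 7, 8),
  stringsAsFactors = FALSE
)

result <- premierleague %>%
  # one group per home team, rows stay in their original order
  group_by(Home) %>%
  # for each selected column, take the value 1 and 2 home matches ahead;
  # the list names become the suffixes "_2" and "_3"
  mutate(across(Home_goals:Away_corners,
                list(`2` = ~lead(.x), `3` = ~lead(.x, 2)))) %>%
  # drop the grouping so later steps work on the whole table
  ungroup()

result$Home_goals_2   # 4 2 NA NA 0 NA NA
result$Away_2         # "Norwich" "Liverpool" NA NA "ManCity" NA NA
```

The new columns carry the same names as in the output above, although their order may differ, since `across` adds the `_2` and `_3` versions of each source column side by side.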