Is there a way to read/import data values from one df into another df based on the observation name?

Time: 2019-05-07 12:16:01

Tags: r

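I have a df with Premier League team ratings, and another df with the schedule for the whole season. I would like to attach each team's rating from the ratings df to the schedule, so that I can work out the probabilities for each match. The next step will be to simulate the entire season.

I tried writing an if statement to match the strings in df_1 against df_2, but I don't think I'm on the right track.

I'm sure this is low-level coding for most people, and I appreciate your help. I did try on my own before coming here. Thank you sincerely.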

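library(tibble)

vec_1 <- c("team_a", "team_b", "team_c")
vec_2 <- c(1.7, 1.2, 0.8)
vec_3 <- c("team_d", "team_e", "team_f")
vec_4 <- c(0.3, 0.5, 0.4)

# df_1 ratings df
# (ratings for all six teams, so the away teams can be matched as well;
#  tibble() replaces the deprecated data_frame())

df_1 <- tibble(team = c(vec_1, vec_3), rating = c(vec_2, vec_4))

  team   rating
  <chr>   <dbl>
1 team_a    1.7
2 team_b    1.2
3 team_c    0.8
4 team_d    0.3
5 team_e    0.5
6 team_f    0.4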

# df_2 schedule df

df_2 <- tibble(home_tm = vec_1, away_tm = vec_3)

  home_tm away_tm
  <chr>   <chr>  
1 team_a  team_d 
2 team_b  team_e 
3 team_c  team_f 

Desired result:

  home_tm away_tm home_tm_rat away_tm_rat
  <chr>   <chr>         <dbl>         <dbl>
1 team_a  team_d          1.7           0.3
2 team_b  team_e          1.2           0.5
3 team_c  team_f          0.8           0.4
......
......
......

2 Answers:

Answer 0 (score: 1)

As mentioned above, you can use the join functions from dplyr:

library(dplyr)

df_2 %>%
  left_join(df_1, by = c('home_tm' = 'team')) %>%
  rename(home_tm_rat = rating) %>%
  left_join(df_1, by = c('away_tm' = 'team')) %>%
  rename(away_tm_rat = rating)

# A tibble: 3 x 4
  home_tm away_tm home_tm_rat away_tm_rat
  <chr>   <chr>         <dbl>       <dbl>
1 team_a  team_d          1.7         0.3
2 team_b  team_e          1.2         0.5
3 team_c  team_f          0.8         0.4
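
For comparison, the same lookup by team name can be sketched in base R with match(), without dplyr. This is only an illustration: the objects ratings, schedule and missing_teams are hypothetical names, and the ratings table is assumed to contain all six teams, as the desired result in the question implies.

```r
# Hypothetical stand-ins for the question's data; ratings covers all six teams (assumption)
ratings <- data.frame(
  team   = c("team_a", "team_b", "team_c", "team_d", "team_e", "team_f"),
  rating = c(1.7, 1.2, 0.8, 0.3, 0.5, 0.4),
  stringsAsFactors = FALSE
)
schedule <- data.frame(
  home_tm = c("team_a", "team_b", "team_c"),
  away_tm = c("team_d", "team_e", "team_f"),
  stringsAsFactors = FALSE
)

# match() gives, for each scheduled team, its row position in ratings$team
# (NA if the team has no rating), which is then used to index the rating column
schedule$home_tm_rat <- ratings$rating[match(schedule$home_tm, ratings$team)]
schedule$away_tm_rat <- ratings$rating[match(schedule$away_tm, ratings$team)]

# Scheduled teams that have no entry in the ratings table (ideally none)
missing_teams <- setdiff(c(schedule$home_tm, schedule$away_tm), ratings$team)
```

Checking missing_teams before computing probabilities is a cheap way to catch spelling differences between the two tables, since an unmatched name otherwise just turns into a silent NA.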

Answer 1 (score: 0)

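Similar to @liuminzhao's answer, but I would also suggest thinking a little about your data structure. If you put all the teams in df_2 into a single column, with a separate column indicating whether each one is playing home or away, things become easier. It is worth reading up on tidy data.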

library(tidyverse)

df_2 %>% 
  # gather the two team columns into a single column, with a second column indicating home/away
  gather(key = HomeAway, value = team) %>%
  # join the team ratings by team name
  left_join(df_1, by = "team")


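With ratings for all six teams in df_1 (as the desired result in the question assumes), this gives:

# A tibble: 6 x 3
  HomeAway team   rating
  <chr>    <chr>   <dbl>
1 home_tm  team_a    1.7
2 home_tm  team_b    1.2
3 home_tm  team_c    0.8
4 away_tm  team_d    0.3
5 away_tm  team_e    0.5
6 away_tm  team_f    0.4

Any team that has no row in df_1 comes back with an NA rating.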
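
One thing the long layout loses is which home team plays which away team. A possible sketch that keeps this information adds a match identifier before reshaping. It uses pivot_longer() and pivot_wider(), the newer tidyr functions that supersede gather() and spread(). The names ratings, schedule, match_id, long and wide are hypothetical, and the ratings table is again assumed to cover all six teams.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-ins for the question's data; ratings covers all six teams (assumption)
ratings <- tibble(
  team   = c("team_a", "team_b", "team_c", "team_d", "team_e", "team_f"),
  rating = c(1.7, 1.2, 0.8, 0.3, 0.5, 0.4)
)
schedule <- tibble(
  home_tm = c("team_a", "team_b", "team_c"),
  away_tm = c("team_d", "team_e", "team_f")
)

long <- schedule %>%
  # number the fixtures so each home/away pair stays linked after reshaping
  mutate(match_id = row_number()) %>%
  # one row per team per match, with HomeAway recording the original column
  pivot_longer(cols = c(home_tm, away_tm),
               names_to = "HomeAway", values_to = "team") %>%
  # attach each team's rating by name
  left_join(ratings, by = "team")

# Back to one row per match, with team and rating columns for each side
wide <- long %>%
  pivot_wider(id_cols = match_id,
              names_from = HomeAway,
              values_from = c(team, rating))
```

Here wide has the columns match_id, team_home_tm, team_away_tm, rating_home_tm and rating_away_tm, so per-match calculations can work on one row per fixture while long stays convenient for per-team summaries.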