R: Reformatting sports team lineup data

Time: 2014-07-28 16:05:12

Tags: r reshape melt

I have data in R on a number of sports teams and their starting lineups for each match. An example of my dataset is as follows:

matchdata <- data.frame(
  match_id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2),
  player_name = c("andrew", "david", "james", "steve", "tim",
                  "dan", "john", "phil", "matthew", "simon",
                  "ben", "brian", "evan", "tony", "will",
                  "alister", "archie", "paul", "peter", "warren"),
  played_for = c("team a", "team a", "team a", "team a", "team a",
                 "team b", "team b", "team b", "team b", "team b",
                 "team c", "team c", "team c", "team c", "team c",
                 "team d", "team d", "team d", "team d", "team d"),
  played_against = c("team b", "team b", "team b", "team b", "team b",
                     "team a", "team a", "team a", "team a", "team a",
                     "team d", "team d", "team d", "team d", "team d",
                     "team c", "team c", "team c", "team c", "team c"),
  score_for = c(2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0),
  score_against = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 3, 3, 3, 3, 3)
)

What I am trying to achieve is a separate entry for each player-versus-player matchup on each match day. I would like my output to look like this:

output <- data.frame(match_id = 1, player_name = "andrew", played_against = c("dan",
"john", "phil", "matthew", "simon"), score_for = 2, score_against = 1)

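That way I can analyze and compare performances player against player on a given day, rather than each player against the opposing team as a whole.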

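Edit: I only want to compare players against players on the OPPOSING team. In addition, I only need to compare players against the team they faced in that MATCH_ID. So in this example, each player would have 5 rows (one for each player on the team they played against in that particular match).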

I have some experience with reshape and melt, but in this case I could not produce what I want. Can anyone recommend the best way to do this?

1 Answer:

Answer 0: (score: 1)

Perhaps you are looking for something like this?

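# Player-level columns to carry into the result.
md <- matchdata[c('match_id', 'player_name', 'played_for', 'score_for', 'score_against')]
# Every ordered pair of players in the data.
player.combos <- with(matchdata, expand.grid(player_name=player_name, played_against=player_name))
# Look up the team of each opponent (the player in played_against).
player.combos.teams <- merge(player.combos, md, by.x='played_against', by.y='player_name')[c('player_name', 'played_against', 'played_for')]
# Keep only pairs on different teams. Note: match_id is not compared here, so
# with several matches in the data, players from different matches are paired too.
subset(merge(md, player.combos.teams, by='player_name'), 
    played_for.x != played_for.y, select=c('match_id', 'player_name', 'played_against', 'score_for', 'score_against'))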

# HEAD:
# 
#   match_id player_name played_against score_for score_against
# 2        1      andrew           john         2             1
# 6        1      andrew          simon         2             1
# 7        1      andrew            dan         2             1
# 8        1      andrew        matthew         2             1
# 9        1      andrew           phil         2             1
# 
#   ---  40  rows omitted ---
# 
# TAIL:
#     match_id player_name played_against score_for score_against
# 91         1         tim          simon         2             1
# 95         1         tim           john         2             1
# 96         1         tim            dan         2             1
# 99         1         tim        matthew         2             1
# 100        1         tim           phil         2             1
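
If the data holds more than one match, as in the question's two-match sample, the team-only filter above also pairs players who never met. One possible way to honor the "only opponents in that match_id" requirement is to join each player to the roster of the team they faced, using both match_id and team as the key. This is only a sketch in base R: the names players, opponents, pairs and opponent_name are made up for illustration, and the sample data is rebuilt compactly with rep() so the block runs on its own.

```r
# Same sample data as in the question, written compactly with rep().
matchdata <- data.frame(
  match_id       = rep(c(1, 2), each = 10),
  player_name    = c("andrew", "david", "james", "steve", "tim",
                     "dan", "john", "phil", "matthew", "simon",
                     "ben", "brian", "evan", "tony", "will",
                     "alister", "archie", "paul", "peter", "warren"),
  played_for     = rep(c("team a", "team b", "team c", "team d"), each = 5),
  played_against = rep(c("team b", "team a", "team d", "team c"), each = 5),
  score_for      = rep(c(2, 1, 3, 0), each = 5),
  score_against  = rep(c(1, 2, 0, 3), each = 5),
  stringsAsFactors = FALSE
)

# One row per player; played_against still holds the opposing TEAM here.
players <- matchdata[c("match_id", "player_name", "played_against",
                       "score_for", "score_against")]

# Roster lookup: for each match, which players turned out for which team.
# The team column is named played_against so it can serve as the join key.
opponents <- data.frame(match_id       = matchdata$match_id,
                        played_against = matchdata$played_for,
                        opponent_name  = matchdata$player_name,
                        stringsAsFactors = FALSE)

# Joining on match_id AND team keeps only opponents faced in that match.
pairs <- merge(players, opponents, by = c("match_id", "played_against"))

# Replace the opposing team with the opposing player's name, as in the
# desired output, then tidy up column and row order.
pairs$played_against <- pairs$opponent_name
pairs$opponent_name  <- NULL
pairs <- pairs[order(pairs$match_id, pairs$player_name),
               c("match_id", "player_name", "played_against",
                 "score_for", "score_against")]
rownames(pairs) <- NULL
```

On the sample data this should leave 5 rows per player, which is what the edit to the question asks for. Because the join key includes match_id, the same approach should still hold if a team appears in several matches.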