Using daply with a function of multiple variables

Time: 2016-12-09 20:49:24

Tags: r plyr

I'm still new to R and want to use the *ply functions to extract information from a data frame. A sample input data frame looks like this:

# Construct the dataframe
season <- c("11/12","11/12","11/12","11/12","11/12")
hometeam <- c("Team A","MyTeam","MyTeam","Team D","Team E")
awayteam <- c("MyTeam","Team B","Team C","MyTeam","MyTeam")
score <- c("1 - 1","7 - 1","0 - 0","0 - 2","0 - 1")
stats <- data.frame(season,hometeam,awayteam,score)
print(stats)


  season hometeam awayteam score
1  11/12   Team A   MyTeam 1 - 1
2  11/12   MyTeam   Team B 7 - 1
3  11/12   MyTeam   Team C 0 - 0
4  11/12   Team D   MyTeam 0 - 2
5  11/12   Team E   MyTeam 0 - 1

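What I want to do is extract the opponent of "MyTeam" along with the winner of each match. The score is always given as the home team's score followed by the away team's score. I have a way to extract who the opponent is: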

# Get the opponent to MyTeam; can add this to dataframe later
opponent <- ifelse(stats$hometeam == "MyTeam", stats$awayteam, stats$hometeam)

But I'm having a hard time working out the winner of each match. I tried doing this with daply() and a named function:

library(plyr)   # provides daply() and .()
library(tidyr)  # provides separate()

# Separate out scores for home and away team to determine winner
stats <- separate(stats, score, c('homescore','awayscore'), sep=' - ', remove=TRUE)

# Function for use in ply to get the winner of a match
determineWinner <- function(homescore, awayscore, hometeam) {
  homewon <- FALSE
  if ( homescore < awayscore) {
    homewon <- FALSE
  } else if ( homescore > awayscore ) { 
    homewon <- TRUE
  } else {
    return("tie")
  }
  if ( hometeam == "MyTeam" ) { 
    ifelse(homewon, return("won"), return("lost"))
  } else {
    ifelse(homewon, return("lost"), return("won"))
  }
}#end of function

winner <- daply(stats, .(homescore,awayscore,hometeam), determineWinner(stats$homescore, stats$awayscore, stats$hometeam) )

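However, this obviously doesn't work. Am I applying daply() incorrectly? I think I'm still unsure how the *ply functions really behave. It seems like a *ply function is the way to go here, but if there are other solutions, I'm all ears. Any help is greatly appreciated!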

1 answer:

Answer 0: (score: 3)

Your logic can be implemented with nested ifelse calls:

winner <- ifelse(stats$homescore > stats$awayscore,
             ifelse(stats$hometeam == "MyTeam","won","lost"),
             ifelse(stats$homescore < stats$awayscore,
                    ifelse(stats$hometeam == "MyTeam","lost","won"),
                    "tie"))
##[1] "tie" "won" "tie" "won" "won"
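
As for why the daply() attempt fails: the third argument to daply() should be a function, but the call in the question passes the result of determineWinner() evaluated once on the entire columns, and if () expects a single TRUE/FALSE rather than a whole vector. If you would rather keep a named function, one option is to write it for a single match and let mapply() call it row by row. The following is only a sketch in base R, not the asker's original code: it rebuilds the example data, splits the score with strsplit() instead of separate(), and converts the scores to numbers (an added assumption, so that a score such as 10 is not compared as the string "10").

```r
# Rebuild the example data (base R only; no plyr or tidyr needed).
stats <- data.frame(
  season   = rep("11/12", 5),
  hometeam = c("Team A", "MyTeam", "MyTeam", "Team D", "Team E"),
  awayteam = c("MyTeam", "Team B", "Team C", "MyTeam", "MyTeam"),
  score    = c("1 - 1", "7 - 1", "0 - 0", "0 - 2", "0 - 1"),
  stringsAsFactors = FALSE  # keep team names as character, not factor codes
)

# Split "home - away" and convert both halves to numbers.
parts <- strsplit(stats$score, " - ", fixed = TRUE)
stats$homescore <- as.numeric(sapply(parts, `[`, 1))
stats$awayscore <- as.numeric(sapply(parts, `[`, 2))

# Scalar helper: takes one match and returns one result for MyTeam.
determineWinner <- function(homescore, awayscore, hometeam) {
  if (homescore == awayscore) return("tie")
  homewon <- homescore > awayscore
  myTeamIsHome <- hometeam == "MyTeam"
  # MyTeam won if it was at home and home won, or away and home lost.
  if (homewon == myTeamIsHome) "won" else "lost"
}

# mapply() calls the helper once per row, pairing up the three columns.
stats$result <- mapply(determineWinner,
                       stats$homescore, stats$awayscore, stats$hometeam)

# Opponent column, as in the question.
stats$opponent <- ifelse(stats$hometeam == "MyTeam",
                         stats$awayteam, stats$hometeam)

print(stats[, c("opponent", "result")])
```

For this data the result column matches the nested ifelse output above. The nested ifelse version is still the more idiomatic choice because it is vectorised; the mapply() version is mainly useful when the per-match logic gets too involved to nest comfortably.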