Loop through a df and store the values in a new df

Date: 2015-12-03 10:05:10

Tags: r

I have the following df:

 COMPETITION               TEAM1             TEAM2 pointsH pointsA     DATUM
1 Premier League   Manchester United      Swansea City       0       1 16-8-2014
2 Premier League Queens Park Rangers         Hull City       0       1 16-8-2014
3 Premier League          Stoke City       Aston Villa       0       1 16-8-2014

What I want to do is create a new df that contains the team name, the date, and whether the team won. So I tried:

rateclub <- function(df, club) {

df_m <- data.frame(Win=character(), 
               date=character(), 
               stringsAsFactors=FALSE)

     df_m$win <- ifelse(((df$TEAM1 == club && df$pointsH == 1)|| (df$TEAM2 == club && df$pointsA == 1)) , "W", "L") 
    df_matches$DATE <- df$DATUM
}

But this gives me:

  Error in `$<-.data.frame`(`*tmp*`, "win", value = "L") : 
  replacement has 1 row, data has 0 

My expected output would be:

"Manchester United", "L", 16-8-2014

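2 answers:

Answer 0 (score: 1)

The error is caused by the definition of the data.frame df_m, which the OP creates with zero rows: a column of length 1 cannot be assigned to a data.frame that has no rows. Unless the code is changed substantially, the required number of rows has to be specified at the start (which is also the better approach). In the code below, the relevant rows are stored in df_rows, and the data.frame df_m is initialized with the corresponding number of rows. Finally, the dates in df_m are simply taken from df.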

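rateclub <- function(df, club) {
  # Rows in which the club played, either at home (TEAM1) or away (TEAM2)
  df_rows <- which(df$TEAM1 == club | df$TEAM2 == club)
  # Pre-allocate the result with one row per match found
  df_m <- data.frame(matrix(nrow = length(df_rows), ncol = 3), stringsAsFactors = FALSE)
  colnames(df_m) <- c("team", "win", "date")
  # rep() keeps the assignment valid even when the club has no matches
  df_m$team <- rep(club, length(df_rows))
  # Element-wise & and | compare every selected row, unlike && and ||
  df_m$win <- ifelse((df$TEAM1[df_rows] == club & df$pointsH[df_rows] == 1) |
                       (df$TEAM2[df_rows] == club & df$pointsA[df_rows] == 1), "W", "L")
  df_m$date <- df$DATUM[df_rows]
  return(df_m)
}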

Which yields:

> rateclub(df, "Manchester United")
#               team win      date
#1 Manchester United   L 16-8-2014
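
The two points above (the zero-row data.frame and the element-wise operators) can also be checked in isolation. The following is only a sketch: the sample data is rebuilt inline so the block runs on its own, and the function name rateclub2 is hypothetical, not part of the question. It first reproduces the error on an empty data.frame, then builds the result in one step from logical vectors, so no pre-allocation is needed.

```r
# Sample data, rebuilt here so the block is self-contained.
df <- data.frame(
  COMPETITION = "Premier League",
  TEAM1 = c("Manchester United", "Queens Park Rangers", "Stoke City"),
  TEAM2 = c("Swansea City", "Hull City", "Aston Villa"),
  pointsH = c(0, 0, 0),
  pointsA = c(1, 1, 1),
  DATUM = "16-8-2014",
  stringsAsFactors = FALSE
)

# Reproduce the error: a zero-row data.frame cannot take a length-1 column.
empty <- data.frame(win = character(), stringsAsFactors = FALSE)
msg <- tryCatch({
  empty$win <- "L"
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Hypothetical alternative: compute everything as vectors, then build
# the data.frame once with exactly as many rows as there are matches.
rateclub2 <- function(df, club) {
  home <- df$TEAM1 == club
  away <- df$TEAM2 == club
  played <- home | away
  # & and | work element by element; && and || are meant for single values
  won <- (home & df$pointsH == 1) | (away & df$pointsA == 1)
  data.frame(
    team = rep(club, sum(played)),
    win = ifelse(won[played], "W", "L"),
    date = df$DATUM[played],
    stringsAsFactors = FALSE
  )
}

rateclub2(df, "Manchester United")
```

Because the rows are selected with a logical vector, a club that never appears simply produces a data.frame with zero rows instead of an error.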

Hope this helps.

Data

text <- " COMPETITION               TEAM1             TEAM2 pointsH pointsA     DATUM
'Premier League'   'Manchester United'      'Swansea City'       0       1 16-8-2014
'Premier League' 'Queens Park Rangers'         'Hull City'       0       1 16-8-2014
'Premier League'          'Stoke City'       'Aston Villa'       0       1 16-8-2014"
df <- read.table(text=text, header=TRUE)

Answer 1 (score: 0)

A dplyr version, which is cleaner (at least to me ;) and transforms the whole table at once:

text <- " COMPETITION               TEAM1             TEAM2 pointsH pointsA     DATUM
'Premier League'   'Manchester United'      'Swansea City'       0       1 16-8-2014
'Premier League' 'Queens Park Rangers'         'Hull City'       0       1 16-8-2014
'Premier League'          'Stoke City'       'Aston Villa'       0       1 16-8-2014"
df <- read.table(text=text, header=TRUE)

library(dplyr)
library(tidyr)
df %>%
  gather(where, team, TEAM1, TEAM2) %>%
  mutate(won = (where == "TEAM1" & pointsH == 1) | (where == "TEAM2" & pointsA == 1) ) %>%
  select(-starts_with("points"))

This gives (the where values could be recoded to home/away, but I did not want to clutter the answer):

     COMPETITION     DATUM where                team   won
1 Premier League 16-8-2014 TEAM1   Manchester United FALSE
2 Premier League 16-8-2014 TEAM1 Queens Park Rangers FALSE
3 Premier League 16-8-2014 TEAM1          Stoke City FALSE
4 Premier League 16-8-2014 TEAM2        Swansea City  TRUE
5 Premier League 16-8-2014 TEAM2           Hull City  TRUE
6 Premier League 16-8-2014 TEAM2         Aston Villa  TRUE
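
The home/away recoding mentioned above might look like the following. This is a sketch under assumptions: it uses pivot_longer(), which current tidyr versions recommend in place of gather(), and the labels "home" and "away" are illustrative choices rather than anything from the original answer. Note that pivot_longer() orders the rows match by match, so the row order differs from the gather() output shown above.

```r
library(dplyr)
library(tidyr)

# Sample data, rebuilt here so the block is self-contained.
df <- data.frame(
  COMPETITION = "Premier League",
  TEAM1 = c("Manchester United", "Queens Park Rangers", "Stoke City"),
  TEAM2 = c("Swansea City", "Hull City", "Aston Villa"),
  pointsH = c(0, 0, 0),
  pointsA = c(1, 1, 1),
  DATUM = "16-8-2014",
  stringsAsFactors = FALSE
)

result <- df %>%
  # One row per team and match; `where` records the source column
  pivot_longer(c(TEAM1, TEAM2), names_to = "where", values_to = "team") %>%
  mutate(
    # Compute `won` first, while `where` still holds TEAM1/TEAM2
    won = (where == "TEAM1" & pointsH == 1) | (where == "TEAM2" & pointsA == 1),
    # Then recode the column names into readable labels
    where = if_else(where == "TEAM1", "home", "away")
  ) %>%
  select(-starts_with("points"))

print(result)
```

The order inside mutate() matters: expressions are evaluated in sequence, so won has to be derived before where is overwritten with the new labels.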