Question

Load the libraries

library(engsoccerdata)
library(dplyr)
library(lubridate)

Extract the Liverpool data from the English league data

england$Date <- ymd(england$Date)
Liverpool.home <- england %>% filter(Date > '2001-08-01', home == 'Liverpool')
Liverpool.away <- england %>% filter(Date > '2001-08-01', visitor == 'Liverpool')

Create the points variable

Liverpool.home$points = 0

for(i in 1:nrow(Liverpool.home)){

  if(Liverpool.home[i,]$result == 'H'){
    Liverpool.home[i,]$points = 3
  }
  else if(Liverpool.home[i,]$result == 'D'){
    Liverpool.home[i,]$points = 1
  }

}

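I know that how to use the apply functions is a really boring and common question on Stack Overflow, but I can't solve this problem with an apply function. Is there a way to do it? :)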

Answer 1

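So you want to recode a character column into an integer column. One option is ifelse, which is vectorized and convenient in this case; you don't want to use apply here, because it is meant for looping over a matrix: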

Liverpool.home$points <- with(Liverpool.home, ifelse(result == "H", 3, 
                                                     ifelse(result == "D", 1, 0)))

head(Liverpool.home[c("result", "points")])

#  result points
#1      A      0
#2      A      0
#3      H      3
#4      D      1
#5      H      3
#6      H      3
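
To try the idea without installing engsoccerdata, here is a minimal sketch on a toy vector. The name toy_result and its values are made up for illustration and are not taken from the dataset. The sketch also shows a named lookup vector as an alternative to the nested ifelse; that is a swapped-in technique, not part of the answer above, and it assumes result only ever takes the values "H", "D" and "A".

```r
# Hypothetical toy data standing in for Liverpool.home$result
toy_result <- c("A", "A", "H", "D", "H", "H")

# Nested ifelse, as in the answer: evaluated over the whole vector at once
points_ifelse <- ifelse(toy_result == "H", 3,
                        ifelse(toy_result == "D", 1, 0))

# Alternative (assumption: only "H", "D", "A" occur):
# index a named lookup vector by the result codes
lookup <- c(H = 3, D = 1, A = 0)
points_lookup <- unname(lookup[toy_result])

print(points_ifelse)
print(points_lookup)
```

The lookup version tends to scale better when there are many codes, but it returns NA for any code missing from lookup, whereas the nested ifelse falls through to 0.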

Answer 2

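dplyr

The function case_when ("a vectorised set of if and else ifs") from dplyr is the equivalent of the SQL CASE WHEN statement. In the dplyr version this answer was written for, columns have to be referenced with .$ inside mutate.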

library(dplyr)
Liverpool.home %>% 
  mutate(points = case_when(.$result == 'H' ~ 3,
                            .$result == 'D' ~ 1,
                            TRUE ~ 0))
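
As far as I know, more recent dplyr releases (0.7.0 and later) let case_when refer to columns directly inside mutate, so the .$ prefix should no longer be needed there. A minimal sketch under that assumption, using a hypothetical toy data frame (toy is made up for illustration):

```r
library(dplyr)

# Hypothetical toy data frame standing in for Liverpool.home
toy <- data.frame(result = c("A", "A", "H", "D", "H", "H"))

# Conditions are checked in order; TRUE acts as the catch-all "else" branch
toy_points <- toy %>%
  mutate(points = case_when(result == "H" ~ 3,
                            result == "D" ~ 1,
                            TRUE ~ 0))

print(toy_points)
```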

sqldf

The CASE WHEN statement in SQL, via sqldf:

library(sqldf)
# Single quotes mark SQL string literals; double quotes are meant for identifiers
df <- sqldf("SELECT result,
                    CASE WHEN result = 'H' THEN 3
                         WHEN result = 'D' THEN 1
                         ELSE 0
                    END AS points
             FROM [Liverpool.home]")
head(df)

Output:

  result points
1      A      0
2      A      0
3      H      3
4      D      1
5      H      3
6      H      3

Answer 3

Try this.

transform(Liverpool.home, points = 3 * (result == "H") + (result == "D"))
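
This one-liner works because R coerces logical values to numbers in arithmetic: TRUE becomes 1 and FALSE becomes 0. A small sketch of the intermediate steps, using a hypothetical toy vector (toy_result, is_home_win and is_draw are names made up for illustration):

```r
# Hypothetical toy data standing in for Liverpool.home$result
toy_result <- c("A", "A", "H", "D", "H", "H")

is_home_win <- toy_result == "H"  # logical vector, TRUE for home wins
is_draw     <- toy_result == "D"  # logical vector, TRUE for draws

# TRUE/FALSE are treated as 1/0, so each match gets 3, 1 or 0 points
points <- 3 * is_home_win + is_draw

print(points)
```

Since a result cannot be both "H" and "D", at most one term is non-zero per row, which is why the sum gives the same values as the ifelse and case_when versions.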
