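I am trying to write code that collects all the NBA box scores for October. I want the code to try every URL formed by combining the dates (27-31) with the 30 teams. However, since not every team plays every day, some combinations will not exist, so I am trying to use the try function to skip the URLs that do not exist, but I can't seem to figure it out. Here is what I have written so far: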
install.packages("XML")
library(XML)
teams = c('ATL','BKN','BOS','CHA','CHI',
'CLE','DAL','DEN','DET','GS',
'HOU','IND','LAC','LAL','MEM',
'MIA','MIL','MIN','NOP','NYK',
'OKC','ORL','PHI','PHX','POR',
'SAC','SA','TOR','UTA','WSH')
october = c()
for (i in teams){
  for (j in (c(27:31))){
    url = paste("http://www.basketball-reference.com/boxscores/201510",
                j,"0",i,".html",sep = "")
    data <- try(readHTMLTable(url, stringsAsFactors = FALSE))
    if(inherits(data, "error")) next
    away_1 = as.data.frame(data[1])
    colnames(away_1) = c("Players","MP","FG","FGA","FG%","3P","3PA","3P%","FT","FTA",
                         "FT%", "ORB","DRB","TRB","AST","STL","BLK","TO","PF","PTS","+/-")
    away_1 = away_1[away_1$Players != "Reserves",]
    away_1 = away_1[away_1$MP != "Did Not Play",]
    away_1$team = rep(toupper(substr(names(as.data.frame(data[1]))[1],
                                     5, 7)),length(away_1$Players))
    away_1$loc = rep(i,length(away_1$Players))
    home_1 = as.data.frame(data[3])
    colnames(home_1) = c("Players","MP","FG","FGA","FG%","3P","3PA","3P%","FT","FTA",
                         "FT%", "ORB","DRB","TRB","AST","STL","BLK","TO","PF","PTS","+/-")
    home_1 = home_1[home_1$Players != "Reserves",]
    home_1 = home_1[home_1$MP != "Did Not Play",]
    home_1$team = rep(toupper(substr(names(as.data.frame(data[2]))[1],
                                     5, 7)),length(home_1$Players))
    home_1$loc = rep(i,length(home_1$Players))
    game = rbind(away_1,home_1)
    october = rbind(october, game)
  }
}
Everything above and below the following lines seems to work:
data <- try(readHTMLTable(url, stringsAsFactors = FALSE))
if(inherits(data, "error")) next
I just need to get these two lines right.
Answer 0 (score: 0)
How about using tryCatch for error handling?
result = tryCatch({
expr
}, warning = function(w) {
warning-handler-code
}, error = function(e) {
error-handler-code
}, finally = {
cleanup-code
})
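where readHTMLTable would be used as the main part ('expr'). If an error or warning occurs, you can simply return a missing value and then drop the missing values from the final result.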
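A minimal sketch of that pattern follows. It assumes a hypothetical `fetch_tables()` standing in for `readHTMLTable()` so that it runs without network access, and `safe_fetch()` is likewise an illustrative name. The error handler returns `NULL` and the loop skips those results. The last lines also show why the check in the question never fires: `try()` returns an object of class `"try-error"`, not `"error"`.

```r
# Hypothetical stand-in for readHTMLTable(): fails for URLs that "do not exist".
fetch_tables <- function(url) {
  if (grepl("MISSING", url, fixed = TRUE)) stop("page not found: ", url)
  list(data.frame(Players = c("A", "B"), PTS = c(10, 12)))
}

# Wrap the call so that an error yields NULL instead of stopping the loop.
safe_fetch <- function(url) {
  tryCatch(fetch_tables(url), error = function(e) NULL)
}

urls <- c("game_ATL", "game_MISSING", "game_BOS")  # illustrative URLs only
results <- list()
for (u in urls) {
  data <- safe_fetch(u)
  if (is.null(data)) next   # skip combinations with no game
  results[[u]] <- data[[1]]
}
names(results)

# With try(), test for class "try-error" rather than "error".
res <- try(fetch_tables("game_MISSING"), silent = TRUE)
inherits(res, "try-error")
inherits(res, "error")
```

In the original loop, the same idea would mean either wrapping `readHTMLTable(url, stringsAsFactors = FALSE)` in `tryCatch` as above, or keeping `try()` and testing `inherits(data, "try-error")`.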
Answer 1 (score: 0)
For anyone interested, I figured it out using url.exists from RCurl. Just put the following right after the line that defines the url:
if(url.exists(url) == TRUE){...}
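A sketch of how that check might be used to filter the candidate URLs before scraping, assuming the URL pattern from the question. The helper names `build_urls` and `existing_urls` and the `exists_fn` parameter are illustrative and not part of RCurl. `exists_fn` defaults to `RCurl::url.exists` and is there only so the logic can be tried with a stub instead of real HTTP requests.

```r
# Build the candidate box-score URLs from the question for given teams and days.
build_urls <- function(teams, days) {
  combos <- expand.grid(day = days, team = teams, stringsAsFactors = FALSE)
  paste0("http://www.basketball-reference.com/boxscores/201510",
         combos$day, "0", combos$team, ".html")
}

# Keep only the URLs for which exists_fn() returns TRUE.
existing_urls <- function(urls, exists_fn = RCurl::url.exists) {
  keep <- vapply(urls, function(u) isTRUE(exists_fn(u)), logical(1))
  urls[keep]
}

candidates <- build_urls(c("ATL", "BOS"), 27:28)

# Stub used here instead of a real HTTP request: pretend only ATL pages exist.
stub_exists <- function(u) grepl("ATL", u, fixed = TRUE)
existing_urls(candidates, exists_fn = stub_exists)
```

For real use, omit the `exists_fn` argument so that `RCurl::url.exists` is called; this requires RCurl to be installed and makes one extra request per URL.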