Question

I am running the function below.

It works fine as long as the page exists. However, if one of my tickers has no data at that URL, it throws an error:

require(XML)
require(plyr)


getKeyStats_xpath <- function(symbol) {
  yahoo.URL <- "http://finance.yahoo.com/q/ks?s="
  html_text <- htmlParse(paste(yahoo.URL, symbol, sep = ""), encoding="UTF-8")

  #search for <td> nodes anywhere that have class 'yfnc_tablehead1'
  nodes <- getNodeSet(html_text, "/*//td[@class='yfnc_tablehead1']")

  if(length(nodes) > 0 ) {
    measures <- sapply(nodes, xmlValue)

    #Clean up the column name
    measures <- gsub(" *[0-9]*:", "", gsub(" \\(.*?\\)[0-9]*:","", measures))   

    #Make duplicated names unique by appending an index
    dups <- which(duplicated(measures))
    #seq_along() avoids iterating over 1:0 when there are no duplicates
    for(i in seq_along(dups)) 
      measures[dups[i]] = paste(measures[dups[i]], i, sep=" ")

    #use siblings function to get value
    values <- sapply(nodes, function(x)  xmlValue(getSibling(x)))

    df <- data.frame(t(values))
    colnames(df) <- measures
    return(df)
  } else {
    break
  }
}

I also added a trace, and the failure occurs on the third ticker.

Error in FUN(X[[3L]], ...) : no loop for break/next, jumping to top level

I would like to call the function like this:

tickers <- c("QLTI",
"RARE",
"RCPT",
"RDUS",
"REGN",
"RGEN",
"RGLS")

tryCatch({
stats <- ldply(tickers, getKeyStats_xpath)
}, finally={})

Basically, if a ticker has no data, I want to skip it.

Can someone help me solve this?

Answer 1

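Expanding on my comment. The problem here is that you have wrapped the entire command stats <- ldply(tickers, getKeyStats_xpath) in tryCatch. This means R tries to get the key stats for all tickers as a single attempt, so the first error aborts the whole ldply call.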

Instead, what you want is to try each ticker separately.
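The difference can be seen in a small sketch that needs no network access. Here fake_stats is a hypothetical stand-in for getKeyStats_xpath that fails for one ticker, and base R's lapply is used in place of plyr so the snippet runs on its own:

```r
# Hypothetical stand-in for getKeyStats_xpath: fails for one ticker.
fake_stats <- function(symbol) {
  if (symbol == "RCPT") stop("no data for ", symbol)
  data.frame(symbol = symbol, value = nchar(symbol))
}

tickers <- c("QLTI", "RCPT", "REGN")

# One tryCatch around the whole loop: the first error discards every result.
all_or_nothing <- tryCatch(
  lapply(tickers, fake_stats),
  error = function(e) NULL
)

# One tryCatch per ticker: only the failing ticker is lost.
per_ticker <- lapply(tickers, function(t) {
  tryCatch(fake_stats(t), error = function(e) NULL)
})

is.null(all_or_nothing)       # TRUE
sapply(per_ticker, is.null)   # FALSE TRUE FALSE
```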

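To do this, write a wrapper around getKeyStats_xpath that calls it inside tryCatch. You can do this within ldply using an anonymous function, e.g. ldply(tickers, function (t) tryCatch(getKeyStats_xpath(t), finally={})). Note that finally is executed regardless of how the expression exits, so finally={} does nothing (for details, see Advanced R, or "How to write trycatch in R" under the r-faq tag).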

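If an error occurs, tryCatch calls the function supplied as its error argument. The code above therefore still does not help, because no error handler is given and the error goes unhandled (thanks to rawr for pointing this out earlier). It is also easier to inspect the output if you use llply instead of ldply.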
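As a minimal illustration of the difference between finally and error, consider the sketch below. The function risky is a hypothetical placeholder for any call that fails; try() is only there so the unhandled error does not stop the script:

```r
# Hypothetical failing call, standing in for a request to a missing page.
risky <- function() stop("page not found")

# finally alone: the cleanup code runs, but the error still propagates.
cleanup_ran <- FALSE
unhandled <- try(
  tryCatch(risky(), finally = { cleanup_ran <- TRUE }),
  silent = TRUE
)
inherits(unhandled, "try-error")  # TRUE: the error was not handled
cleanup_ran                       # TRUE: finally was executed anyway

# error handler: the error is caught and the handler's value is returned.
handled <- tryCatch(risky(), error = function(e) conditionMessage(e))
handled                           # "page not found"
```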

Here is a complete answer that uses this approach with informative error handling.

stats <- llply(tickers, 
    function(t) tryCatch(getKeyStats_xpath(t), 
        error=function(x) {
            cat("error occurred for:\n", t, "\n...skipping this ticker\n")
        }
    )
)
names(stats) <- tickers
lapply(stats, length)
#<snip>
#$RCPT
#[1] 0
# </snip>

As of right now, this works for me, returning data for all the tickers except the one shown in the output above (RCPT).
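If a single data frame is wanted afterwards, one possible follow-up is to drop the empty entries and bind the rest. This is only a sketch in base R: the stats list and its MarketCap column are made-up placeholder data in the shape returned above, where NULL marks a skipped ticker.

```r
# Made-up results for illustration; NULL marks a ticker that was skipped.
stats <- list(
  QLTI = data.frame(MarketCap = "1B"),
  RCPT = NULL,
  REGN = data.frame(MarketCap = "40B")
)

# Keep only the tickers that returned data and note which were skipped.
ok <- Filter(Negate(is.null), stats)
skipped <- setdiff(names(stats), names(ok))

# Add the ticker as a column, then stack the one-row data frames.
combined <- do.call(
  rbind,
  Map(function(sym, df) cbind(ticker = sym, df), names(ok), ok)
)

skipped          # "RCPT"
nrow(combined)   # 2
```

This assumes every successful ticker yields the same columns; if the columns differ, rbind will fail and a column-filling approach would be needed.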

Using tryCatch with plyr
