Question

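I am trying to read several URL files. Does anyone know how to check whether a URL can be opened before doing something with it? Sometimes I get an error (failed = "cannot open the connection"). If the connection cannot be opened, I just want to skip that URL.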

urlAdd=paste0(server,siteID,'.dly')
# Reading the whole data in the page
if(url(urlAdd)) {
  tmp <- read.fwf(urlAdd,widths=c(11,4,2,4,rep(c(5,1,1,1),31)))
}

But this condition fails.

Answer 1

You can use tryCatch, which returns the value of the expression if it succeeds and the value of the error argument if there is an error. See ?tryCatch.

This example loops over a bunch of URLs and downloads them. tryCatch returns the result of readLines if the read succeeds, and NULL if it does not. If the result is NULL, we simply use next to skip to the next iteration of the loop.

urls <- c('http://google.com', 'http://nonexistent.jfkldasf', 'http://stackoverflow.com')
for (u in urls) {
  # warn = FALSE is only there to avoid the "incomplete final line" warning;
  # put read.fwf or whatever you need here.
  tmp <- tryCatch(readLines(url(u), warn = FALSE),
                  error = function(e) NULL)
  if (is.null(tmp)) {
    # You might want to put some informative message here.
    next  # skip to the next URL
  }
}

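Note that this skips on any error, not just a "404 not found" type of error. If I made a typo and wrote tryCatch(raedlines(url(u), warn=F), ...) (misspelling readLines), it would skip every URL, because the typo also causes an error.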
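One way to keep such mistakes visible is to report the error message before returning NULL. The following is a minimal sketch, not part of the original answer: read_or_null is a hypothetical helper name, and local temporary files stand in for URLs so that it runs without a network connection. With a misspelled function name, the message would read "could not find function ..." for every input, which makes the typo easy to spot.

```r
# Hypothetical helper: returns the lines on success and NULL on failure,
# printing the reason for the failure instead of skipping silently.
read_or_null <- function(path) {
  tryCatch(
    readLines(path, warn = FALSE),
    error = function(e) {
      message("Skipping ", path, ": ", conditionMessage(e))
      NULL
    }
  )
}

# Assumed inputs: one temporary file that exists and one path that does not.
ok <- tempfile(fileext = ".txt")
writeLines(c("first", "second"), ok)
missing <- file.path(tempdir(), "no-such-file.txt")

good <- read_or_null(ok)       # character vector with the file's lines
bad  <- read_or_null(missing)  # NULL, with a "Skipping ..." message
```

R may also print a separate warning with the low-level reason for the failed read; the error handler above does not intercept warnings.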

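Edit, re: comments (lapply is being used; where to put the data-processing code). Instead of next, just do the processing only when the read succeeds. Put the data-processing code after the code that reads the data. Try something like:

lapply(urls, function(u) {
  tmp <- tryCatch(read.fwf(...), error = function(e) NULL)
  if (is.null(tmp)) {
    # read failed
    return()  # or return whatever you want the failure value to be
  }
  # data processing code goes here.
})

If the read fails, the code above returns from the function early (this only affects the current element of lapply).

Or you can invert it and do something like:

lapply(urls, function(u) {
  tmp <- tryCatch(read.fwf(...), error = function(e) NULL)
  if (!is.null(tmp)) {
    # read succeeded!
    # data processing code goes here.
  }
})

This does the same thing: it only runs the data-processing code if the read succeeded, and otherwise skips that whole block of code and returns NULL.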
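Because failed reads come back as NULL, the list returned by lapply contains a mix of results and NULLs. A minimal sketch of dropping the failures afterwards follows; it is an addition to the answer, and the temporary file paths are assumptions that stand in for the real URLs so the example runs offline.

```r
# Assumed inputs: two temporary files that exist and one path that does not.
paths <- c(tempfile(), tempfile(), file.path(tempdir(), "does-not-exist.dly"))
writeLines("a", paths[1])
writeLines(c("b", "c"), paths[2])

# Same pattern as above: NULL marks a failed read.
results <- lapply(paths, function(p) {
  tryCatch(readLines(p, warn = FALSE), error = function(e) NULL)
})
names(results) <- paths

# Keep only the elements whose read succeeded.
succeeded <- Filter(Negate(is.null), results)
```

Naming the list by input keeps track of which sources survived the filter, since positions shift once the NULL entries are removed.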

