Why does col_search sometimes work and sometimes fail for no apparent reason?

Asked: 2019-11-14 15:59:39

Tags: r taxonomy

I have a list of species (about 2,000) that I want to validate: synonyms, spelling, most recent names, and so on.

Here is a subset:

data <- c("Abalistes stellatus", "Abudefduf abdominalis", "Abudefduf bengalensis", "Abudefduf concolor",
          "Abudefduf conformis", "Abudefduf luridus", "Abudefduf notatus", "Abudefduf saxatilis",
          "Abudefduf septemfasciatus", "Abudefduf sexfasciatus", "Abudefduf sordidus", "Abudefduf sparoides",
          "Abudefduf troschelii", "Abudefduf vaigiensis", "Abudefduf whitleyi", "Acanthaluteres brownii",
          "Acanthaluteres spilomelanurus", "Acanthaluteres vittiger", "Acanthemblemaria crockeri",
          "Acanthemblemaria hancocki", "Acanthemblemaria spinosa", "Acanthistius cinctus",
          "Acanthistius ocellatus", "Acanthistius pardalotus", "Acanthistius patachonicus",
          "Acanthistius sebastoides", "Acanthistius serratus", "Acanthochromis polyacanthus",
          "Acanthopagrus australis", "Acanthopagrus latus", "Acanthostracion polygonius",
          "Acanthostracion quadricornis", "Acanthurus achilles", "Acanthurus albipectoralis",
          "Acanthurus auranticavus", "Acanthurus bahianus", "Acanthurus bariene", "Acanthurus blochii",
          "Acanthurus chirurgus", "Acanthurus coeruleus", "Acanthurus dussumieri", "Acanthurus fowleri",
          "Acanthurus gahhm", "Acanthurus grammoptilus", "Acanthurus guttatus", "Acanthurus leucocheilus",
          "Acanthurus leucopareius", "Acanthurus leucosternon", "Acanthurus lineatus", "Acanthurus maculiceps")

I use the following function to look up the corresponding names in the COL (Catalogue of Life) database:

library(taxize)  # provides col_search()

check_sp_name <- function(sp_list){
  # takes a vector of species names that we want to check
  verified_names <- c()
  for (i in 1:length(sp_list)) {
    x <- NA
    x <- col_search(name = sp_list[i])
    x <- x[[1]]
    if (nrow(x)==0) {
      verified_names <- append(verified_names, "Non observation")
    } else {
      if (sum(x$status == "accepted name") != 0) {
        y <- x$name[x$status == "accepted name" & x$rank == "species"]
      } else if (sum(x$status == "synonym") != 0) {
        y <- x$acc_name[x$status == "synonym" & x$rank == "species"]
      } else if (sum(x$status == "provisionally accepted name") != 0) {
        y <- x$name[x$status == "provisionally accepted name" & x$rank == "species"]
      } 
      y <- ifelse(length(y) == 0, "infraspecies", y)
      y <- ifelse(length(y) > 1, y[y == sp_list[i]], y)

      verified_names  <- append(verified_names, y) # list of verified species names
    }
    print(i)
  } 
  verified_inputs <- data.frame(input_names = sp_list, output_names = verified_names)

  return(verified_inputs)
  }

The problem:

Yesterday I ran the code and everything worked, but this morning, although I had not changed anything, I got this error:

  

Error in if (nrow(x) == 0) { : argument is of length zero
In addition: Warning message:
COL taxon not found

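I tried running the line that assigns x on its own, and it works. Moreover, the error does not occur at the same iteration every time. I have been struggling with this all day.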

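One of my hypotheses is that the COL website is unstable at the moment. In fact, a message to that effect is displayed on the home page: https://www.catalogueoflife.org/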

  • Could this be the reason I am getting this error?
  • Does anyone have an idea for working around it?

I should also mention that I tried the GBIF database, but it is not as up to date as COL.

1 Answer:

Answer 0: (score: 0)

On GitHub, sckott also mentioned a possible recent rate limit on the COL side and suggested pausing between requests with Sys.sleep:

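col_search_sleep <- function(...) {
  res <- col_search(...)  # query COL
  Sys.sleep(1)            # pause 1 second between requests
  res                     # return the search result, not the value of Sys.sleep()
}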

lapply(vector_of_names, col_search_sleep)
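
The error message in the question is consistent with a lookup that came back empty: if x is not a data frame (for example NULL, which is an assumption here about what a failed col_search call yields), nrow(x) is NULL and the if condition has length zero. A minimal sketch of a guard that could replace the bare nrow(x) == 0 test follows; has_rows is a hypothetical helper, not part of taxize.

```r
# Hypothetical helper (not part of taxize):
# TRUE only when x is a data frame with at least one row.
has_rows <- function(x) {
  is.data.frame(x) && nrow(x) > 0
}

# Base R behaviour behind the original error:
nrow(NULL)           # NULL
length(NULL == 0)    # 0, so `if (NULL == 0)` stops with "argument is of length zero"

# The guard never yields a zero-length condition:
has_rows(NULL)                     # FALSE
has_rows(data.frame())             # FALSE
has_rows(data.frame(name = "a"))   # TRUE
```

Inside the loop, `if (!has_rows(x))` could then record "Non observation" (or a separate "lookup failed" marker) instead of stopping the whole run.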

For more information, see https://github.com/ropensci/taxize/issues/785
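
Since the failures are intermittent, one possible extension of the Sys.sleep idea is to retry a failed lookup a few times before giving up. The sketch below is only an illustration under assumptions: search_with_retry, max_tries and wait are hypothetical names, and the search function is passed in as an argument so the helper can be tried without network access. It assumes, as in the question's code, that a successful search returns a list whose first element is a data frame.

```r
# Hypothetical helper: call `search_fn(name)` up to `max_tries` times,
# sleeping `wait` seconds between attempts.
# Returns the first data frame of the result, or NULL if every attempt fails.
search_with_retry <- function(name, search_fn, max_tries = 3, wait = 1) {
  for (attempt in seq_len(max_tries)) {
    # Turn errors into NULL so a single failed request does not abort the loop
    res <- tryCatch(search_fn(name), error = function(e) NULL)
    # Success: a non-empty list whose first element is a data frame
    if (is.list(res) && length(res) > 0 && is.data.frame(res[[1]])) {
      return(res[[1]])
    }
    if (attempt < max_tries) Sys.sleep(wait)
  }
  NULL
}

# Usage with taxize (assumes the package is installed and COL is reachable):
# x <- search_with_retry("Abudefduf saxatilis",
#                        function(n) taxize::col_search(name = n))
# if (is.null(x)) message("lookup failed after retries")
```

A name that COL genuinely does not contain may also end up as NULL after all retries, so this trades some extra waiting for fewer aborted runs.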