R: Print all warnings raised on all parallel nodes

Time: 2017-09-14 08:58:49

Tags: r

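I wrote the code below to import a series of zipped CSV files using parallel computing.

My problems are:

  1. Some of the ZIP files (which contain the CSVs) are corrupt and cannot be opened.

  2. After running parRapply, I can only inspect the last.warning variable. It tells me which CSV failed on a node, but I cannot see all the warnings, only one at a time.

  3. So:

    • To display a list of all warnings from all nodes, I was thinking of using the following in my code:

      warnings(DISPOIN_CSV_List <- parRapply(c1, DISPOIN_DIR_REL, parRaplly_Function))

      Would this work?

    • Also, how can I check whether a CSV can be opened before applying the function, and create an empty data.frame for the ones that cannot?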
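
On the first bullet: to my understanding, warnings raised on the worker processes of a cluster created with makeCluster() stay on those workers, so calling warnings() on the master would probably not list them. One possible workaround is to record the warnings inside each task and return them together with the result. The sketch below assumes that approach; with_warnings and toy_task are hypothetical names, and toy_task merely stands in for the CSV reader.

```r
library(parallel)

# Hypothetical wrapper: returns a function that runs FUN(x) and hands back
# its value together with every warning message raised while it ran.
with_warnings <- function(FUN) {
  force(FUN)  # evaluate now so the closure ships the function itself to workers
  function(x) {
    msgs <- character(0)
    value <- withCallingHandlers(
      FUN(x),
      warning = function(w) {
        msgs <<- c(msgs, conditionMessage(w))  # record the warning text
        invokeRestart("muffleWarning")         # then let FUN continue
      }
    )
    list(value = value, warnings = msgs)
  }
}

# Toy task standing in for the CSV reader: it warns on negative input.
toy_task <- function(x) {
  if (x < 0) warning("negative input: ", x)
  abs(x)
}

cl <- makeCluster(2)
res <- parLapply(cl, c(1, -2, -3), with_warnings(toy_task))
stopCluster(cl)

# One character vector of warnings per task, in input order.
all_warnings <- lapply(res, `[[`, "warnings")
```

Each element of res keeps its value and its warnings side by side, so a failing input can be traced back by position. The same wrapper could presumably be put around the reader function when iterating over a vector of file paths with parLapply().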

    ## -----------------------------------------------------------------------------
    ## Packages
    ## -----------------------------------------------------------------------------
    
    # update.packages("RODBC")
    # update.packages("tidyverse")
    
    ## -----------------------------------------------------------------------------
    ## Libraries
    ## -----------------------------------------------------------------------------
    
    suppressMessages(require(RODBC))
    suppressMessages(require(tidyverse))
    suppressMessages(require(parallel))
    
    ## -----------------------------------------------------------------------------
    ## CMD: Command for DISPOIN's Directory Acquisition
    ## -----------------------------------------------------------------------------
    
    # shell(cmd = 'pushd "\\srvdiscsv\data" && dir *AL*.zip /b /s > D:\DISPOIN_Data_Directories.csv && popd')
    
    ## -----------------------------------------------------------------------------
    ## RODBC
    ## -----------------------------------------------------------------------------
    
    ## A) MariaDB Connection String
    
    con <- odbcConnect("MariaDB_Tornado24")
    
    invisible(sqlQuery(con, "USE dispoin;"))
    
    # B) Import R Data Directories from MariaDB
    
    DISPOIN_DIR_REL <- as_tibble(sqlFetch(con, "dispoin.t_DISPOIN_DIR_REL"))
    
    odbcClose(con)
    
    # C) Import zipped CSV data into a list of data frames, which are later compiled into a single
    #    data frame by means of rbind
    
      # C.1) parRapply Function Initialization:
    
      parRaplly_Function <- function (DISPOIN_CSV_Row)
      {
        return(read_csv2(
          file = DISPOIN_CSV_Row,
          col_names = c(
            "SCADA",
            "TAG",
            "ID_del_AEG",
            "Descripcion",
            "Time_ON",
            "Time_OFF",
            "Delta_Time",
            "Comentario",
            "Es_Alarma",
            "Es_Ultima",
            "Comentarios"),
          col_types = cols(
            "SCADA" = "c",
            "TAG" = "c",
            "ID_del_AEG" = "c",
            "Descripcion" = "c",
            "Time_ON" = "c",
            "Time_OFF" = "c",
            "Delta_Time" = "c",
            "Comentario" = "c",
            "Es_Alarma" = "c",
            "Es_Ultima" = "c",
            "Comentarios" = "c"),
          locale = default_locale(),
          na = c("", " "),
          quoted_na = TRUE,
          quote = "\"",
          comment = "",
          trim_ws = TRUE,
          skip = 0,
          n_max = Inf,
          guess_max = 1000,  # n_max is not defined in this scope, so pass a literal
          progress = FALSE))
      }
    
      # C.2) parallel Package: Environment Settings
    
      no_cores <- detectCores()
    
      c1 <- makeCluster(no_cores)
    
      invisible(clusterEvalQ(c1, library(readr)))
    
      setDefaultCluster(c1)
    
      # C.3) parRapply Function Application:
    
      DISPOIN_CSV_List <- parRapply(c1, DISPOIN_DIR_REL, parRaplly_Function)
    
      suppressWarnings(stopCluster(c1))
    
    # D) List's Tibbles Compilation into a single Tibble:
    
      DISPOIN_CSV <- do.call(rbind, DISPOIN_CSV_List)
    
    # E) Write Compiled Table into CSV:
    
      write_csv(
        DISPOIN_CSV, 
        path = file.path("D:/MySQL/R", "DISPOIN_CSV.csv"), 
        na = "\\N", 
        append = FALSE, 
        col_names = TRUE)
    
    # F) Data Cleaning: Environment Variable Removal
    
      rm(list=ls())
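
On the question of checking whether an archive can be opened before reading it, one option is to ask unzip() for the file listing only and treat an error as "not readable". This is a minimal sketch; zip_is_readable is a hypothetical helper name, and the check only confirms that the listing can be read, not that every member decompresses cleanly.

```r
# Hypothetical helper: TRUE if the archive's file listing can be read.
zip_is_readable <- function(zip_path) {
  tryCatch({
    listing <- suppressWarnings(unzip(zip_path, list = TRUE))
    nrow(listing) > 0
  }, error = function(e) FALSE)
}

# A text file with a .zip extension stands in for a corrupt archive.
bad <- tempfile(fileext = ".zip")
writeLines("this is not a zip archive", bad)
bad_ok <- zip_is_readable(bad)
```

The paths that pass the check could then be handed to the parallel step, while the rest are logged or replaced by empty data frames.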
    

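Solution to 1 & 2

I asked the same question on the r-help mailing list, and this is the answer I received:

Use tryCatch().

Instead of

    result <- read_csv2(file)

use

    result <- tryCatch(read_csv2(file), error=function(e) makeEmptyDataFrame(conditionMessage(e)))

where:

  • makeEmptyDataFrame(msg = NULL) is a function (that you write) which returns a data.frame with no rows but with the correct column names and types. It is shown with a msg (message) argument because you may want to attach the error message to the result as an attribute, so that you can see what went wrong.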
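
The mailing-list answer leaves makeEmptyDataFrame() to the reader. A minimal sketch of what it might look like follows, assuming the eleven character columns from the script in the question. The names dispoin_cols, safe_read and failing_reader, and the "error_message" attribute, are illustrative choices and not part of the original answer.

```r
# Column names taken from the script in the question.
dispoin_cols <- c("SCADA", "TAG", "ID_del_AEG", "Descripcion", "Time_ON",
                  "Time_OFF", "Delta_Time", "Comentario", "Es_Alarma",
                  "Es_Ultima", "Comentarios")

# Returns a zero-row data.frame with one character column per name and,
# when msg is given, the error message attached as an attribute.
makeEmptyDataFrame <- function(msg = NULL) {
  cols <- lapply(setNames(dispoin_cols, dispoin_cols),
                 function(nm) character(0))
  df <- as.data.frame(cols, stringsAsFactors = FALSE)
  attr(df, "error_message") <- msg
  df
}

# Hypothetical wrapper: try the reader, fall back to the empty frame.
safe_read <- function(file, reader) {
  tryCatch(reader(file),
           error = function(e) makeEmptyDataFrame(conditionMessage(e)))
}

# A reader that always fails stands in for read_csv2() on a corrupt ZIP.
failing_reader <- function(file) stop("cannot open ", file)
res <- safe_read("broken.zip", failing_reader)
```

Because the empty frame has the same columns as a successful import, it passes through rbind without changing the result. When run on a cluster, makeEmptyDataFrame and dispoin_cols would also need to exist on the workers, for example via clusterExport().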

0 answers:

No answers