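Fastest way to create an xlsx file with plots in R

Time: 2017-12-28 16:08:03

Tags: r excel performance ggplot2 xlsx

I have a list of data tables and a list of plots, and I want to write both to an xlsx file (each list element on a separate sheet). Example data: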

require(ggplot2)
require(data.table)

n <- 10
N <- 100

dtList <- lapply(1:n, function(x) data.table(sample(1e6, N), 1:N))
names(dtList) <- 1:n
plots <- lapply(dtList, function(x) ggplot(x, aes(y = V1, x = V2)) + geom_line())

Currently I am using openxlsx, but it is very slow when there are multiple plots:

require(openxlsx)
wb <- createWorkbook()
modifyBaseFont(wb, fontSize = 10)

writeXlsx <- function(x, sName) {
  addWorksheet(wb, sName, gridLines = FALSE)
  writeData(wb, sName, x = x, xy = c(1, 1))
  print(plots[[sName]])
  insertPlot(wb, sName, width = 19, height = 9, dpi = 200, units = "cm",
             startRow = 2, startCol = 5)
}

system.time(
sapply(seq_along(dtList), function(x) {
  writeXlsx(dtList[[x]], names(dtList)[[x]])
})
) # ~ 17.00 sec

openXL(wb)

How can I speed this up? Is there a better package for this job?

1 Answer:

Answer 0: (score: 1)

One option is to use simpler graphics. For example, switching the plots to base graphics:

plots <- lapply(dtList, function(x) plot(x$V2, x$V1, type = 'l'))

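reduces the xlsx creation time from ~7.72 sec to ~0.72 sec, roughly 10 times faster (on my machine the original code already runs faster than the timing reported in the question).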
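One caveat with the base-graphics route: plot() draws immediately and returns NULL, so a list built with lapply() does not hold reusable plot objects the way a list of ggplot objects does. A minimal sketch of one way around that, assuming it is acceptable to render each plot to a PNG file and embed it with insertImage(), is shown below. The helper name insertBasePlot, its plotFun argument, and the three-sheet example data are hypothetical and not part of openxlsx or the original answer.

```r
library(openxlsx)

# Hypothetical helper (not part of openxlsx): render a base-graphics plot
# directly to a temporary PNG file, then embed that file in the sheet.
insertBasePlot <- function(wb, sheet, plotFun, width = 19, height = 9,
                           dpi = 200, startRow = 2, startCol = 5) {
  imgFile <- tempfile(fileext = ".png")
  png(imgFile, width = width, height = height, units = "cm", res = dpi)
  # Close the device even if the plotting code fails
  tryCatch(plotFun(), finally = dev.off())
  insertImage(wb, sheet, file = imgFile, width = width, height = height,
              units = "cm", dpi = dpi, startRow = startRow, startCol = startCol)
  invisible(imgFile)
}

# Small illustrative data set (three sheets instead of ten)
set.seed(42)
dfList <- lapply(1:3, function(i) data.frame(V1 = sample(1e6, 100), V2 = 1:100))
names(dfList) <- paste0("sheet", 1:3)

wb <- createWorkbook()
for (sName in names(dfList)) {
  d <- dfList[[sName]]
  addWorksheet(wb, sName, gridLines = FALSE)
  writeData(wb, sName, x = d, xy = c(1, 1))
  # The plot is drawn inside the helper, once per sheet
  insertBasePlot(wb, sName, function() plot(d$V2, d$V1, type = "l"))
}
```

Passing the plotting code as a function keeps the drawing step next to the sheet it belongs to, so each sheet gets its own image rather than whatever happens to be on the current graphics device.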

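When ggplot graphics are required, I modified the insertPlot function so that it accepts a ggplot object and saves it to a file with ggsave, without printing it in the R session first: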

# Variant of openxlsx::insertPlot() that takes a ggplot object (PLOT) and
# writes it to a temporary image with ggsave() instead of printing it first.
insertggPlot <- function(wb, sheet, width = 6, height = 4, xy = NULL,
                        startRow = 1, startCol = 1, fileType = "png",
                        units = "in", dpi = 300, PLOT) {
  # Use "." as the decimal separator inside this call; restore it on exit
  od <- getOption("OutDec")
  options(OutDec = ".")
  on.exit(expr = options(OutDec = od), add = TRUE)
  if (!"Workbook" %in% class(wb)) stop("First argument must be a Workbook.")
  # xy = c(column, row) overrides startCol and startRow
  if (!is.null(xy)) {
    startCol <- xy[[1]]
    startRow <- xy[[2]]
  }
  fileType <- tolower(fileType)
  units <- tolower(units)
  if (fileType == "jpg") fileType <- "jpeg"
  if (!fileType %in% c("png", "jpeg", "tiff", "bmp"))
    stop("Invalid file type.\nfileType must be one of: png, jpeg, tiff, bmp")
  if (!units %in% c("cm", "in", "px"))
    stop("Invalid units.\nunits must be one of: cm, in, px")
  # Save the plot to a temporary image file, then embed that file
  fileName <- tempfile(pattern = "figureImage",
                       fileext = paste0(".", fileType))
  ggsave(plot = PLOT, filename = fileName, width = width, height = height,
         units = units, dpi = dpi)
  insertImage(wb = wb, sheet = sheet, file = fileName, width = width,
              height = height, startRow = startRow, startCol = startCol,
              units = units, dpi = dpi)
}

Using this function reduces the time to about 2 sec.
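For reference, a minimal self-contained sketch of the same idea applied to the loop from the question: each ggplot is written to a temporary PNG with ggsave() and embedded with insertImage(), which are the two calls the function above wraps. The three-sheet data set, the sheet names, and the variable names here are illustrative assumptions, and timings will differ between machines.

```r
library(ggplot2)
library(openxlsx)

# Small illustrative data set (three sheets instead of ten)
set.seed(1)
n <- 3
N <- 100
dfList <- lapply(seq_len(n), function(i) {
  data.frame(V1 = sample(1e6, N), V2 = seq_len(N))
})
names(dfList) <- paste0("sheet", seq_len(n))
plots <- lapply(dfList, function(d) ggplot(d, aes(x = V2, y = V1)) + geom_line())

wb <- createWorkbook()
elapsed <- system.time(
  for (sName in names(dfList)) {
    addWorksheet(wb, sName, gridLines = FALSE)
    writeData(wb, sName, x = dfList[[sName]], xy = c(1, 1))
    # Save the plot to a file without printing it in the session
    imgFile <- tempfile(fileext = ".png")
    ggsave(filename = imgFile, plot = plots[[sName]],
           width = 19, height = 9, units = "cm", dpi = 200)
    insertImage(wb, sName, file = imgFile, width = 19, height = 9,
                units = "cm", dpi = 200, startRow = 2, startCol = 5)
  }
)

# The temporary images must still exist when the workbook is saved
outFile <- tempfile(fileext = ".xlsx")
saveWorkbook(wb, outFile, overwrite = TRUE)
```

Lowering dpi or the image dimensions is another knob worth trying, since the time is spent mostly on rendering the images rather than on writing the data.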