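I am reposting a question I asked a few days ago. I would have edited the other question, but this revised version is different and can stand on its own.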
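How can I add a progress bar to a function that, as far as I currently understand, has no iteration to attach it to? My function modifies a large data.table (> 2,500,000 observations) using a few assignments and aggregations. In any case, running as.POSIXct on roughly five million time values takes about a minute, so I would like a progress indicator: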
clean_tables <- function(x) {
  # Column names to assign to the table
  headers <- c("TIMESTAMP","RECORD","CO2","H2O","LI7000Pres","Solenoid","AvgCO2In","AVGH2OIn",
               "AvgCO2Out","AvgH2OOut","QIn_Avg","AirTempIN","AirTempOUT","Spare","DeltaT",
               "DeltaCO2","DRIVEmv","FanDrive","HWA","Vel","Flow","QOut_Avg","QLine_Avg","AvgDT",
               "IRTC_Avg","SoilTemp","BenchTemp","CO2error","CO2mvinc","CO2mvdrive")
  # The progress bar is created and set once; nothing below updates it again
  pb <- txtProgressBar(style = 3)
  setTxtProgressBar(pb, length(x))
  colnames(x) <- headers
  x <- unique(x)
  x <<- x[order(RECORD)]
  # Parse the timestamp strings (the slow step)
  x[, TIMESTAMP := as.POSIXct(TIMESTAMP, format = "%m/%d/%Y %H:%M")]
  # Mean of every column per timestamp, then aggregated to 15-minute means
  x_1_minute <<- x[, lapply(.SD, mean), by = TIMESTAMP]
  x_15_minute <<- data.table(aggregate.ts(x_1_minute, 1/15, mean))
  x_15_minute[, TIMESTAMP := as.POSIXct(TIMESTAMP, origin = '1970-01-01')]
}
clean_tables(big_table)
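When I run the function, the progress bar only ever shows 0%, because there is no iteration and so nothing updates the bar afterwards. If I added a progress bar to the lapply loop, or created an extra loop just for this purpose, it would probably be less efficient. So what I am after is a single indicator for the whole function, one that actually updates. Any ideas?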
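To make the idea of "one indicator for the whole function" concrete, here is a minimal sketch of one possible direction: treat each stage of the function as a step and advance the bar once per completed stage. The helper name run_with_progress, the steps list, and the toy data frame are all hypothetical and not part of my real code. The stages use base R only, so the sketch runs without data.table, and I am assuming UTC for the timestamps.

```r
# Hypothetical helper (not from my real code): runs each step in order and
# advances a text progress bar once per completed step.
run_with_progress <- function(x, steps) {
  pb <- txtProgressBar(min = 0, max = length(steps), style = 3)
  on.exit(close(pb))  # close the bar even if a step fails
  for (i in seq_along(steps)) {
    x <- steps[[i]](x)
    setTxtProgressBar(pb, i)
  }
  x
}

# Toy stand-ins for the real cleaning stages, written in base R.
steps <- list(
  dedupe = function(d) unique(d),
  sort   = function(d) d[order(d$RECORD), , drop = FALSE],
  parse  = function(d) {
    # tz = "UTC" is an assumption made for this example
    d$TIMESTAMP <- as.POSIXct(d$TIMESTAMP, format = "%m/%d/%Y %H:%M", tz = "UTC")
    d
  }
)

# Small made-up table with one duplicated row.
toy <- data.frame(
  TIMESTAMP = c("01/02/2015 10:01", "01/02/2015 10:00", "01/02/2015 10:01"),
  RECORD    = c(2L, 1L, 2L),
  CO2       = c(400.5, 399.8, 400.5),
  stringsAsFactors = FALSE
)

cleaned <- run_with_progress(toy, steps)
```

The bar here moves in equal jumps per stage, not in proportion to elapsed time, so a slow stage such as the as.POSIXct conversion would still look like a stall. Splitting that one stage into chunks would give finer updates, at the cost of some overhead.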
Thanks.