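I recently started using the foreach and doParallel packages to switch my code to parallel execution. Because the different threads interfere with each other, logging with futile.logger worked poorly. So I switched to ParallelLogger, which is supposed to handle logging even in a parallel setup.
Unfortunately, I have two problems that I cannot solve on my own; maybe I am doing something wrong, or there is a bug in the package.
launchLogViewer throws a "line 40 did not have 6 elements" error. Here is an MWE: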
library(foreach)
library(doParallel)
library(ParallelLogger)
LOGGING_FILE_PATH <- "Parallel_MWE.log"
# Format the time elapsed since start_time as a string for log messages.
diff_time <- function(start_time) {
  format(difftime(Sys.time(), start_time))
}
# Simulate one block of work (loading, computation, postprocessing) and log each phase.
block_execution <- function(start_se, end_se) {
  logInfo("Start Data Loading")
  start_time_loading <- Sys.time()
  Sys.sleep(2)
  logInfo("Data Loading Done: ", diff_time(start_time_loading))
  logInfo("Start Data Preprocessing")
  start_time_computation <- Sys.time()
  logInfo(paste("Start:", sprintf("%04d", start_se),
    "End:", sprintf("%04d", end_se),
    sep = " "
  ))
  Sys.sleep(2)
  logInfo("Computation Done: ", diff_time(start_time_computation))
  logInfo("Start Data Postprocessing")
  start_time_writing <- Sys.time()
  Sys.sleep(2)
  logInfo("Data Postprocessing Done: ", diff_time(start_time_writing))
  # start_time_whole is a global set by the main script before the loop.
  logInfo("Overall time taken: ", diff_time(start_time_whole))
  # memory.size() is Windows-only.
  logInfo("Current allocated Memory: ", memory.size(), " MB\n")
}
registerLogger(createLogger(
  name = "ParLogger",
  threshold = "INFO",
  appenders = list(
    createConsoleAppender(
      layout = layoutSimple
    ),
    createFileAppender(
      layout = layoutParallel,
      fileName = LOGGING_FILE_PATH
    )
  )
))
logInfo("Start Programm")
start_time_whole <- Sys.time()
cluster <- makeCluster(detectCores())
registerDoParallel(cluster)
start <- 0100
end <- 9000
step <- 0100
foreach(i = seq(start, end, step), .packages = c("ParallelLogger")) %dopar% {
  block_execution(i, (i + step))
}
stopCluster(cluster)
logInfo("Programm Done: ", format(difftime(Sys.time(), start_time_whole)))
clearLoggers()
Here is part of the resulting log file, showing one of the error cases:
2018-09-17 10:47:57 [Thread 4] INFO doParallel fun Overall time taken: 39.00498 secs
2018-09-17 10:47:57 [Thread 4] INFO doParallel fun Current allocated Memory: 42.01 MB
cs
2018-09-17 10:47:57 [Thread 1] INFO doParallel fun Overall time taken: 39.07554 secs
2018-09-17 10:47:57 [Thread 4] INFO doParallel fun Start Data Loading
2018-09-17 10:47:57 [Thread 2] INFO doParallel fun Data Postprocessing Done: 2.063492 secs
2018-09-17 10:47:57 [Thread 1] INFO doParallel fun Current allocated Memory: 42.01 MB
2018-09-17 10:47:57 [Thread 1] INFO doParallel fun Start Data Loading
39.10681 secs
2018-09-17 10:47:57 [Thread 3] INFO doParallel fun Data Postprocessing Done: 2.049974 secs
2018-09-17 10:47:57 [Thread 2] INFO doParallel fun Current allocated Memory: 42.01 MB
2018-09-17 10:49:15 [Thread 3] INFO doParallel fun Start: 7500 End: 7600
g
2018-09-17 10:49:15 [Thread 4] INFO doParallel fun Start: 7600 End: 7700
As you can see, some lines are broken in the wrong place or cut into several pieces. If I delete these problematic lines, launchLogViewer works fine.
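A minimal sketch for locating such lines before opening the viewer. It assumes the file written with layoutParallel is tab-separated with six fields per record, which is what the error message suggests. find_malformed_lines is a hypothetical helper name, and the demo log below is synthetic, not real ParallelLogger output:

```r
# Hypothetical helper: report lines that do not split into the expected
# number of tab-separated fields (assumed to be 6 for layoutParallel).
find_malformed_lines <- function(path, expected_fields = 6L) {
  lines <- readLines(path, warn = FALSE)
  n_fields <- lengths(strsplit(lines, "\t", fixed = TRUE))
  bad <- which(n_fields != expected_fields)
  data.frame(
    line = bad,
    fields = n_fields[bad],
    text = lines[bad],
    stringsAsFactors = FALSE
  )
}

# Self-contained demo on a small synthetic log with one broken fragment.
demo_log <- tempfile(fileext = ".log")
writeLines(c(
  "2018-09-17 10:47:57\t[Thread 4]\tINFO\tdoParallel\tfun\tStart Data Loading",
  "cs",
  "2018-09-17 10:47:57\t[Thread 1]\tINFO\tdoParallel\tfun\tStart Data Loading"
), demo_log)

malformed <- find_malformed_lines(demo_log)
print(malformed)
```

Note that a record whose message field is empty or itself contains a tab would also be flagged, so the result is a list of candidates to inspect rather than a definitive list of corrupted lines.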
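So how can I log from a parallel R script with multiple threads to both a file and the console? Or how can I make ParallelLogger log to file and console without corrupting the output?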
Edit:
Running the MWE on a Linux system produces a correctly formatted log file, so this seems to be a Windows-specific problem.
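Answer 0 (score: 0)
The "logs are not shown in the console" part happens because the individual threads have no stdout channel, so this is not a bug. I discussed the log file format problem with the developer, and it turns out that ParallelLogger has no transaction management, so there is nothing you can do about it within the package.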
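One way to sidestep interleaved writes altogether, sketched here under assumptions, is to let every worker append to its own file and merge the files on the master afterwards. This sketch uses only the base parallel package rather than foreach, doParallel or ParallelLogger. log_line and log_dir are hypothetical names, and the three-field line format is made up for illustration, so launchLogViewer is not expected to read the merged file:

```r
library(parallel)

# Fresh, hypothetical directory that holds one log file per worker process.
log_dir <- tempfile("worker_logs")
dir.create(log_dir)

# Hypothetical helper: append one timestamped line to a file that only the
# calling process writes to, so lines from different workers cannot interleave.
log_line <- function(log_dir, ...) {
  path <- file.path(log_dir, sprintf("worker_%d.log", Sys.getpid()))
  stamp <- format(Sys.time(), "%Y-%m-%d %H:%M:%OS3")
  cat(stamp, "\t", Sys.getpid(), "\t", paste0(...), "\n",
    file = path, append = TRUE, sep = ""
  )
}

cluster <- makeCluster(2)
# The helper is passed as an argument so the workers receive their own copy.
results <- parLapply(cluster, 1:4, function(i, log_dir, log_line) {
  log_line(log_dir, "Start block ", i)
  Sys.sleep(0.1)
  log_line(log_dir, "Block ", i, " done")
  i
}, log_dir = log_dir, log_line = log_line)
stopCluster(cluster)

# Merge on the master once all workers are finished. Every line starts with a
# fixed-width timestamp, so a plain byte-wise sort is chronological.
files <- list.files(log_dir, pattern = "^worker_.*\\.log$", full.names = TRUE)
merged <- sort(unlist(lapply(files, readLines)), method = "radix")
writeLines(merged, file.path(log_dir, "merged.log"))
```

The trade-off is that the combined log only exists after the run has finished; while the job is running, each worker file has to be inspected on its own.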