I am running an R script with Airflow (in a Docker container) and I get the following error.
[2019-09-19 07:03:26,500] {{bash_operator.py:127}} INFO - *** caught segfault ***
[2019-09-19 07:03:26,500] {{bash_operator.py:127}} INFO - address 0x55cf00000000, cause 'memory not mapped'
[2019-09-19 07:03:26,501] {{bash_operator.py:127}} INFO -
[2019-09-19 07:03:26,501] {{bash_operator.py:127}} INFO - Traceback:
[2019-09-19 07:03:26,501] {{bash_operator.py:127}} INFO - 1: is.data.frame(x)
[2019-09-19 07:03:26,502] {{bash_operator.py:127}} INFO - 2: FUN(X[[i]], ...)
[2019-09-19 07:03:26,502] {{bash_operator.py:127}} INFO - 3: lapply(.x, .f, ...)
[2019-09-19 07:03:26,502] {{bash_operator.py:127}} INFO - 4: map(result, subset_rows, i)
[2019-09-19 07:03:26,502] {{bash_operator.py:127}} INFO - 5: `[.tbl_df`(x, ind, , drop = FALSE)
[2019-09-19 07:03:26,503] {{bash_operator.py:127}} INFO - 6: x[ind, , drop = FALSE]
[2019-09-19 07:03:26,503] {{bash_operator.py:127}} INFO - 7: FUN(X[[i]], ...)
[2019-09-19 07:03:26,504] {{bash_operator.py:127}} INFO - 8: lapply(split(x = seq_len(nrow(x)), f = f, drop = drop, ...), function(ind) x[ind, , drop = FALSE])
[2019-09-19 07:03:26,505] {{bash_operator.py:127}} INFO - 9: split.data.frame(es6, (0:(nrow(es6) - 1)%/%50))
[2019-09-19 07:03:26,505] {{bash_operator.py:127}} INFO - 10: split(es6, (0:(nrow(es6) - 1)%/%50))
[2019-09-19 07:03:26,506] {{bash_operator.py:127}} INFO - An irrecoverable exception occurred. R is aborting now ...
[2019-09-19 07:03:27,087] {{bash_operator.py:127}} INFO - /tmp/airflowtmpuj8lcw3e/web_etl_bf_10_days7wpo7bvb: line 1: 1140 Segmentation fault (core dumped) Rscript /usr/local/airflow/dags/scripts/r/etl_web_api_by_create_time.R -d "2019-09-05 00:00:00+00:00"
[2019-09-19 07:03:27,088] {{bash_operator.py:131}} INFO - Command exited with return code 139
The call that fails is split(es6, (0:(nrow(es6) - 1)%/%50)). The data frame es6 has about 1096 rows and 20 columns.
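For readers unfamiliar with the idiom, here is a minimal sketch of what that expression is meant to do: cut the data frame into consecutive chunks of 50 rows. The data frame es6_demo below is a hypothetical stand-in with the same shape as es6 (it is not the real data), so it only illustrates the chunking and does not reproduce the crash.

```r
# Hypothetical stand-in for es6: 1096 rows, 20 numeric columns (not the real data).
n_rows <- 1096
es6_demo <- as.data.frame(matrix(0, nrow = n_rows, ncol = 20))

# Integer-divide the zero-based row positions by 50:
# rows 1-50 get key 0, rows 51-100 get key 1, and so on.
# ":" binds tighter than "%/%" in R, so the extra parentheses only aid readability.
chunk_id <- (0:(nrow(es6_demo) - 1)) %/% 50

# split() dispatches to split.data.frame and returns a named list of data frames.
chunks <- split(es6_demo, chunk_id)

length(chunks)                    # number of chunks
vapply(chunks, nrow, integer(1))  # rows per chunk
```

With 1096 rows this gives 22 chunks, the first 21 with 50 rows each and the last with 46.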
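I cannot reproduce the error reliably: when run through Airflow, the task sometimes succeeds and sometimes fails. (When I run the same code in RStudio Server, it works.)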
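I suspect the server may be running out of memory. My Linux server has 8 GB of RAM in total. When I checked memory while the task was running, about 1700 MB was free (according to free -m).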
Searching the internet, I found suggestions that this kind of error can be caused by a bug in the function itself, in this case split.
Edit:
After changing the call to split(as.data.frame(es6), (0:(nrow(es6)-1) %/% 50)), the new log is:
[2019-09-19 09:17:22,652] {{bash_operator.py:127}} INFO - *** caught segfault ***
[2019-09-19 09:17:22,652] {{bash_operator.py:127}} INFO - address 0x55f600000000, cause 'memory not mapped'
[2019-09-19 09:17:22,652] {{bash_operator.py:127}} INFO -
[2019-09-19 09:17:22,652] {{bash_operator.py:127}} INFO - Traceback:
[2019-09-19 09:17:22,652] {{bash_operator.py:127}} INFO - 1: dim(xj)
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO - 2: `[.data.frame`(x, ind, , drop = FALSE)
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO - 3: x[ind, , drop = FALSE]
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO - 4: FUN(X[[i]], ...)
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO - 5: lapply(split(x = seq_len(nrow(x)), f = f, drop = drop, ...), function(ind) x[ind, , drop = FALSE])
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO - 6: split.data.frame(as.data.frame(es6), (0:(nrow(es6) - 1)%/%50))
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO - 7: split(as.data.frame(es6), (0:(nrow(es6) - 1)%/%50))
[2019-09-19 09:17:22,653] {{bash_operator.py:127}} INFO - An irrecoverable exception occurred. R is aborting now ...
[2019-09-19 09:17:23,179] {{bash_operator.py:127}} INFO - /tmp/airflowtmp9jy2rurg/web_etl_bf_7_days4x9d7xqz: line 1: 1220 Segmentation fault (core dumped) Rscript /usr/local/airflow/dags/scripts/r/etl_web_api_by_create_time.R -d "2019-09-10 00:00:00+00:00"
[2019-09-19 09:17:23,179] {{bash_operator.py:131}} INFO - Command exited with return code 139