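As the title says, the docker process stalls and then gets killed. My Python project runs a bash script, one part of which runs an R script that pulls data from InfluxDB and then processes it. When the project fetches data for a short period, e.g. 1-5 days, there is no problem. The trouble starts with longer time frames, such as a few weeks. It slows down so much that generating anything takes a very long time (I checked the logs), and eventually it gets killed. When the R script pulls about 25 MB of data it is fine, but 70 MB is not so easy. Can Flask + bash + R together use too much memory? This problem does not occur when the same thing is run outside docker.
Dockerfile: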
FROM ubuntu
# Install system packages required by the Flask app and the R report
RUN apt-get clean && apt-get update && DEBIAN_FRONTEND=noninteractive apt-get upgrade -y && DEBIAN_FRONTEND=noninteractive apt-get install -y \
    python3 \
    python3-pip \
    r-base \
    r-base-dev \
    r-cran-rgl \
    mutt \
    git \
    texlive-fonts-recommended
# Install the Python requirements for the Flask app
RUN pip3 install -r ./requirements.txt
Flask app snippet:
import json
import os
import subprocess

from flask import request

# `app` and `convert_date` are defined elsewhere in the project.
@app.route('/send', methods=['POST'])
def send():
    path = os.path.dirname(os.path.realpath(__file__))
    script = path + '/generate_pdf.sh'
    address = str(request.form['email'])
    start_date = convert_date(str(request.form['start_date']))
    end_date = convert_date(str(request.form['end_date']))
    command = [script, start_date, end_date, address]
    # Blocks until the script (including the R render) has finished
    subprocess.run(command)
    return json.dumps({
        'status': 'OK',
        'message': 'The action is completed'
    })
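To see whether the script was terminated by a signal and roughly how much memory its children peaked at, the `subprocess.run` call could be wrapped along these lines. This is only a minimal sketch using the standard library; `run_and_measure` is a hypothetical helper name, not something from the app. On Linux, `ru_maxrss` is reported in kilobytes and reflects the largest waited-for descendant, not a sum over all of them.

```python
import resource
import signal
import subprocess
import sys


def run_and_measure(command):
    """Run a command; return (returncode, signal name or None, peak child RSS).

    Hypothetical helper for illustration. On Linux, ru_maxrss is in kilobytes
    and covers the largest child (or waited-for descendant) seen so far.
    """
    completed = subprocess.run(command)
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    killed_by = None
    if completed.returncode < 0:
        # A negative return code means the child was terminated by that signal.
        killed_by = signal.Signals(-completed.returncode).name
    return completed.returncode, killed_by, peak_rss


if __name__ == "__main__":
    # Demo with a trivial child process; the app would pass its own command list.
    print(run_and_measure([sys.executable, "-c", "pass"]))
```

Note that in this setup R is a child of bash, not of Python, so Python only sees whatever status the script itself returns. The script would have to check R's exit status after the render step; bash reports a process killed by SIGKILL as status 137 (128 + 9).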
generate_pdf.sh:
#!/bin/bash
start_date="$1"
end_date="$2"
address="$3"
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
report_name="\"$DIR/my_document.pdf\""
R -e "rmarkdown::render('$DIR/generate_document.Rmd', output_file = $report_name)" --args "$start_date" "$end_date"
report_name="$DIR/my_document.pdf"
echo | mutt -s "Generated document" -a "$report_name" -- "$address"
rm -- "$report_name"
R script snippet:
where.clause <- paste0("time >= '",
                       start.date,
                       "' AND time <= '",
                       as.character(as.Date(end.date) + days(1)),
                       "'")
con <- influxdbr::influx_connection(host = "localhost",
                                    port = 8086,
                                    user = "root",
                                    pass = "root")
select.query <- paste0(
  'id, name, surname, car, employment_status'
)
rows <- influx_select(con, db = 'my_db', select.query, from = 'workers',
                      where = where.clause)
rows <- as.data.frame(rows, stringsAsFactors = FALSE)
if (is.data.frame(rows) && nrow(rows) == 0) {
  cat('No data could be obtained from the database.', sep = '\n')
  knitr::knit_exit()
}
Below is the log I get when running the application with roughly 74 MB of data to be pulled.
....
label: unnamed-chunk-4 (with options)
List of 3
$ echo : logi FALSE
$ message: logi FALSE
$ warning: logi FALSE
Success: (204) No Content
/app/generate_pdf.sh: line 8: 58 Killed
....
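The application runs perfectly outside docker.
When the rows <- influx_select call runs, the data is fetched in raw form. Even before it is converted to a data frame it is already large: 24 MB, 70 MB or even more.
I ran the script manually inside docker, and the R script got further: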
....
label: unnamed-chunk-8 (with options)
List of 4
$ echo : logi FALSE
$ message : logi FALSE
$ fig.align : chr "left"
$ fig.height: num 7
Quitting from lines 72-76 (generate_document.Rmd)
Error in system(paste(which, shQuote(names[i])), intern = TRUE, ignore.stderr = TRUE) :
cannot popen '/usr/bin/which 'pdfcrop' 2>/dev/null', probable reason 'Cannot allocate memory'
...
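Both failures ("Killed" and "Cannot allocate memory") look like the container running into a memory ceiling, although nothing above confirms that. One way to check from inside the container is to read the cgroup memory limit. The following is a minimal sketch, assuming a Linux container with the cgroup files mounted at their usual paths; the function names are hypothetical and chosen only for this example.

```python
from pathlib import Path

# Usual cgroup locations on Linux; which one exists depends on the host setup.
CGROUP_V2_LIMIT = Path("/sys/fs/cgroup/memory.max")
CGROUP_V1_LIMIT = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")


def parse_limit(raw):
    """Convert the file content to bytes; cgroup v2 writes 'max' when unlimited."""
    raw = raw.strip()
    return None if raw == "max" else int(raw)


def container_memory_limit(paths=(CGROUP_V2_LIMIT, CGROUP_V1_LIMIT)):
    """Return the memory limit in bytes, or None if unlimited or not found."""
    for path in paths:
        try:
            return parse_limit(path.read_text())
        except (OSError, ValueError):
            # File missing or unreadable on this system; try the next candidate.
            continue
    return None


if __name__ == "__main__":
    limit = container_memory_limit()
    if limit is None:
        print("no cgroup memory limit found")
    else:
        print(f"{limit / 2**20:.0f} MiB")
```

Under cgroup v1 an unlimited container reports a very large number rather than `max`, so a huge value there also means "no limit". If the reported limit is small compared with what R needs for the larger date ranges, that would fit the logs above.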