dryscrape in an Alpine container

Date: 2018-03-07 12:20:25

Tags: python-2.7 docker dockerfile alpine

Today I am writing a Python script that uses dryscrape inside an Alpine container.

Here is my Dockerfile:

FROM alpine:3.7

RUN apk add --update bash &&\
    apk update &&\
    apk upgrade

RUN apk add --no-cache python-dev &&\
    apk add --no-cache python

RUN apk add --no-cache py-pip &&\
    apk add --no-cache linux-headers &&\
    apk add --no-cache texinfo &&\
    apk add --no-cache gcc &&\
    apk add --no-cache g++ &&\
    apk add --no-cache gfortran &&\
    apk add --no-cache libxml2-dev &&\
    apk add --no-cache xmlsec-dev &&\
    apk add --no-cache py-requests &&\
    apk add --no-cache make &&\
    apk add --no-cache qt-dev

RUN pip install beautifulsoup4 &&\
    pip install requests &&\
    pip install lxml &&\
    pip install html5lib &&\
    pip install urllib3 &&\
    pip install dryscrape

RUN apk add --no-cache icu-libs &&\
    apk add --no-cache git &&\
    git clone "https://github.com/niklasb/webkit-server.git" &&\
    cd webkit-server &&\
    python setup.py install


# prepare the shell
CMD ["bash"]
WORKDIR "/root"

I must have forgotten something, because when I call dryscrape.Session() I get this error:
File "/usr/lib/python2.7/site-packages/webkit_server.py", line 427, in __init__
raise WebkitServerError("webkit-server failed to start. Output:\n" + err)
webkit_server.WebkitServerError: webkit-server failed to start. Output:
webkit_server: cannot connect to X server

Do you know why I am getting this error? Thanks, everyone.

1 Answer:

Answer 0 (score: 0)

You need an X server running inside the container in order to use dryscrape.Session().

The best way to achieve this on Alpine is to install and run xvfb (short for X virtual framebuffer).
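
To confirm that a missing X server is really the cause, the environment can be inspected before calling dryscrape.Session(). The sketch below is not part of dryscrape or webkit-server: `x_socket_path` and `diagnose_display` are hypothetical helper names, and it assumes a local X server exposes the conventional Unix socket /tmp/.X11-unix/X<n>.

```python
import os


def x_socket_path(display):
    """Return the conventional Unix socket path for a local X display.

    Accepts values such as ":0", ":00" or ":99.0". Returns None for
    displays that name a remote host (e.g. "host:0").
    """
    host, sep, rest = display.partition(":")
    if not sep or host not in ("", "unix"):
        return None
    number = rest.split(".")[0]  # drop an optional ".screen" suffix
    if not number.isdigit():
        return None
    return "/tmp/.X11-unix/X%d" % int(number)


def diagnose_display(environ=None):
    """Describe whether an X client could plausibly reach an X server."""
    environ = os.environ if environ is None else environ
    display = environ.get("DISPLAY")
    if not display:
        return "DISPLAY is not set: no X server has been configured"
    path = x_socket_path(display)
    # Assumption: a running local X server has created this socket file.
    if path is not None and not os.path.exists(path):
        return "DISPLAY=%s but %s is missing: no X server is running" % (
            display, path)
    return "DISPLAY=%s looks usable" % display


if __name__ == "__main__":
    print(diagnose_display())
```

Run in the image from the question, this should report that DISPLAY is not set, which matches the "cannot connect to X server" message.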

An example setup:

1. The end of the Dockerfile becomes:

RUN apk add --no-cache icu-libs &&\
    apk add --no-cache git &&\
    git clone "https://github.com/niklasb/webkit-server.git" &&\
    cd webkit-server &&\
    python setup.py install

RUN apk add --no-cache xvfb
ADD start_script.sh /root/start_script.sh

# prepare the shell
CMD ["/root/start_script.sh"]
WORKDIR "/root"

2. start_script.sh is:

#!/bin/sh
Xvfb :00 &
export DISPLAY=:00
bash

Now you can go ahead and call dryscrape.Session():

$ docker run -ti <image_id>
bash-4.4# ps aux
PID   USER     TIME   COMMAND
    1 root       0:00 {start_script.sh} /bin/sh /root/start_script.sh
    7 root       0:00 Xvfb :00
    8 root       0:00 bash
   11 root       0:00 ps aux
bash-4.4# python
Python 2.7.14 (default, Dec 14 2017, 15:51:29) 
[GCC 6.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dryscrape
>>> dryscrape.Session()
<dryscrape.session.Session object at 0x7f1601ea4ad0>
>>> 
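
As an alternative to the wrapper script, the same two steps (start Xvfb, then set DISPLAY) can be performed from the Python script itself. This is a minimal sketch, assuming the xvfb package is installed in the image; `xvfb_socket` and `start_xvfb` are hypothetical names, and polling for a socket under /tmp/.X11-unix is an assumption about how to tell that Xvfb is ready.

```python
import os
import subprocess
import time


def xvfb_socket(display):
    """Conventional socket path for a display given as ":N" (e.g. ":00")."""
    return "/tmp/.X11-unix/X%d" % int(display.lstrip(":"))


def start_xvfb(display=":0", timeout=5.0):
    """Start Xvfb on `display`, wait until it is ready, and set DISPLAY.

    Returns the subprocess.Popen handle so the caller can terminate it.
    Raises RuntimeError if Xvfb exits early or is not ready in time.
    """
    proc = subprocess.Popen(["Xvfb", display])
    socket_path = xvfb_socket(display)
    deadline = time.time() + timeout
    while time.time() < deadline:
        if proc.poll() is not None:
            raise RuntimeError(
                "Xvfb exited early with code %d" % proc.returncode)
        if os.path.exists(socket_path):
            # Child processes such as webkit_server inherit this variable.
            os.environ["DISPLAY"] = display
            return proc
        time.sleep(0.1)
    proc.terminate()
    raise RuntimeError(
        "Xvfb did not create %s within %.1f s" % (socket_path, timeout))
```

Call `start_xvfb()` before `dryscrape.Session()` and call `terminate()` on the returned process once scraping is finished. Unlike the shell script, this waits for Xvfb to come up before continuing, which matters when the script runs non-interactively.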