Using Selenium from Python with headless Chrome in a Docker container

Time: 2018-07-25 09:12:21

Tags: python selenium docker jupyter-notebook jupyter

I am building a Dockerfile on top of the official Jupyter scipy-notebook image (documentation here, Dockerfile here).

FROM jupyter/scipy-notebook

USER root

# bash instead of dash to use source
RUN ln -snf /bin/bash /bin/sh

RUN sudo apt-get update
RUN wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
RUN dpkg -i google-chrome-stable_current_amd64.deb; apt-get -fy install

USER jovyan

RUN pip install --upgrade pip \
 && pip install gspread \
 && pip install isort \
 && pip install jupyter_contrib_nbextensions \
 && pip install nbdime \
 && pip install pathlib \
 && pip install selenium \
 && nbdime extensions --enable

RUN jupyter contrib nbextension install --user

RUN jupyter nbextension enable autosavetime/main \
 && jupyter nbextension enable codefolding/edit \
 && jupyter nbextension enable code_prettify/isort \
 && jupyter nbextension enable scratchpad/main \
 && jupyter nbextension enable splitcell/splitcell \
 && jupyter nbextension enable table_beautifier/main \
 && jupyter nbextension enable code_prettify/2to3 \
 && jupyter nbextension enable init_cell/main \
 && jupyter nbextension enable nbextensions_configurator/tree_tab/main \
 && jupyter nbextension enable spellchecker/main \
 && jupyter nbextension enable toc2/main \
 && jupyter nbextension enable toggle_all_line_numbers/main \
 && jupyter nbextension enable varInspector/main

I run this container with

docker run -v my_dir:/home/jovyan/work -p 8888:8888 -a stdin -a stdout -i -t my_image /bin/bash

The directory I mount contains the chromedriver executable.

When I open a Jupyter notebook and run the following code

import datetime
import os

import pandas as pd
import requests
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys

chrome_path = '/home/jovyan/work/data_analysis/notebooks/sandbox/miguel/tests/chromedriver'
chrome_options = Options()  
chrome_options.add_argument("--headless")  
# chrome_options.binary_location = '/Applications/Google Chrome   Canary.app/Contents/MacOS/Google Chrome Canary'  
driver = webdriver.Chrome(executable_path=chrome_path, chrome_options=chrome_options)

I get the error

OSError: [Errno 8] Exec format error: '/home/jovyan/work/data_analysis/notebooks/sandbox/miguel/tests/chromedriver'

These details may help track down the error:

  • (from the Jupyter notebook) !pwd returns /home/jovyan/work/data_analysis/notebooks/sandbox/miguel/tests
  • (from the Jupyter notebook) !ls returns chromedriver and other files
  • (from the Jupyter notebook) !google-chrome --version returns Google Chrome 68.0.3440.75
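
One more check that may narrow this down: Errno 8 (Exec format error) is what the kernel reports when it cannot execute a file's binary format, so it can be worth looking at the first bytes of the chromedriver file. The sketch below is only an illustration, not part of the original post; the function names and labels are hypothetical, and it covers just a few well-known magic numbers.

```python
from pathlib import Path

# Leading "magic" bytes of a few common executable formats.
# Note: 0xCAFEBABE is also used by Java class files.
MAGIC_PREFIXES = [
    (b"\x7fELF", "ELF (Linux)"),
    (b"\xcf\xfa\xed\xfe", "Mach-O 64-bit (macOS)"),
    (b"\xce\xfa\xed\xfe", "Mach-O 32-bit (macOS)"),
    (b"\xca\xfe\xba\xbe", "Mach-O universal (macOS)"),
    (b"MZ", "PE (Windows)"),
]


def classify_header(header: bytes) -> str:
    """Return a label for the executable format implied by the leading bytes."""
    for prefix, label in MAGIC_PREFIXES:
        if header.startswith(prefix):
            return label
    return "unknown"


def classify_file(path: str) -> str:
    """Read the first four bytes of `path` and classify them."""
    with Path(path).open("rb") as handle:
        return classify_header(handle.read(4))
```

Calling classify_file(chrome_path) from the notebook should report an ELF binary if the mounted chromedriver was built for Linux. A Mach-O or PE result would suggest that the file was downloaded for the host operating system rather than for the container. An ELF file built for a different CPU architecture could produce the same error, and this sketch does not detect that case.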

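I have searched for this error but could not find an answer. Also, if there is a simpler or better way to do this (using Selenium with Chrome from a Docker container), I am happy to take a different approach.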

1 answer:

Answer 0 (score: 0)

Using jupyter/scipy-notebook as the base image might work, but I used debian:stable.

I created the project file with $ touch test.py and used Selenium to instantiate a Chrome browser instance (and to make a test request that verifies Selenium is working):

from selenium import webdriver

options = webdriver.chrome.options.Options()
options.add_argument("--no-sandbox")
options.add_argument("--disable-setuid-sandbox")
options.add_argument("--disable-extensions")
# "options" replaces the deprecated "chrome_options" keyword in newer Selenium releases
driver = webdriver.Chrome(options=options)

driver.get("https://httpstat.us/200")

if "200 OK" in driver.page_source:
    print('Selenium successfully opened with Chrome (under the Xvfb display) and navigated to "https://httpstat.us/200", you\'re all set!')
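
Since the question asks about headless Chrome, here is a hedged variant of the script above. It keeps the same three flags and optionally adds --headless, in which case Chrome renders off-screen and should not need an X display at all. The helper names build_chrome_args and make_driver are hypothetical, and chromedriver is assumed to be on PATH, as it is in the image built below.

```python
def build_chrome_args(headless: bool = False) -> list:
    """Return Chrome command-line flags for running inside a container."""
    args = [
        "--no-sandbox",
        "--disable-setuid-sandbox",
        "--disable-extensions",
    ]
    if headless:
        # With --headless Chrome renders off-screen, so no X display is needed.
        args.append("--headless")
    return args


def make_driver(headless: bool = False):
    """Create a Chrome WebDriver; chromedriver is assumed to be on PATH."""
    # Imported here so build_chrome_args() works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in build_chrome_args(headless):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

With make_driver(headless=True) the Xvfb step described next becomes optional. With make_driver() the behaviour should match the script above and still relies on the Xvfb display.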

Next, I created the invoking shell script with $ touch run.sh. I want to use Xvfb to create an X Window server for the Chrome browser instance in this container:

# Below is the reason for "-nolisten tcp" (this is not documented within Xvfb manpages)
# https://superuser.com/questions/855019/make-xvfb-listen-only-on-local-ip
Xvfb :99 -screen 0 640x480x8 -nolisten tcp &
python3 test.py
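
The script starts Xvfb in the background and launches Python immediately, so the browser could in principle start before the display is ready. A minimal sketch of a guard that test.py could call first is shown below. It assumes the conventional X socket location /tmp/.X11-unix/X<display number>; the function names and the timeout values are hypothetical.

```python
import os
import time


def display_socket_path(display: str, socket_dir: str = "/tmp/.X11-unix") -> str:
    """Map an X display such as ":99" to its Unix socket path."""
    # ":99" -> "99"; a value like ":99.0" is reduced to the display number.
    number = display.rsplit(":", 1)[-1].split(".", 1)[0]
    return f"{socket_dir}/X{number}"


def wait_for_display(display: str, timeout: float = 5.0, interval: float = 0.1,
                     socket_dir: str = "/tmp/.X11-unix") -> bool:
    """Poll until the display's socket exists; return False on timeout."""
    path = display_socket_path(display, socket_dir)
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For the command above, wait_for_display(":99") would return True once Xvfb has created its socket, and False if nothing appears within five seconds.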

Now I will create the following Dockerfile.

First, I install Chrome (via Debian's chromium package), Xvfb, and Python:

FROM debian:stable 
LABEL maintainer "Sean Pianka"

RUN apt-get update -y && apt-get install -y wget curl unzip libgconf-2-4
RUN apt-get update -y && apt-get install -y chromium xvfb python3 python3-pip 
RUN wget -O /tmp/chromedriver.zip http://chromedriver.storage.googleapis.com/`curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE`/chromedriver_linux64.zip
RUN unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
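
The wget line nests a curl call in backticks, which makes the resulting URL hard to read. The sketch below only spells out the URL that the line builds; the function names are hypothetical, "2.41" is an example value rather than what the endpoint returns today, and any platform other than linux64 is an assumption.

```python
BASE_URL = "http://chromedriver.storage.googleapis.com"


def latest_release_url() -> str:
    """URL whose response body is the latest chromedriver version string."""
    return f"{BASE_URL}/LATEST_RELEASE"


def chromedriver_zip_url(version: str, platform: str = "linux64") -> str:
    """Build the download URL for a given chromedriver version."""
    # The version may arrive with surrounding whitespace or a trailing newline.
    version = version.strip()
    return f"{BASE_URL}/{version}/chromedriver_{platform}.zip"
```

For example, chromedriver_zip_url("2.41") yields http://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip. Pinning an explicit version this way, instead of always taking LATEST_RELEASE, may help keep chromedriver compatible with the chromium package that Debian installs.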

Next, I create the directory for my project, install Selenium (and/or my project's dependencies), and copy my project code into the image.

RUN mkdir -p /opt/app
WORKDIR /opt/app
RUN pip3 install selenium
## or install from requirements.txt: comment out the line above and uncomment the two below
#COPY requirements.txt .
#RUN pip3 install -r requirements.txt
COPY test.py .

Finally, I set DISPLAY to the display port opened by Xvfb, for use by the X Window server that Xvfb creates, copy the run.sh script into the image, and invoke it with /bin/bash.

# Set display port and dbus env to avoid hanging
ENV DISPLAY=:99
ENV DBUS_SESSION_BUS_ADDRESS=/dev/null

# Bash script to invoke xvfb, any preliminary commands, then invoke project
COPY run.sh .
CMD /bin/bash run.sh

This completes the Dockerfile. Now, if you create a container from its image, you will see the following output:

Selenium successfully opened with Chrome (under the Xvfb display) and navigated to "https://httpstat.us/200", you're all set!

If you want pre-written Dockerfiles for any combination of Python 2 / Python 3 and Chrome / Firefox, see my repository on GitHub, which contains these different versions.