Splinter headless browser fails to start inside an Airflow DAG

Time: 2019-07-15 14:45:16

Tags: python selenium airflow splinter

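I am using Splinter to set up web tests that run automatically once a week through Airflow on a remote Ubuntu machine. The tests work when run from a Python shell on the Ubuntu machine (that is, the tests themselves do not fail), but when run inside an Airflow PythonOperator they fail at the point where the Splinter browser is started.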

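I have stripped out everything except the browser startup, and the same error still occurs. I found some examples of using Selenium with Xvfb, but when I used the same code to start and stop a virtual display, the error message did not change. I tried to rule out a concurrency problem by setting a concurrency limit on the DAG, but the error persists. I also checked my geckodriver and Firefox versions and they look fine (and they work when not run under Airflow).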
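One way to double-check the binaries from inside the task itself, rather than from an interactive shell, is a small helper along these lines. This is only a sketch: the name `tool_version` is made up here, and it assumes the binary accepts a `--version` flag, which both Firefox and geckodriver do.

```python
import shutil
import subprocess


def tool_version(name):
    """Return (path, first line of `<name> --version`), or (None, None) if not on PATH."""
    path = shutil.which(name)
    if path is None:
        return None, None
    try:
        out = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,  # decode output as text (works on Python 3.6)
            timeout=30,
        ).stdout
    except (OSError, subprocess.TimeoutExpired) as exc:
        return path, "failed to run: {}".format(exc)
    lines = out.strip().splitlines()
    return path, (lines[0] if lines else "")
```

Printing `tool_version("firefox")` and `tool_version("geckodriver")` as the first statements of the task callable puts the paths and versions that Airflow actually resolves into the task log, where they can be compared with the shell's.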

Here is the DAG:

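from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from splinter import Browser


def browser_test(**context):
    # Start a headless Firefox, load one page, then shut the browser down.
    b = Browser("firefox", headless=True)
    b.visit("http://www.google.com")
    b.quit()


with DAG(
    dag_id="web_test",
    start_date=datetime(2019, 1, 1),
    schedule_interval=None,
    concurrency=1,
) as dag:

    PythonOperator(
        task_id="run_pull",
        python_callable=browser_test,
        provide_context=True,
    )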

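The error output is very long, so I have trimmed it down to just the command and the original exception (the full output is linked in the original post):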

joe@Ubuntu-VM1:~/airflow$ sudo airflow test web_test run_pull "2019-01-01"

Original exception was:
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 32, in <module>
    args.func(args)
  File "/usr/local/lib/python3.6/dist-packages/airflow/utils/cli.py", line 74, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/bin/cli.py", line 660, in test
    ti.run(ignore_task_deps=True, ignore_ti_state=True, test_mode=True)
  File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 73, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/models/__init__.py", line 1542, in run
    session=session)
  File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 69, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/models/__init__.py", line 1441, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.6/dist-packages/airflow/operators/python_operator.py", line 112, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.6/dist-packages/airflow/operators/python_operator.py", line 117, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/home/joe/ftp/files/Documents/Projects/Airflow/dags/dev_airflow.py", line 233, in browser_test
    b = Browser("firefox", headless = True)
  File "/usr/local/lib/python3.6/dist-packages/splinter/browser.py", line 63, in Browser
    return driver(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/splinter/driver/webdriver/firefox.py", line 65, in __init__
    timeout=timeout, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/firefox/webdriver.py", line 170, in __init__
    keep_alive=True)
  File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/remote/webdriver.py", line 156, in __init__
    self.start_session(capabilities, browser_profile)
  File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/remote/webdriver.py", line 245, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/remote/webdriver.py", line 314, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: invalid argument: can't kill an exited process

When I just run python3, I can import the DAG file and run the function browser_test without any errors.

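I expect the DAG to run without raising any errors, and if I print b.title I expect it to be "Google". Honestly, I would expect anything run inside an Airflow DAG to behave the same way as it does in a Python shell.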
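An Airflow task does not necessarily run as the same user, with the same environment variables, or in the same working directory as an interactive shell, and the command shown above additionally goes through `sudo`. A quick way to see what differs is to print a small snapshot in both places and compare. The following is a sketch only; `env_snapshot` and the chosen keys are illustrative and not part of Airflow or Splinter.

```python
import os
import shutil


def env_snapshot(binaries=("firefox", "geckodriver")):
    """Collect process details that often differ between a shell and a scheduled task."""
    snapshot = {
        # Effective user id (0 means root); os.geteuid is missing on Windows.
        "euid": getattr(os, "geteuid", lambda: None)(),
        "cwd": os.getcwd(),
        "HOME": os.environ.get("HOME"),
        "DISPLAY": os.environ.get("DISPLAY"),
        "PATH": os.environ.get("PATH"),
    }
    for name in binaries:
        # Which executable, if any, this process would launch for the given name.
        snapshot["which " + name] = shutil.which(name)
    return snapshot


for key, value in sorted(env_snapshot().items()):
    print("{}: {}".format(key, value))
```

Running this once in the Python shell and once at the top of `browser_test` under Airflow shows whether the effective user, `HOME`, `PATH`, or the resolved binaries differ between the two runs.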

1 answer:

Answer 0 (score: 0)

Can you run the script with a BashOperator instead?

Here is an example using BashOperator.
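from airflow.operators.bash_operator import BashOperator

t1 = BashOperator(
    task_id='web_test',
    # Placeholder path: point this at a standalone script that runs the browser test.
    bash_command='python /path/to/web_test.py',
    dag=dag)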

Does Splinter have to run directly inside Airflow? This does not explain why you are hitting the problem, but it may meet your requirements.
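If the BashOperator route is taken, the browser test has to live in a script that can run on its own. A minimal sketch, assuming Splinter is installed for whichever interpreter `python` resolves to; the function name `run_browser_test` and the `browser_factory` parameter are made up here so that the logic can be exercised without a real browser:

```python
def run_browser_test(url="http://www.google.com", browser_factory=None):
    """Visit url in a headless Firefox and return the page title."""
    if browser_factory is None:
        # Imported lazily so the module can be loaded where Splinter is absent.
        from splinter import Browser

        def browser_factory():
            return Browser("firefox", headless=True)

    browser = browser_factory()
    try:
        browser.visit(url)
        return browser.title
    finally:
        # Always shut the browser down, even if the visit fails.
        browser.quit()
```

Ending the script with `if __name__ == "__main__": print(run_browser_test())` lets the `bash_command` invoke it directly. This only moves the browser launch into a child process; whether it avoids the original error is not confirmed here.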