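I'm setting up web tests with Splinter to run automatically once a week via Airflow on a remote Ubuntu machine. The web tests run fine from a Python shell on the Ubuntu machine (no test failures), but when run from an Airflow PythonOperator they fail at the point where the Splinter browser is launched.
I've stripped out everything except the browser launch, and the same error occurs. I found some examples of using Selenium and Xvfb, but when I implemented the same code to start and stop a virtual display, the error message didn't change. I tried to rule out concurrency errors by setting a concurrency limit on the DAG, but the error persists. I also checked my geckodriver and Firefox versions, and they look fine (and they work when not run under Airflow).
Here is the DAG: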
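    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator
    from splinter import Browser


    def browser_test(queries=None, **context):
        # Stripped down to the browser launch; `queries` is unused here.
        b = Browser("firefox", headless=True)
        b.visit('http://www.google.com')
        b.quit()


    with DAG(
        dag_id="web_test",
        start_date=datetime(2019, 1, 1),
        schedule_interval=None,
        concurrency=1,
    ) as dag:
        PythonOperator(
            task_id="run_pull",
            python_callable=browser_test,
            provide_context=True,
        )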
The error output is very long, so I've trimmed it to just the command I issued and the original exception (the full output is here):
    joe@Ubuntu-VM1:~/airflow$ sudo airflow test web_test run_pull "2019-01-01"
    Original exception was:
    Traceback (most recent call last):
      File "/usr/local/bin/airflow", line 32, in <module>
        args.func(args)
      File "/usr/local/lib/python3.6/dist-packages/airflow/utils/cli.py", line 74, in wrapper
        return f(*args, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/airflow/bin/cli.py", line 660, in test
        ti.run(ignore_task_deps=True, ignore_ti_state=True, test_mode=True)
      File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 73, in wrapper
        return func(*args, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/airflow/models/__init__.py", line 1542, in run
        session=session)
      File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 69, in wrapper
        return func(*args, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/airflow/models/__init__.py", line 1441, in _run_raw_task
        result = task_copy.execute(context=context)
      File "/usr/local/lib/python3.6/dist-packages/airflow/operators/python_operator.py", line 112, in execute
        return_value = self.execute_callable()
      File "/usr/local/lib/python3.6/dist-packages/airflow/operators/python_operator.py", line 117, in execute_callable
        return self.python_callable(*self.op_args, **self.op_kwargs)
      File "/home/joe/ftp/files/Documents/Projects/Airflow/dags/dev_airflow.py", line 233, in browser_test
        b = Browser("firefox", headless = True)
      File "/usr/local/lib/python3.6/dist-packages/splinter/browser.py", line 63, in Browser
        return driver(*args, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/splinter/driver/webdriver/firefox.py", line 65, in __init__
        timeout=timeout, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/firefox/webdriver.py", line 170, in __init__
        keep_alive=True)
      File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/remote/webdriver.py", line 156, in __init__
        self.start_session(capabilities, browser_profile)
      File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/remote/webdriver.py", line 245, in start_session
        response = self.execute(Command.NEW_SESSION, parameters)
      File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/remote/webdriver.py", line 314, in execute
        self.error_handler.check_response(response)
      File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
        raise exception_class(message, screen, stacktrace)
    selenium.common.exceptions.WebDriverException: Message: invalid argument: can't kill an exited process
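When I run plain python3, I can import the DAG file and call the browser_test function without any errors.
I expect the DAG to run without raising any errors, and if I print b.title, I expect it to be "Google". Honestly, shouldn't anything run in an Airflow DAG behave the same way it does when run in a Python shell?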
Answer 0 (score: 0)
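Could you run the script with a BashOperator instead?
Here is an example using BashOperator:
    from airflow.operators.bash_operator import BashOperator

    # Assumes a module named web_test that exposes run_pull() is importable by the worker.
    t1 = BashOperator(
        task_id='web_test',
        bash_command='python3 -c "import web_test; web_test.run_pull()"',
        dag=dag)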
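What the BashOperator suggestion changes is that the browser is launched from a fresh interpreter rather than inside the Airflow task process. A minimal sketch of the same idea from within a PythonOperator callable is shown below, using only the standard library. The names `run_isolated` and `browser_test_isolated` are hypothetical, and `web_test.run_pull` is the same placeholder used in the example above; whether isolating the process avoids the error in the question is an assumption, not something confirmed here.

```python
import os
import subprocess
import sys


def run_isolated(code, extra_env=None, timeout=300):
    """Run a Python snippet in a fresh interpreter.

    Returns a (returncode, stdout, stderr) tuple. `extra_env` entries are
    layered on top of a copy of the current environment.
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    proc = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # text mode; works on Python 3.6
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


def browser_test_isolated(**context):
    # Hypothetical PythonOperator callable: `web_test.run_pull` is a
    # placeholder for wherever the Splinter code actually lives.
    returncode, out, err = run_isolated("import web_test; web_test.run_pull()")
    if returncode != 0:
        # Raising makes Airflow mark the task as failed and logs stderr.
        raise RuntimeError("browser test failed:\n" + err)
    return out
```

Because the child's stderr is captured, any message the browser or driver prints before exiting ends up in the task log instead of being lost.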
Can Splinter be run directly inside Airflow at all? This doesn't answer why you are hitting the problem, but it might meet your requirements.
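To get closer to the "why", one option is to compare the process environment in the Python shell with the one the Airflow task sees. The sketch below is a diagnostic aid only, not a fix. `runtime_snapshot` is a hypothetical helper name, and the fields it collects (user, HOME, DISPLAY, PATH, and the resolved firefox and geckodriver paths) are assumptions about what is likely to differ. The command in the question was run with sudo, so if the Python shell was started without it, the effective user is one plausible difference.

```python
import getpass
import os
import shutil
import sys


def _current_user():
    # getpass.getuser() can fail in minimal environments; fall back to None.
    try:
        return getpass.getuser()
    except Exception:
        return None


def runtime_snapshot():
    """Collect process details that may differ between a shell and a task runner."""
    return {
        "user": _current_user(),
        "euid": os.geteuid() if hasattr(os, "geteuid") else None,
        "home": os.environ.get("HOME"),
        "display": os.environ.get("DISPLAY"),
        "path": os.environ.get("PATH"),
        "python": sys.executable,
        "firefox": shutil.which("firefox"),
        "geckodriver": shutil.which("geckodriver"),
    }


if __name__ == "__main__":
    for key, value in sorted(runtime_snapshot().items()):
        print("{}={}".format(key, value))
```

Calling `runtime_snapshot()` once from the Python shell and once at the top of `browser_test` (printing the result so it lands in the task log) makes any mismatch between the two environments visible.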