I need some help with a small project I am using to learn Python web scraping.
Traceback (most recent call last):
  File "ridi_find.py", line 5, in <module>
    driver = webdriver.Chrome(chromedriver)
  File "/home/ubuntu/play_python/venv/lib/python3.5/site-packages/selenium/webdriver/chrome/webdriver.py", line 69, in __init__
    desired_capabilities=desired_capabilities)
  File "/home/ubuntu/play_python/venv/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 92, in __init__
    self.start_session(desired_capabilities, browser_profile)
  File "/home/ubuntu/play_python/venv/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 179, in start_session
    response = self.execute(Command.NEW_SESSION, capabilities)
  File "/home/ubuntu/play_python/venv/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
    self.error_handler.check_response(response)
  File "/home/ubuntu/play_python/venv/lib/python3.5/site-packages/selenium/webdriver/remote/errorhandler.py", line 192, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally
  (Driver info: chromedriver=2.9.248304,platform=Linux 4.4.0-53-generic x86_64)
I installed chromedriver (linux64) and am using Python 3.5.2 on an AWS EC2 server (Ubuntu).
Here is the source code. It fails with the traceback above.
from selenium import webdriver
import pandas as pd
chromedriver = '/home/ubuntu/play_python/venv/bin/chromedriver'
driver = webdriver.Chrome(chromedriver)
driver.get('http://ridibooks.com/')
find_some = input("what do you want to know?")
find_some = find_some + '\n'
search = driver.find_element_by_css_selector("input[id='book_search_input']")
search.send_keys(find_some)
searches = driver.find_element_by_id('books_contents')
book_lists = []
for l in searches.find_elements_by_css_selector("span.title_text"):
    book_lists.append(l.text)
easy_index = pd.Series(0, index = range(1, len(book_lists) +1))
book_series = pd.Series(book_lists, index = easy_index.index)
print(book_series)
driver.quit()
Can anyone help me solve this problem?
python 3.5.2
Ubuntu 16.04.1 LTS
Chromedriver_linux64
Answer 0 (score: 2)
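I am working on a similar problem, and it looks like you need a "fake" X environment such as Xvfb, because Chrome cannot start on a server that has no display: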
sudo yum install xorg-x11-server-Xvfb unzip
wget -O /tmp/chromedriver.zip http://chromedriver.storage.googleapis.com/2.10/chromedriver_linux64.zip && sudo unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/;
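To check whether the server really lacks an X display, and whether Xvfb is already installed, something like the following can be run. This is only a minimal sketch: the function name display_status is made up for illustration, and it assumes the Xvfb binary is called "Xvfb" and would be found on PATH.

```python
import os
import shutil


def display_status(env=None):
    """Report whether an X display is configured and whether Xvfb is on PATH.

    display_status is a hypothetical helper, not part of selenium or
    pyvirtualdisplay.
    """
    env = os.environ if env is None else env
    return {
        # Chrome needs DISPLAY to point at an X server unless it runs headless.
        "display_set": bool(env.get("DISPLAY")),
        # shutil.which returns None when no "Xvfb" executable is found on PATH.
        "xvfb_installed": shutil.which("Xvfb") is not None,
    }


if __name__ == "__main__":
    print(display_status())
```

If display_set is False on the EC2 instance, that is consistent with the "Chrome failed to start" error, although other causes (such as missing shared libraries) are still possible.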
Second, it looks like AWS does not provide certain libraries: https://forums.aws.amazon.com/message.jspa?messageID=713847
Add the following to /etc/yum.repos.d/centos.repo:
[CentOS-base]
name=CentOS-6 - Base
mirrorlist=http://mirrorlist.centos.org/?release=6&arch=x86_64&repo=os
gpgcheck=1
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6
#released updates
[CentOS-updates]
name=CentOS-6 - Updates
mirrorlist=http://mirrorlist.centos.org/?release=6&arch=x86_64&repo=updates
gpgcheck=1
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6
#additional packages that may be useful
[CentOS-extras]
name=CentOS-6 - Extras
mirrorlist=http://mirrorlist.centos.org/?release=6&arch=x86_64&repo=extras
gpgcheck=1
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6
Next, run:
sudo rpm --import http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6
And finally:
sudo yum install GConf2
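Edit:
My mistake, the steps above are for yum-based systems, but the process is similar. Since you are on Ubuntu, just change where you get the packages from: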
sudo apt-get install python-pip
sudo apt-get install xvfb xserver-xephyr vnc4server
sudo pip install pyvirtualdisplay
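My code looks something like this:
from pyvirtualdisplay import Display
from selenium import webdriver

# Start a virtual X display (backed by Xvfb) so Chrome has a screen to use
display = Display(visible=0, size=(1300, 1080))
display.start()
driver = webdriver.Chrome()
driver.set_window_size(1300, 1080)
driver.get("https://www.google.com")
...
# close() closes the current window; quit() ends the whole browser session
driver.close()
driver.quit()
# Stop the virtual display once the browser is gone
display.stop()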
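One weakness of the snippet above is that an exception anywhere before the last lines leaves Chrome and Xvfb running. A possible way around that, sketched here under the assumption that pyvirtualdisplay and selenium are installed and that chromedriver is on PATH, is to wrap start-up and cleanup in a context manager. The name virtual_chrome is hypothetical and not part of either library.

```python
from contextlib import contextmanager


@contextmanager
def virtual_chrome(width=1300, height=1080):
    """Yield a Chrome driver running inside a virtual display (illustrative helper)."""
    # Imported lazily so that defining this helper does not require the packages.
    from pyvirtualdisplay import Display
    from selenium import webdriver

    display = Display(visible=0, size=(width, height))
    display.start()
    driver = None
    try:
        driver = webdriver.Chrome()
        driver.set_window_size(width, height)
        yield driver
    finally:
        # Runs even if the scraping code raised, so nothing is left behind.
        if driver is not None:
            driver.quit()
        display.stop()
```

Usage would then look like `with virtual_chrome() as driver: driver.get("https://www.google.com")`, and the browser and display are shut down when the block exits.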