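Selenium code that calls driver.page_source used to work in an environment that has since crashed; only backup files of the program survive. The rebuilt environment uses selenium 3.13.0 with geckodriver linux64 v0.21.0. I do not have the exact Selenium version number from the old environment, where the error did not occur. The code fails when it tries to execute driver.page_source.
No proxy is involved in reaching the sec.gov site in this attempt. I don't know whether the problem is in the code or in the Selenium version. If you don't see a problem with the code below, could you suggest an earlier version of selenium or geckodriver that is known to work without this error? Thanks in advance for your help.
The code is here: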
import time
import sys
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup


def lookup_Type(driver, this_url):
    driver.get(this_url)
    box = driver.find_element_by_id("type")
    box.send_keys('10-K')
    box.send_keys(Keys.ENTER)
    return


def init_driver():
    driver = webdriver.Firefox()
    driver.wait = WebDriverWait(driver, 5)
    return driver


these_many_stocks = (sys.argv[1]).split(',')
for this_symbol in these_many_stocks:
    browser = webdriver.Firefox()
    driver = init_driver()
    search_url = "https://www.sec.gov/cgi-bin/browse-edgar?CIK=" + this_symbol + "&owner=exclude&action=getcompany&Find=Search"
    lookup_Type(driver, search_url)
    time.sleep(10)
    this_page = driver.page_source
    print this_page
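The script builds search_url by concatenating the symbol straight into the query string. As a side note, here is a minimal sketch of the same URL assembled with urlencode, which also escapes any unusual characters in the symbol. The helper name build_search_url is hypothetical and not part of the original script, and this is not expected to affect the page_source failure.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2, as used by the script above

EDGAR_BROWSE_URL = "https://www.sec.gov/cgi-bin/browse-edgar"


def build_search_url(symbol):
    """Return the EDGAR company-search URL for one ticker symbol or CIK."""
    # A list of pairs keeps the parameters in the same order as the original URL.
    params = [
        ("CIK", symbol),
        ("owner", "exclude"),
        ("action", "getcompany"),
        ("Find", "Search"),
    ]
    return EDGAR_BROWSE_URL + "?" + urlencode(params)
```

In the loop, search_url = build_search_url(this_symbol) would replace the concatenation.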
Packages installed in the virtual environment:
Package        Version
-------------- -------
beautifulsoup4 4.6.0
bs4            0.0.1
lxml           4.2.2
pip            10.0.1
pkg-resources  0.0.0
prettytable    0.7.2
selenium       3.13.0
setuptools     39.2.0
wheel          0.31.1
GECKODRIVER installed: geckodriver-v0.21.0-linux64
I placed this webdriver at etlibs/selenium/webdriver/firefox/amd64/geckodriver; previously it was at etlibs/bin/geckodriver. In both configurations the webdriver opens the page but then fails on: driver.page_source
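Since geckodriver has lived in two places, it may help to confirm which binary is actually found on PATH and what version it reports. The following is only a diagnostic sketch: the helper name driver_version is hypothetical, and it assumes Python 3.3+ for shutil.which (on the Python 2.7 used here, distutils.spawn.find_executable plays a similar role). It relies on the --version flag, which both geckodriver and firefox accept.

```python
import shutil
import subprocess


def driver_version(binary="geckodriver"):
    """Return (path, first line of `<binary> --version`), or None if unavailable."""
    path = shutil.which(binary)  # resolves the binary the same way the shell would
    if path is None:
        return None
    try:
        output = subprocess.check_output([path, "--version"])
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = output.decode("utf-8", "replace").splitlines()
    return path, (lines[0] if lines else "")
```

Calling driver_version() and driver_version("firefox") inside the activated virtual environment shows the resolved paths and versions, which can then be compared against the geckodriver release notes for compatible Firefox versions.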
RUN AND ERROR MESSAGES:
(etlibs) james@james-Noir-et:~/Documents/et-alt$ python get_xbrl_files.py CSCO
Traceback (most recent call last):
  File "get_xbrl_files.py", line 34, in <module>
    this_page = driver.page_source
  File "/home/james/Documents/et-proj/etlibs/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 678, in page_source
    return self.execute(Command.GET_PAGE_SOURCE)['value']
  File "/home/james/Documents/et-proj/etlibs/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 318, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/home/james/Documents/et-proj/etlibs/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 472, in execute
    return self._request(command_info[0], url, body=data)
  File "/home/james/Documents/et-proj/etlibs/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 496, in _request
    resp = self._conn.getresponse()
  File "/usr/lib/python2.7/httplib.py", line 1136, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 453, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 417, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine: ''
(etlibs) james@james-Noir-et:~/Documents/et-alt$
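For context on the last line of the traceback: httplib raises BadStatusLine with an empty string when the other end closes the connection without sending any HTTP status line. Here the other end is presumably the local geckodriver process that Selenium talks to, though the traceback alone does not prove why it hung up. The sketch below reproduces the same exception with a throwaway local socket and no Selenium involved; the function names are hypothetical.

```python
import socket
import threading

try:
    import http.client as httplib  # Python 3
except ImportError:
    import httplib  # Python 2, as in the traceback above


def serve_once_and_hang_up(listener):
    """Accept one connection, read the request, then close without replying."""
    conn, _ = listener.accept()
    conn.recv(65536)  # consume the request so the close is a clean hang-up
    conn.close()


def reproduce_bad_status_line():
    """Return the exception raised when the peer sends no status line."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    worker = threading.Thread(target=serve_once_and_hang_up, args=(listener,))
    worker.daemon = True
    worker.start()

    client = httplib.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", "/")
        client.getresponse()  # the same call that fails in remote_connection.py
    except httplib.BadStatusLine as exc:
        return exc
    finally:
        client.close()
        worker.join(5)
        listener.close()
    return None
```

On Python 3 the exception is RemoteDisconnected, a subclass of BadStatusLine, so the except clause catches it on both versions.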