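Python: scrape a web table and save the data to a CSV file
The following code works fine, but how can I achieve the same result without opening a browser window, i.e. with the process running in the background? The code is as follows: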
import contextlib
import csv

import selenium.webdriver as webdriver


@contextlib.contextmanager
def quitting(browser):
    # Hand the browser to the caller, then shut it down even if the block raises.
    try:
        yield browser
    finally:
        # quit() closes every window and ends the driver process.
        browser.quit()


with quitting(webdriver.Chrome()) as driver:
    url = "https://fantasy.premierleague.com/a/statistics/total_points"
    driver.get(url)
    row_id = 1
    data = []
    for tr in driver.find_elements_by_xpath('//table[@class="ism-table ism-table--el"]//tr'):
        tds = tr.find_elements_by_tag_name('td')
        if tds:
            # Prefix each table row with a running ID.
            data.append([row_id] + [td.text for td in tds])
            row_id += 1

# newline='' stops the csv module from writing blank lines between rows on Windows.
with open('./result.csv', 'w', newline='') as outfile:
    wr = csv.writer(outfile, dialect='excel')
    wr.writerows(data)
print(data)
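Answer 0: (score: 0)
You can use PhantomJS as the driver instead of Chrome to run a headless browser. You can download it here: http://phantomjs.org/download.html. Add the exe to your Python35-32 folder, then use it exactly as you would the Chrome driver, except that you write: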
with quitting(webdriver.PhantomJS()) as driver:
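The swap is a one-line change because the `quitting` helper only relies on the driver having a `quit()` method. A minimal sketch of that idea follows; `FakeDriver` and `scrape` are hypothetical names introduced here for illustration, and the stand-in class only records calls so the example runs without Selenium or a browser installed.

```python
import contextlib


@contextlib.contextmanager
def quitting(browser):
    # Same idea as the helper in the question: always shut the browser down.
    try:
        yield browser
    finally:
        browser.quit()


class FakeDriver:
    """Hypothetical stand-in for a Selenium driver; it only records calls."""

    def __init__(self):
        self.calls = []

    def get(self, url):
        self.calls.append(("get", url))

    def quit(self):
        self.calls.append(("quit",))


def scrape(driver_factory, url):
    # driver_factory is any zero-argument callable that returns a driver,
    # for example webdriver.Chrome or webdriver.PhantomJS (assumed usage).
    with quitting(driver_factory()) as driver:
        driver.get(url)
        return driver


driver = scrape(FakeDriver, "https://example.com/")
print(driver.calls)  # [('get', 'https://example.com/'), ('quit',)]
```

With a real driver, the call would presumably look like `scrape(webdriver.PhantomJS, url)`, with the table-reading loop placed inside the `with` block.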
Answer 1: (score: 0)
The quickest way is to replace the Chrome driver with PhantomJS:
with quitting(webdriver.PhantomJS()) as driver:
    # starts a headless PhantomJS browser that runs in the background
with quitting(webdriver.Chrome()) as driver:
    # starts a Chrome browser whose window is visible on the screen
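As an alternative that neither answer mentions, Chrome itself can be started without a visible window by passing it the `--headless` argument. The sketch below assumes Selenium 3.8 or newer and a chromedriver binary on the PATH; `make_headless_chrome` and `HEADLESS_ARGUMENTS` are hypothetical names, and the window size is an arbitrary choice.

```python
# Assumption: Selenium 3.8+ and a chromedriver binary available on PATH.
# The window size is arbitrary; some pages lay out differently without one.
HEADLESS_ARGUMENTS = ["--headless", "--window-size=1280,1024"]


def make_headless_chrome():
    # Imported lazily so this module still loads where Selenium is missing.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in HEADLESS_ARGUMENTS:
        options.add_argument(argument)
    # No browser window is shown; the page is still rendered by Chrome.
    return webdriver.Chrome(options=options)
```

The rest of the script should be able to stay unchanged apart from the first line of the block, which would become `with quitting(make_headless_chrome()) as driver:`.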