Python: Scrape a JavaScript table and save the results to a CSV file without opening a browser

Date: 2017-06-16 13:18:59

Tags: python csv web-scraping

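Python: scrape a web table and save the data to a CSV file. The following code works well, but how can I achieve the same result without opening a browser, i.e. with the process running in the background? The code is as follows: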

import contextlib
import csv

import selenium.webdriver as webdriver
from selenium.webdriver.common.by import By

@contextlib.contextmanager
def quitting(browser):
    # Shut the browser down even if scraping raises an exception.
    try:
        yield browser
    finally:
        browser.quit()

with quitting(webdriver.Chrome()) as driver:
    url = "https://fantasy.premierleague.com/a/statistics/total_points"
    driver.get(url)
    data = []
    rows = driver.find_elements(By.XPATH, '//table[@class="ism-table ism-table--el"]//tr')
    for tr in rows:
        tds = tr.find_elements(By.TAG_NAME, 'td')
        if tds:
            # Prefix each data row with a 1-based running id.
            data.append([len(data) + 1] + [td.text for td in tds])

# Write the file once, after all rows have been collected.
with open('./result.csv', 'w', newline='') as outfile:
    wr = csv.writer(outfile, dialect='excel')
    wr.writerows(data)
print(data)
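
The numbering and CSV-writing part of the code above does not depend on the browser, so it can be sketched and checked on its own. This is only a minimal illustration: the helper names `number_rows` and `rows_to_csv_text` are hypothetical, and the sample rows are made up rather than scraped.

```python
import csv
import io


def number_rows(rows):
    """Prefix each row with a 1-based running id (hypothetical helper)."""
    return [[i] + list(row) for i, row in enumerate(rows, start=1)]


def rows_to_csv_text(rows):
    """Serialise rows using the same 'excel' dialect as the question."""
    buffer = io.StringIO()
    csv.writer(buffer, dialect="excel").writerows(rows)
    return buffer.getvalue()


# Made-up cells standing in for the scraped <td> texts.
sample = [["Player A", "100"], ["Player B", "95"]]
numbered = number_rows(sample)
csv_text = rows_to_csv_text(numbered)
print(csv_text)
```

Writing to an in-memory buffer first makes it easy to inspect the output before saving it to `result.csv`.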

2 answers:

Answer 0 (score: 0)

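You can use PhantomJS as the driver instead of Chrome to get a headless browser. Download it from http://phantomjs.org/download.html, add the exe to your Python35-32 folder, and then use it exactly as you would the Chrome driver, except that you write: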

with quitting(webdriver.PhantomJS()) as driver:

Answer 1 (score: 0)

The quickest way is to replace the Chrome driver with PhantomJS:

with quitting(webdriver.PhantomJS()) as driver:
    # starts a headless PhantomJS browser that runs in the background
    pass

with quitting(webdriver.Chrome()) as driver:
    # starts a Chrome browser whose window is visible on screen
    pass
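
PhantomJS development has since been suspended and newer Selenium releases no longer include a PhantomJS driver, so a headless Chrome may be an alternative worth trying. The sketch below is not part of either answer. It assumes a Selenium release whose `webdriver.Chrome` accepts the `options` keyword and a chromedriver binary on PATH. The helper names `headless_chrome_arguments` and `make_headless_chrome` and the window size are hypothetical.

```python
def headless_chrome_arguments(width=1920, height=1080):
    """Command-line flags for a windowless Chrome (hypothetical helper)."""
    return ["--headless", "--window-size={},{}".format(width, height)]


def make_headless_chrome():
    """Start Chrome without a visible window.

    Assumes a Selenium release that accepts the ``options`` keyword
    and a chromedriver binary on PATH.
    """
    # Imported lazily so the helper above can be used without Selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in headless_chrome_arguments():
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

With this in place, the question's code would only need `with quitting(make_headless_chrome()) as driver:` in place of `with quitting(webdriver.Chrome()) as driver:`.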