Scraping an ASP.NET page with AJAX pagination in Python

Time: 2015-09-15 23:34:55

Tags: python web-scraping

I am trying to scrape this page here.

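I need to retrieve information such as favorites from the page, which is split across paginated results. To move between pages, I need to trigger the following JavaScript call: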

javascript:__doPostBack('ctl00$MainContent$lstProfileView$dataPagerNumeric2$ctl02$ctl00')
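On ASP.NET WebForms pages, `__doPostBack(eventTarget, eventArgument)` copies its arguments into the hidden form fields `__EVENTTARGET` and `__EVENTARGUMENT` and then submits the form. The sketch below shows how such a link could be turned into POST data without running JavaScript. The helper names `parse_postback` and `build_postback_payload` are made up for illustration, and the hidden-field dict is assumed to have been read from the page beforehand.

```python
import re

# Matches __doPostBack('target') as well as __doPostBack('target','argument').
POSTBACK_RE = re.compile(r"__doPostBack\('([^']*)'(?:\s*,\s*'([^']*)')?\)")


def parse_postback(href):
    """Return (event_target, event_argument) extracted from a __doPostBack link."""
    match = POSTBACK_RE.search(href)
    if match is None:
        raise ValueError("not a __doPostBack link: %r" % href)
    # A missing second argument is treated as an empty event argument.
    return match.group(1), match.group(2) or ""


def build_postback_payload(hidden_fields, href):
    """Merge the page's hidden fields with the postback target (hypothetical helper).

    hidden_fields is assumed to be a dict of the hidden <input> values
    already read from the page, for example __VIEWSTATE.
    """
    target, argument = parse_postback(href)
    payload = dict(hidden_fields)  # copy so the caller's dict is not modified
    payload["__EVENTTARGET"] = target
    payload["__EVENTARGUMENT"] = argument
    return payload
```

The resulting dict would then be sent as the body of a POST request to the same URL, using whichever HTTP client is at hand.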

I have tested my code, but it returns nothing:

import re
import urlparse
import mechanize

from bs4 import BeautifulSoup


class ArchitectFinderScraper(object):
    def __init__(self):
        self.url = "http://www.guiadosquadrinhos.com/todas-capas-disponiveis"
        self.br = mechanize.Browser()
        self.br.addheaders = [
            ('User-agent', 'Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101 Firefox/38.0 Iceweasel/38.2.1'),
            # Note: 'Accepting-encoding' is not a standard header name; the standard one is 'Accept-Encoding'.
            ('Accepting-encoding', 'gzip, deflate')]

    def scrape_state_firms(self, state_item):
        self.br.open(self.url)

        s = BeautifulSoup(self.br.response().read())
        saved_form = s.find('form', id='form1').prettify()

        self.br.select_form(nr=0)
        self.br.form['ctl00$stateList'] = [state_item.name]

        print '\n'.join(['%s:%s (%s)' % (c.name, c.value, c.disabled) for c in self.br.form.controls])

I am following this example here.
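For reference, a WebForms postback generally has to send back the page's hidden state fields (such as `__VIEWSTATE`) along with the event target. Below is a minimal sketch of collecting those fields, written against Python 3's standard library rather than mechanize. The class name `HiddenFieldCollector`, the function `collect_hidden_fields`, and the sample HTML are invented for illustration and are not taken from the real page.

```python
from html.parser import HTMLParser


class HiddenFieldCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />.
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.fields[attributes["name"]] = attributes.get("value") or ""


def collect_hidden_fields(html):
    """Return a dict of all hidden input fields found in the HTML string."""
    collector = HiddenFieldCollector()
    collector.feed(html)
    return collector.fields


# Made-up markup standing in for a real WebForms page.
SAMPLE_HTML = """
<form id="form1" method="post">
  <input type="hidden" name="__VIEWSTATE" value="dDwtMTA4" />
  <input type="hidden" name="__EVENTVALIDATION" value="wEWAgK" />
  <input type="text" name="search" value="x" />
</form>
"""

hidden = collect_hidden_fields(SAMPLE_HTML)
```

Only the two hidden inputs end up in `hidden`; the visible text field is skipped.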

0 answers:

No answers yet