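I am trying to scrape this page here.
I need to use pagination to retrieve information such as the favorites from the page. To do that, I need to trigger the following JavaScript action: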
javascript:__doPostBack('ctl00$MainContent$lstProfileView$dataPagerNumeric2$ctl02$ctl00')
I have tested my code, but it returns nothing:
import re
import urlparse

import mechanize
from bs4 import BeautifulSoup


class ArchitectFinderScraper(object):
    def __init__(self):
        self.url = "http://www.guiadosquadrinhos.com/todas-capas-disponiveis"
        self.br = mechanize.Browser()
        # Send browser-like headers with every request.
        self.br.addheaders = [
            ('User-agent', 'Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101 Firefox/38.0 Iceweasel/38.2.1'),
            ('Accept-Encoding', 'gzip, deflate')]

    def scrape_state_firms(self, state_item):
        self.br.open(self.url)
        s = BeautifulSoup(self.br.response().read(), 'html.parser')
        # Keep a copy of the form markup for later inspection.
        saved_form = s.find('form', id='form1').prettify()
        self.br.select_form(nr=0)
        self.br.form['ctl00$stateList'] = [state_item.name]
        # Dump every form control to see what the page exposes.
        print '\n'.join(['%s:%s (%s)' % (c.name, c.value, c.disabled) for c in self.br.form.controls])
I am following this example here.
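
For context, as far as I understand it, `__doPostBack` in ASP.NET WebForms fills the hidden `__EVENTTARGET` and `__EVENTARGUMENT` fields and then submits the form, so the same effect should be achievable by POSTing those two fields together with the page's other hidden inputs (such as `__VIEWSTATE`). Below is a minimal sketch of building that payload, written for Python 3 with only the standard library. The helper names (`parse_postback`, `HiddenInputCollector`, `build_postback_payload`) and the sample HTML are hypothetical and not taken from the real page:

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlencode

# Matches __doPostBack('target') as well as __doPostBack('target','argument').
POSTBACK_RE = re.compile(r"__doPostBack\('([^']*)'(?:\s*,\s*'([^']*)')?\)")


def parse_postback(href):
    """Return (event_target, event_argument) extracted from a __doPostBack link."""
    match = POSTBACK_RE.search(href)
    if match is None:
        raise ValueError("not a __doPostBack link: %r" % href)
    return match.group(1), match.group(2) or ""


class HiddenInputCollector(HTMLParser):
    """Collects the name/value pairs of every <input type="hidden"> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_postback_payload(page_html, href):
    """Combine the page's hidden fields with the postback target and argument."""
    collector = HiddenInputCollector()
    collector.feed(page_html)
    payload = dict(collector.fields)
    target, argument = parse_postback(href)
    payload["__EVENTTARGET"] = target
    payload["__EVENTARGUMENT"] = argument
    return payload


# Hypothetical, trimmed-down page used only for illustration.
SAMPLE_HTML = """
<form id="form1" method="post">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="q" value="ignored" />
</form>
"""
HREF = "javascript:__doPostBack('ctl00$MainContent$lstProfileView$dataPagerNumeric2$ctl02$ctl00')"

payload = build_postback_payload(SAMPLE_HTML, HREF)
body = urlencode(payload)  # form-encoded body for the POST request
```

With mechanize, a roughly equivalent approach is probably to call `self.br.form.set_all_readonly(False)` after `select_form`, assign `__EVENTTARGET` and `__EVENTARGUMENT` on the form, and then call `self.br.submit()`. This assumes the page's form actually contains those two hidden controls, which I have not verified.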