我正在尝试登录this page。
br = mechanize.Browser(factory=mechanize.RobustFactory())
br.set_cookiejar(cj)
current_page = br.open(LOGIN_URL)
soup = BeautifulSoup(current_page.get_data())
current_page.set_data(soup.prettify())
br.set_response(current_page)
print soup.findAll('form')
assert br.viewing_html()
for f in br.forms():
print f.name
但即使BeautifulSoup完美地找到了表单,它也会为表单打印None。有人有什么想法吗?
答案 0 :(得分:1)
这样的事情会起作用:
from bs4 import BeautifulSoup
import mechanize
import cookielib
br = mechanize.Browser()
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)
host = 'https://order.papajohns.com/secure/signin/frame.html?destination=http%3a%2f%2forder.papajohns.com%2findex.html%3fsite%3dWEB%26dclid%3d%2525n-2543611-4121096-71899047-246709315-0%26esvt%3d336192-GOUSe339376223%26esvq%3dpapa%2520johns%26esvadt%3d999999-0-3934985-1%26esvcrea%3d41751468573%26esvplace%26esvd%3dc%26esvaid%3d30536%26gclid%3dCI2psOHqtbwCFRPxOgodr0gAAg'
br.addheaders = [('User-agent', 'Firefox')]
br.open(host)
br.form = list(br.forms())[0]
br.form['userName'] = username
br.form['pwd'] = password
submit = br.submit()
code = response.read()
soup = BeautifulSoup(code)