I am trying to log in automatically to a website whose login form has the following HTML (excerpt):
<tr>
<td width="60%">
<input type="text" name="username" class="required black_text" maxlength="50" value="" />
</td>
<td>
<input type="password" name="password" id="password" class="required black_text" maxlength="50" value="" />
</td>
<td colspan="2" align="center">
<input type="image" src="gifs/login.jpg" name="Login2" value="Login" alt="Login" title="Login"/>
</td>
</tr>
I am using Python's mechanize module for web browsing. Here is the code:
br.select_form(predicate=self.__form_with_fields("username", "password"))
br['username'] = self.config['COMMON.USER']
br['password'] = self.config['COMMON.PASSWORD']
try:
    request = br.click(name='Login2', type='image')
    response = mechanize.urlopen(request)
    print response.read()
except IOError, err:
    logger = logging.getLogger(__name__)
    logger.error(str(err))
    logger.debug(response.info())
    print str(err)
    sys.exit(1)

def __form_with_fields(self, *fields):
    """ Generator of form predicate functions. """
    def __pred(form):
        for field_name in fields:
            try:
                form.find_control(field_name)
            except ControlNotFoundError, err:
                logger = logging.getLogger(__name__)
                logger.error(str(err))
                return False
        return True
    return __pred
I'm not sure what I'm doing wrong...
Thanks
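Answer 0 (score: 1)
The site may be using JavaScript to do a postback during login. If I remember correctly, for ASP.NET sites you need to grab the hidden form fields such as __VIEWSTATE and __EVENTTARGET and post them back along with the credentials. Why not post a link to the site in your question? It becomes relatively easy to figure out after that.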
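To illustrate the hidden-field idea, here is a minimal sketch using only the Python 3 standard library. It assumes the login page contains `__VIEWSTATE` and `__EVENTTARGET` inputs; the sample page, the `login.aspx` action, the field values, and the helper names `HiddenFieldCollector` and `build_login_payload` are all hypothetical and are not taken from the site in the question. It also sends `Login2.x` and `Login2.y`, because a browser reports the click coordinates of an `<input type="image">` button that way.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldCollector(HTMLParser):
    """Collect the name/value pair of every <input type="hidden"> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input ... /> also end up here.
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_login_payload(page_html, username, password, image_name="Login2"):
    """Merge the page's hidden fields with the credentials and button click."""
    collector = HiddenFieldCollector()
    collector.feed(page_html)
    payload = dict(collector.fields)
    payload["username"] = username
    payload["password"] = password
    # An image button is submitted as click coordinates: name.x and name.y.
    payload[image_name + ".x"] = "1"
    payload[image_name + ".y"] = "1"
    return payload


# Hypothetical login page, for illustration only.
SAMPLE_PAGE = """
<form method="post" action="login.aspx">
<input type="hidden" name="__VIEWSTATE" value="dDwtMTIz" />
<input type="hidden" name="__EVENTTARGET" value="" />
<input type="text" name="username" value="" />
<input type="password" name="password" id="password" value="" />
<input type="image" src="gifs/login.jpg" name="Login2" value="Login" />
</form>
"""

payload = build_login_payload(SAMPLE_PAGE, "alice", "s3cret")
body = urlencode(payload)  # form-encoded body to send in the POST request
print(body)
```

The resulting `body` would be sent as the POST data to the form's action URL, in the same session (cookies included) that fetched the login page. Whether the real site needs these fields can only be confirmed by inspecting its actual login page.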
Answer 1 (score: 0)
# The PhantomJS driver lives in selenium.webdriver (Selenium 3.x and earlier)
from selenium.webdriver import PhantomJS
import platform

if platform.system() == 'Windows':  # .exe for Windows
    PhantomJS_path = './phantomjs.exe'
else:
    PhantomJS_path = './phantomjs'

service_args = [  # Proxy (optional)
    '--proxy=<>',
    '--proxy-type=http',
    '--ignore-ssl-errors=true',
    '--web-security=false'
]

browser = PhantomJS(PhantomJS_path, service_args=service_args)
browser.set_window_size(1280, 720)  # Window size for screenshot (optional)

login_url = "<url_here>"
scrape_url = "<url_here>"  # Page to fetch after logging in

# Credentials
Username = "<insert>"
Password = "<insert>"

# Login
browser.get(login_url)
browser.save_screenshot('login.png')
print(browser.current_url)
browser.find_element_by_id("<username field id>").send_keys(Username)
browser.find_element_by_id("<password field id>").send_keys(Password)
browser.find_element_by_id("<login button id>").click()
print(browser.current_url)

# Fetch the target page with the logged-in session
browser.get(scrape_url)
print(browser.page_source)
browser.quit()