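Hey, I'm trying to log in to a website and fetch the page's HTML after logging in, and I can't figure out how to do this with Python. I'm using Python 2.7. I need to fill in the HTML form on this site with:
'user'='magaleast' and 'password'='1181' (real login details that are of no use to me). The site then redirects the user to an authentication page, and when that finishes it goes on to the page I need.
Any ideas?
Edit: I tried this code: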
from mechanize import Browser
import cookielib
br = Browser()
br.open("http://www.shiftorganizer.com/")
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)
# Browser options
br.set_handle_equiv(True)
br.set_handle_gzip(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
# You need to spot the name of the form in source code
br.select_form(name = "user")
# Spot the name of the inputs of the form that you want to fill,
# say "username" and "password"
br.form["user"] = "magaleast"
br.form["password"] = "1181"
response = br.submit()
print response.read()
But I get this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>ShiftOrganizer סידור עבודה בפחות משניה</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<script type="text/javascript">
var emptyCompany=1
function subIfNewApp()
{
if (emptyCompany){
document.authenticationForm.action = document.getElementById('userName').value + "/authentication.asp"
} else {
document.authenticationForm.action = document.getElementById('Company').value + "/authentication.asp"
}
document.authenticationForm.submit()
}
</script>
</head>
<body onload="subIfNewApp()">
<form name="authenticationForm" method="post" action="">
<input type="hidden" name="userName" id="userName" value="magaleast" />
<input type="hidden" name="password" id="password" value="1181" />
<input type="hidden" name="Company" id="Company" value="שם חברה" />
</form>
</body>
</html>
Is that the problem? Because it stops at the authentication step again..?
Answer 0 (score: 0)
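The site does seem to require some JS, so the mechanize code further down is not enough here. In this particular case, looking at the page source, it seems that this URL is what gets used in the end:
http://shifto.shiftorganizer.com/magaleast/welcome.asp?password=1181 which appears to contain the same kind of information as the post-login page (though I can't read Hebrew, so I may be completely wrong...). If so, you could simply do: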
import urllib
# Replace *username* and *password* with the actual login details
url = 'http://shifto.shiftorganizer.com/*username*/welcome.asp?password=*password*'
print urllib.urlopen(url).read()
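If the login details contain spaces or characters such as `&`, substituting them straight into the URL would produce a broken request. A minimal sketch of building the same URL with percent-encoding follows; the helper name `welcome_url` and the constant `BASE` are made up for illustration, and the URL pattern is only the one guessed above, not something the site documents.

```python
try:  # Python 3
    from urllib.parse import quote
except ImportError:  # Python 2.7
    from urllib import quote

# Host taken from the URL guessed above; treat it as an assumption.
BASE = "http://shifto.shiftorganizer.com"


def welcome_url(username, password):
    """Build the guessed post-login URL with both values percent-encoded."""
    # safe="" so that "/" inside a value is encoded too
    return "%s/%s/welcome.asp?password=%s" % (
        BASE,
        quote(username, safe=""),
        quote(password, safe=""),
    )


print(welcome_url("magaleast", "1181"))
```

The returned string can be passed to `urllib.urlopen` exactly as in the snippet above.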
For reference, here is code for logging in through a form that does not require Javascript.
I would use the mechanize library (Requests would work too) and do something like:
from mechanize import Browser
import cookielib
br = Browser()
br.set_cookiejar(cookielib.LWPCookieJar())
# Browser options
br.set_handle_equiv(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.open("your url")
# You need to spot the name of the form in source code
br.select_form(name="form_name")
# Spot the name of the inputs of the form that you want to fill,
# say "username" and "password"
br.form["username"] = "magaleast"
br.form["password"] = "1181"
response = br.submit()
print response.read()
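For the page in the question, the response to `br.submit()` is an intermediate form that a browser submits automatically through `subIfNewApp()`. That step can in principle be reproduced by hand: read the hidden inputs, pick `userName` or `Company` depending on `emptyCompany`, and POST to `<that value>/authentication.asp` relative to the page's URL. The sketch below only computes the target URL and form body. `HiddenInputs` and `next_request` are hypothetical names, and it assumes the intermediate page always looks like the HTML shown in the question and that the server accepts a plain POST of those fields.

```python
import re

try:  # Python 3
    from html.parser import HTMLParser
    from urllib.parse import urljoin, urlencode
except ImportError:  # Python 2.7
    from HTMLParser import HTMLParser
    from urlparse import urljoin
    from urllib import urlencode


class HiddenInputs(HTMLParser):
    """Collect (name, value) pairs of <input type="hidden"> elements."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.fields = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("type") == "hidden" and "name" in attrs:
            self.fields.append((attrs["name"], attrs.get("value") or ""))


def next_request(page_url, html):
    """Return (action_url, body) for the POST that subIfNewApp() would make."""
    parser = HiddenInputs()
    parser.feed(html)
    fields = dict(parser.fields)
    # Mirror the JS: a non-zero emptyCompany means the user name is the prefix
    match = re.search(r"emptyCompany\s*=\s*(\d+)", html)
    empty_company = bool(match and int(match.group(1)))
    prefix = fields["userName"] if empty_company else fields["Company"]
    # A browser resolves the relative form action against the page URL
    action = urljoin(page_url, prefix + "/authentication.asp")
    return action, urlencode(parser.fields)
```

With mechanize, the page URL should be available as `response.geturl()`, and `br.open(action, body)` would then send the POST through the same cookie jar. Whether the site needs further steps after that is not something the source confirms.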