是否可以让机械化遵循javascript类型的锚链接?
我正在尝试使用mechanize和beautifulsoup登录python中的网站。
这是锚链接
<a id="StaticModuleID15_ctl00_SkinLogin1_Login1_Login1_LoginButton" href="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("StaticModuleID15$ctl00$SkinLogin1$Login1$Login1$LoginButton", "", true, "Login1", "", false, true))"><img id="StaticModuleID15_ctl00_SkinLogin1_Login1_Login1_Image2" border="0" src="../../App_Themes/default/images/Member/btn_loginenter.gif" align="absmiddle" style="border-width:0px;" /></a>
这是我试过的
links = SoupStrainer('a', id="StaticModuleID15_ctl00_SkinLogin1_Login1_Login1_LoginButton")
[anchor for anchor in BeautifulSoup(data, parseOnlyThese=links)]
link = mechanize.Link( base_url = self.url,
url = str(anchor['href']),
text = str(anchor.string),
tag = str(anchor.name),
attrs = [(str(name), str(value))
for name, value in anchor.attrs])
response2 = br.follow_link(link)
现在我收到错误消息
urllib2.URLError:
感谢任何帮助或建议
修改
在帮助者的评论之后,我去看了一下asp页面的代码。
我找到了一些有用的脚本,但我不确定我在python中要做什么来使用python模拟JS代码。 在没有我看到任何饼干设置的地方,我看错了地方吗?
<form name="form1" method="post" action="BrowseSchedule.aspx?ItemId=75" onsubmit="javascript:return WebForm_OnSubmit();" id="form1">
//<![CDATA[
function WebForm_OnSubmit() {
if (typeof(ValidatorOnSubmit) == "function" && ValidatorOnSubmit() == false) return false;
return true;
}
//]]>
<script type="text/javascript">
//<![CDATA[
var theForm = document.forms['form1'];
if (!theForm) {
theForm = document.form1;
}
function __doPostBack(eventTarget, eventArgument) {
if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
theForm.__EVENTTARGET.value = eventTarget;
theForm.__EVENTARGUMENT.value = eventArgument;
theForm.submit();
}
}
//]]>
</script>
function WebForm_DoPostBackWithOptions(options) {
var validationResult = true;
if (options.validation) {
if (typeof(Page_ClientValidate) == 'function') {
validationResult = Page_ClientValidate(options.validationGroup);
}
}
if (validationResult) {
if ((typeof(options.actionUrl) != "undefined") && (options.actionUrl != null) && (options.actionUrl.length > 0)) {
theForm.action = options.actionUrl;
}
if (options.trackFocus) {
var lastFocus = theForm.elements["__LASTFOCUS"];
if ((typeof(lastFocus) != "undefined") && (lastFocus != null)) {
if (typeof(document.activeElement) == "undefined") {
lastFocus.value = options.eventTarget;
}
else {
var active = document.activeElement;
if ((typeof(active) != "undefined") && (active != null)) {
if ((typeof(active.id) != "undefined") && (active.id != null) && (active.id.length > 0)) {
lastFocus.value = active.id;
}
else if (typeof(active.name) != "undefined") {
lastFocus.value = active.name;
}
}
}
}
}
}
if (options.clientSubmit) {
__doPostBack(options.eventTarget, options.eventArgument);
}
}
答案 0 :(得分:4)
我认为机械化模块不可能实现这一点:它无法与JavaScript交互:它纯粹基于Python和HTTP。
那就是说,你可能会遇到python-spidermonkey模块,它似乎旨在让你做这种事情。根据它的网站,它的目标是让你
“从Python执行任意JavaScript代码。允许您在JavaScript VM中引用任意Python对象和函数”
我还没有使用过它,但它看起来确实会像你想要的那样,尽管它仍处于alpha状态。
答案 1 :(得分:0)
您可以使用cookielib设置Cookie
import mechanize
import cookielib
# add headers to your browser also
browser = mechanize.Browser()
browser.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
cj = cookielib.LWPCookieJar()
browser.set_cookiejar(cj)
我怀疑这现在甚至是相关的,但是哦,好吧:)。