I'm using Mechanize to scrape a web page, and I've run into a problem with window.open.
The page has HTML like the following:
<form name="form1" method="post" action="index.asp">
<a href="javascript:yoyaku('0','main','https://example.com/xyz/')" class= "col">
<span class="font-M1b">Click link</span>
</a>
</form>
The page object in Mechanize looks like this:
[1] pry(#<Crawler::Inzai>)> agent.page
=> #<Mechanize::Page
...
{links
#<Mechanize::Page::Link "Click link" "javascript:yoyaku('0','main','https://example.com/xyz/')">
{forms
#<Mechanize::Form
{name "form1"}
{method "POST"}
{action "index.asp"}
{fields [hidden:0x4399a00 type: hidden name: MENULINK value: ] [hidden:0x4399870 type: hidden name: UPFLG value: ]}
>}>
The yoyaku JavaScript function called by the link is defined as follows:
function yoyaku(wkobj,wkvalue,wklnk){
    var wkurl = '';
    // jump to here
    if (wkvalue != '' && document.links[wkobj].disabled != true){
        for (cnt = 0; cnt < document.links.length; cnt++){
            document.links[cnt].disabled = true;
        }
        // return 'https://example.com/xyz/logon.asp?LINK=showmenu'
        wkurl = wklnk + 'logon.asp?LINK=showmenu';
        window.open(wkurl,'SisetuReserveIac','location=no,menubar=no,status=no,titlebar=no,toolbar=no,scrollbars=yes,resizable=yes,width=1024,height='+(screen.availHeight)+',top=0');
        for (cnt = 0; cnt < document.links.length; cnt++){
            document.links[cnt].disabled = false;
        }
    }
    return;
}
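To make clear what I think the function does, here is a minimal Ruby sketch that rebuilds the same URL from the link's href without running any JavaScript. The helper name yoyaku_url and the regular expression are my own and purely illustrative; the sketch also ignores the document.links[wkobj].disabled check, since that state only exists in a browser.

```ruby
# Hypothetical helper (not part of Mechanize): rebuild the URL that yoyaku()
# passes to window.open, using only the href string of the link.
YOYAKU_HREF = /\Ajavascript:yoyaku\('([^']*)','([^']*)','([^']*)'\)\z/

def yoyaku_url(href)
  match = YOYAKU_HREF.match(href)
  return nil unless match

  _wkobj, wkvalue, wklnk = match.captures
  # yoyaku() does nothing when wkvalue is empty.
  return nil if wkvalue.empty?

  # Same concatenation as in the JavaScript function.
  wklnk + 'logon.asp?LINK=showmenu'
end

href = "javascript:yoyaku('0','main','https://example.com/xyz/')"
puts yoyaku_url(href)
```

With a real page, the href would presumably come from something like agent.page.link_with(text: 'Click link').href instead of the hard-coded string.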
When I click the "Click link" link above, I expect a new window to open with a page that contains 2 frames:
<frame src="menu.asp" name="menu" frameborder="0" noresize="">
...
<frame>
<frame src="top.asp" name="contents" frameborder="0" noresize="">
<form name="form1" method="post" action="top.asp">
...
</form>
<frame>
For now, I use Mechanize to request the URL that the JavaScript function builds, like this:
agent.get("https://example.com/xyz/logon.asp?LINK=showmenu")
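However, what I get back is only a single page (the frameset itself), with the frames listed but their contents not loaded: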
[1] pry(#<Crawler::Inzai>)> agent.page
=> #<Mechanize::Page
...
{frames #<Mechanize::Page::Frame "menu" "menu.asp"> #<Mechanize::Page::Frame "contents" "top.asp">}>
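One thing I am considering is to request each frame explicitly, since Mechanize does not seem to load frame documents on its own. The sketch below is only an idea under that assumption: load_frames is a hypothetical helper name of mine, and it relies on Mechanize::Page::Frame#name and #content, which fetch the frame's src through the same agent. I have not confirmed that the site accepts these requests outside the popup window.

```ruby
# Hypothetical helper: fetch the document behind each frame of a frameset page.
# `page` is expected to be a Mechanize::Page (anything responding to #frames works).
def load_frames(page)
  page.frames.each_with_object({}) do |frame, loaded|
    # Mechanize::Page::Frame#content requests the frame's src via the same agent,
    # so cookies from the earlier requests are sent along.
    loaded[frame.name] = frame.content
  end
end

# Intended usage (needs the mechanize gem and network access):
#
#   require 'mechanize'
#   agent = Mechanize.new
#   page  = agent.get('https://example.com/xyz/logon.asp?LINK=showmenu')
#   pages = load_frames(page)
#   form  = pages['contents'].form_with(name: 'form1')
```

If this works, pages['menu'] and pages['contents'] would be ordinary Mechanize::Page objects for menu.asp and top.asp, so form1 in the "contents" frame could be filled in and submitted as usual.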
Can you help?
Is it possible to use Mechanize in this situation?
Thank you very much!