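I have a few websites whose HTML I am trying to scrape.
Each site is laid out like this:
== combo box ==
== requested page ==
When an option is selected in the combo box, the page content changes. I want to get the current HTML after the reload has finished.
This is how the combo box is defined: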
<select name="ddlVersion" onchange="javascript:setTimeout('__doPostBack(\'ddlVersion\',\'\')', 0)" id="ddlVersion" style="height:29px;width:295px;">
    <option selected="selected" value="1.htm"> 1 </option>
    <option value="2.htm"> 2 </option>
    <option value="3.htm"> 3 </option>
</select>
The __doPostBack function in the page's HTML:
<script type="text/javascript">
//<![CDATA[
var theForm = document.forms['form1'];
if (!theForm) {
    theForm = document.form1;
}
function __doPostBack(eventTarget, eventArgument) {
    if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
        theForm.__EVENTTARGET.value = eventTarget;
        theForm.__EVENTARGUMENT.value = eventArgument;
        theForm.submit();
    }
}
//]]>
</script>
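As far as I can tell, __doPostBack only fills the two hidden fields __EVENTTARGET and __EVENTARGUMENT and then submits form1, so the same request could presumably be assembled by hand. Below is a minimal sketch of that idea, not a tested solution. It assumes the page carries the usual hidden inputs (for example __VIEWSTATE) and that the select's name equals the event target, as it does for ddlVersion above. The class and method names (PostBackSketch, BuildPostBackFields) and pageUrl are hypothetical, and the regex-based parsing only handles double-quoted attributes.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class PostBackSketch
{
    // Hypothetical helper: gathers the form fields that a browser would send
    // when __doPostBack(eventTarget, '') submits the form.
    public static Dictionary<string, string> BuildPostBackFields(
        string html, string eventTarget, string selectedValue)
    {
        var fields = new Dictionary<string, string>();

        // Copy every <input type="hidden"> name/value pair found in the page.
        foreach (Match input in Regex.Matches(html, @"<input\b[^>]*>", RegexOptions.IgnoreCase))
        {
            string tag = input.Value;
            if (!Regex.IsMatch(tag, @"\btype\s*=\s*""hidden""", RegexOptions.IgnoreCase))
                continue;

            Match name = Regex.Match(tag, @"\bname\s*=\s*""([^""]*)""", RegexOptions.IgnoreCase);
            Match value = Regex.Match(tag, @"\bvalue\s*=\s*""([^""]*)""", RegexOptions.IgnoreCase);
            if (name.Success)
            {
                string raw = value.Success ? value.Groups[1].Value : "";
                fields[name.Groups[1].Value] = WebUtility.HtmlDecode(raw);
            }
        }

        // Mirror what __doPostBack writes before calling theForm.submit().
        fields["__EVENTTARGET"] = eventTarget;
        fields["__EVENTARGUMENT"] = "";

        // Assumption: the <select> is posted under the same name as the event target.
        fields[eventTarget] = selectedValue;
        return fields;
    }
}

// Possible usage (pageUrl is a placeholder for the real page address):
// var fields = PostBackSketch.BuildPostBackFields(html, "ddlVersion", "3.htm");
// var response = await new System.Net.Http.HttpClient()
//     .PostAsync(pageUrl, new System.Net.Http.FormUrlEncodedContent(fields));
```

If this reasoning holds, the response body of that POST would be the reloaded page, without needing a WebBrowser control at all.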
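This is my code:
WebBrowser web = new WebBrowser();
// Load the previously downloaded HTML into the control
htmlCode = Scrapper.getHtmlCodeAsString();
web.Document.Write(htmlCode);
web.Refresh();
var element = web.Document.GetElementById("ddlVersion");
// Select the third option (index 2), then fire the onchange handler
element.SetAttribute("selectedIndex", "2");
element.InvokeMember("onchange");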
I also tried:
WebBrowser web = new WebBrowser();
web.Document.Write(htmlCode);
web.Refresh();
var element = web.Document.GetElementById("ddlVersion");
element.SetAttribute("selectedIndex", "2");
// Call __doPostBack directly instead of firing onchange
web.Document.InvokeScript("__doPostBack");