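I'm a bit out of sync with web programming, and getting into HtmlUnit has been more confusing than I expected.
Essentially, I missed my registration window for a class, so I need to be notified when a seat opens up. Before I can reach the page that dumps that information, though, I have to submit a form with two radio inputs (the choices being "Spring Semester 2019" and "All Classes").
I'm in an odd spot where I want to learn more but also need a working script, so an answer combined with some resources I may not have come across would be great! For example, once I get to the next page, how do I download the raw HTML file and access the data I need, such as the number of seats available in class xyz?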
https://mystudentrecord.ucmerced.edu/pls/PROD/xhwschedule.p_selectsubject
Here's the little throwaway program I wrote to get my feet wet:
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlRadioButtonInput;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;
import java.io.IOException;
import java.net.MalformedURLException;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.Page;

public class hateMerced {

    public void submittingForm() throws Exception {
    }

    public static void main(final String[] args) throws IOException {
        final WebClient webClient = new WebClient();

        // Get the first page
        HtmlPage page1 = webClient.getPage("https://mystudentrecord.ucmerced.edu/pls/PROD/xhwschedule.p_selectsubject");

        // Get the form that we are dealing with and within that form,
        // find the submit button and the field that we want to change.
        final HtmlForm form = page1.getFormByName("xhwschedule.P_ViewSchedule");
        HtmlRadioButtonInput radioButton = (HtmlRadioButtonInput) page1.getElementById("201910");
        radioButton.setChecked(true);
        HtmlRadioButtonInput radioButton2 = (HtmlRadioButtonInput) page1.getElementById("N");
        radioButton2.setChecked(true);
        final HtmlSubmitInput button = form.getInputByName("View Class Schedule");

        // Now submit the form by clicking the button and get back the second page.
        // final HtmlPage page2 = button.click();
        webClient.close();
    }
}
Here's the lovely error I get:
Jan 16, 2019 1:09:57 AM com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter error
SEVERE: error: message=[illegally formed XML syntax] sourceName=[script in https://mystudentrecord.ucmerced.edu/pls/PROD/xhwschedule.p_selectsubject from (11, 54) to (39, 10)] line=[38] lineSource=[// End script hiding -->] lineOffset=[24]
Exception in thread "main" ======= EXCEPTION START ========
Exception class=[net.sourceforge.htmlunit.corejs.javascript.EvaluatorException]
com.gargoylesoftware.htmlunit.ScriptException: illegally formed XML syntax (script in https://mystudentrecord.ucmerced.edu/pls/PROD/xhwschedule.p_selectsubject from (11, 54) to (39, 10)#38)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:892)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:616)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:534)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.compile(JavaScriptEngine.java:723)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.compile(JavaScriptEngine.java:689)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:735)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScript(HtmlPage.java:922)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeInlineScriptIfNeeded(HtmlScript.java:316)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:396)
at com.gargoylesoftware.htmlunit.html.HtmlScript$2.execute(HtmlScript.java:246)
at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:267)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:802)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:758)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1194)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1134)
at net.sourceforge.htmlunit.cyberneko.filters.DefaultFilter.endElement(DefaultFilter.java:221)
at net.sourceforge.htmlunit.cyberneko.filters.NamespaceBinder.endElement(NamespaceBinder.java:314)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3179)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2132)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner.scanDocument(HTMLScanner.java:939)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:452)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:403)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:1001)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:250)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:196)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:267)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:158)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:531)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:398)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:315)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:466)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:448)
at FuckMerced.main(FuckMerced.java:34)
Caused by: net.sourceforge.htmlunit.corejs.javascript.EvaluatorException: illegally formed XML syntax (script in https://mystudentrecord.ucmerced.edu/pls/PROD/xhwschedule.p_selectsubject from (11, 54) to (39, 10)#38)
at com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter.error(StrictErrorReporter.java:65)
at net.sourceforge.htmlunit.corejs.javascript.Parser.addError(Parser.java:260)
at net.sourceforge.htmlunit.corejs.javascript.Parser.addError(Parser.java:232)
at net.sourceforge.htmlunit.corejs.javascript.Parser.addError(Parser.java:228)
at net.sourceforge.htmlunit.corejs.javascript.TokenStream.getNextXMLToken(TokenStream.java:1287)
at net.sourceforge.htmlunit.corejs.javascript.TokenStream.getFirstXMLToken(TokenStream.java:1136)
at net.sourceforge.htmlunit.corejs.javascript.Parser.xmlInitializer(Parser.java:2666)
at net.sourceforge.htmlunit.corejs.javascript.Parser.unaryExpr(Parser.java:2641)
at net.sourceforge.htmlunit.corejs.javascript.Parser.mulExpr(Parser.java:2568)
at net.sourceforge.htmlunit.corejs.javascript.Parser.addExpr(Parser.java:2552)
at net.sourceforge.htmlunit.corejs.javascript.Parser.shiftExpr(Parser.java:2533)
at net.sourceforge.htmlunit.corejs.javascript.Parser.relExpr(Parser.java:2508)
at net.sourceforge.htmlunit.corejs.javascript.Parser.eqExpr(Parser.java:2480)
at net.sourceforge.htmlunit.corejs.javascript.Parser.bitAndExpr(Parser.java:2469)
at net.sourceforge.htmlunit.corejs.javascript.Parser.bitXorExpr(Parser.java:2458)
at net.sourceforge.htmlunit.corejs.javascript.Parser.bitOrExpr(Parser.java:2447)
at net.sourceforge.htmlunit.corejs.javascript.Parser.andExpr(Parser.java:2436)
at net.sourceforge.htmlunit.corejs.javascript.Parser.orExpr(Parser.java:2425)
at net.sourceforge.htmlunit.corejs.javascript.Parser.condExpr(Parser.java:2389)
at net.sourceforge.htmlunit.corejs.javascript.Parser.assignExpr(Parser.java:2345)
at net.sourceforge.htmlunit.corejs.javascript.Parser.expr(Parser.java:2324)
at net.sourceforge.htmlunit.corejs.javascript.Parser.statementHelper(Parser.java:1282)
at net.sourceforge.htmlunit.corejs.javascript.Parser.statement(Parser.java:1136)
at net.sourceforge.htmlunit.corejs.javascript.Parser.parse(Parser.java:673)
at net.sourceforge.htmlunit.corejs.javascript.Parser.parse(Parser.java:594)
at net.sourceforge.htmlunit.corejs.javascript.Context.compileImpl(Context.java:2601)
at net.sourceforge.htmlunit.corejs.javascript.Context.compileString(Context.java:1583)
at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory$TimeoutContext.compileString(HtmlUnitContextFactory.java:216)
at net.sourceforge.htmlunit.corejs.javascript.Context.compileString(Context.java:1572)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$1.doRun(JavaScriptEngine.java:714)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:877)
... 34 more
Enclosed exception:
net.sourceforge.htmlunit.corejs.javascript.EvaluatorException: illegally formed XML syntax (script in https://mystudentrecord.ucmerced.edu/pls/PROD/xhwschedule.p_selectsubject from (11, 54) to (39, 10)#38)
at com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter.error(StrictErrorReporter.java:65)
at net.sourceforge.htmlunit.corejs.javascript.Parser.addError(Parser.java:260)
at net.sourceforge.htmlunit.corejs.javascript.Parser.addError(Parser.java:232)
at net.sourceforge.htmlunit.corejs.javascript.Parser.addError(Parser.java:228)
at net.sourceforge.htmlunit.corejs.javascript.TokenStream.getNextXMLToken(TokenStream.java:1287)
at net.sourceforge.htmlunit.corejs.javascript.TokenStream.getFirstXMLToken(TokenStream.java:1136)
at net.sourceforge.htmlunit.corejs.javascript.Parser.xmlInitializer(Parser.java:2666)
at net.sourceforge.htmlunit.corejs.javascript.Parser.unaryExpr(Parser.java:2641)
at net.sourceforge.htmlunit.corejs.javascript.Parser.mulExpr(Parser.java:2568)
at net.sourceforge.htmlunit.corejs.javascript.Parser.addExpr(Parser.java:2552)
at net.sourceforge.htmlunit.corejs.javascript.Parser.shiftExpr(Parser.java:2533)
at net.sourceforge.htmlunit.corejs.javascript.Parser.relExpr(Parser.java:2508)
at net.sourceforge.htmlunit.corejs.javascript.Parser.eqExpr(Parser.java:2480)
at net.sourceforge.htmlunit.corejs.javascript.Parser.bitAndExpr(Parser.java:2469)
at net.sourceforge.htmlunit.corejs.javascript.Parser.bitXorExpr(Parser.java:2458)
at net.sourceforge.htmlunit.corejs.javascript.Parser.bitOrExpr(Parser.java:2447)
at net.sourceforge.htmlunit.corejs.javascript.Parser.andExpr(Parser.java:2436)
at net.sourceforge.htmlunit.corejs.javascript.Parser.orExpr(Parser.java:2425)
at net.sourceforge.htmlunit.corejs.javascript.Parser.condExpr(Parser.java:2389)
at net.sourceforge.htmlunit.corejs.javascript.Parser.assignExpr(Parser.java:2345)
at net.sourceforge.htmlunit.corejs.javascript.Parser.expr(Parser.java:2324)
at net.sourceforge.htmlunit.corejs.javascript.Parser.statementHelper(Parser.java:1282)
at net.sourceforge.htmlunit.corejs.javascript.Parser.statement(Parser.java:1136)
at net.sourceforge.htmlunit.corejs.javascript.Parser.parse(Parser.java:673)
at net.sourceforge.htmlunit.corejs.javascript.Parser.parse(Parser.java:594)
at net.sourceforge.htmlunit.corejs.javascript.Context.compileImpl(Context.java:2601)
at net.sourceforge.htmlunit.corejs.javascript.Context.compileString(Context.java:1583)
at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory$TimeoutContext.compileString(HtmlUnitContextFactory.java:216)
at net.sourceforge.htmlunit.corejs.javascript.Context.compileString(Context.java:1572)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$1.doRun(JavaScriptEngine.java:714)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:877)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:616)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:534)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.compile(JavaScriptEngine.java:723)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.compile(JavaScriptEngine.java:689)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:735)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScript(HtmlPage.java:922)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeInlineScriptIfNeeded(HtmlScript.java:316)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:396)
at com.gargoylesoftware.htmlunit.html.HtmlScript$2.execute(HtmlScript.java:246)
at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:267)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:802)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:758)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1194)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1134)
at net.sourceforge.htmlunit.cyberneko.filters.DefaultFilter.endElement(DefaultFilter.java:221)
at net.sourceforge.htmlunit.cyberneko.filters.NamespaceBinder.endElement(NamespaceBinder.java:314)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3179)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2132)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner.scanDocument(HTMLScanner.java:939)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:452)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:403)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:1001)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:250)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:196)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:267)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:158)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:531)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:398)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:315)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:466)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:448)
at FuckMerced.main(FuckMerced.java:34)
== CALLING JAVASCRIPT ==
<!-- Hide JavaScript from older browsers
var submitcount=0;
function checkSubmit() {
if (submitcount == 0)
{
submitcount++;
return true;
}
else
{
alert("Your changes have already been submitted.");
return false;
}
}
// End script hiding -->
<script type="text/javascript">
<!-- Hide JavaScript from older browsers
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-31337262-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
// End script hiding -->
======= EXCEPTION END ========
In case the link misbehaves, here is the HTML of the form I'm trying to access:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML lang="en">
<HEAD>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
<META HTTP-EQUIV="Pragma" NAME="Cache-Control" CONTENT="no-cache">
<META HTTP-EQUIV="Cache-Control" NAME="Cache-Control" CONTENT="no-cache">
<LINK REL="stylesheet" HREF="/css/web_defaultapp.css" TYPE="text/css">
<LINK REL="stylesheet" HREF="/css/web_defaultprint.css" TYPE="text/css" media="print">
<TITLE>Search Courses by Subject</TITLE>
<META HTTP-EQUIV="Content-Script-Type" NAME="Default_Script_Language" CONTENT="text/javascript">
<SCRIPT LANGUAGE="JavaScript" TYPE="text/javascript">
<!-- Hide JavaScript from older browsers
var submitcount=0;
function checkSubmit() {
if (submitcount == 0)
{
submitcount++;
return true;
}
else
{
alert("Your changes have already been submitted.");
return false;
}
}
// End script hiding -->
<script type="text/javascript">
<!-- Hide JavaScript from older browsers
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-31337262-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
// End script hiding -->
</script>
</HEAD>
<BODY>
<DIV class="headerwrapperdiv">
<TABLE CLASS="plaintable" SUMMARY="This table displays Menu Items and Banner Search textbox."
WIDTH="100%">
<TR>
<TD CLASS="pldefault"></TD>
<TD CLASS="pldefault"><p class="rightaligntext"></p>
</TD></TR></TABLE>
</DIV>
<DIV class="pagetitlediv">
<TABLE CLASS="plaintable" SUMMARY="This table displays title and static header displays."
WIDTH="100%">
<TR>
<TD CLASS="pldefault"><br /><br /><br /></TD>
<TD CLASS="pldefault"> </TD>
<TD CLASS="pldefault"><p class="rightaligntext"></p>
<DIV class="staticheaders">
</div>
</TD></TR><TR>
<TD width="100%" colSpan=3> </TD>
</TR></TABLE>
</DIV>
<DIV class="pagebodydiv">
UC Merced Schedule--Search Courses by Term or Subject <H4>
Interested in UC Online courses offered at other UC campuses? Check out information at <a href="http://crossenroll.universityofcalifornia.edu/" target="_blank">UC Online</a>.
<FORM ACTION="xhwschedule.P_ViewSchedule" METHOD="post">
<TABLE CLASS="plaintable" >
<TR>
<TD COLSPAN="2" CLASS="pldefault">Select a Term:</TD>
</TR>
<TR>
<TD CLASS="pldefault">
<INPUT TYPE="radio" NAME="validterm" VALUE="201820" CHECKED>
<TD CLASS="pldefault">Summer Semester 2018 - All Courses</TD>
</TR>
<TR>
<TD CLASS="pldefault">
<INPUT TYPE="radio" NAME="validterm" VALUE="201820 - S6" CHECKED>
<TD CLASS="pldefault">Summer Semester 2018 - First 6-week Summer Session</TD>
</TR>
<TR>
<TD CLASS="pldefault">
<INPUT TYPE="radio" NAME="validterm" VALUE="201820 - S62" CHECKED>
<TD CLASS="pldefault">Summer Semester 2018 - Second 6-week Summer Session</TD>
</TR>
<TR>
<TD CLASS="pldefault">
<INPUT TYPE="radio" NAME="validterm" VALUE="201820 - S8" CHECKED>
<TD CLASS="pldefault">Summer Semester 2018 - 8-week Summer Session</TD>
</TR>
<TR>
<TD CLASS="pldefault">
<INPUT TYPE="radio" NAME="validterm" VALUE="201830" CHECKED>
<TD CLASS="pldefault">Fall Semester 2018</TD>
</TR>
<TR>
<TD CLASS="pldefault">
<INPUT TYPE="radio" NAME="validterm" VALUE="201910" CHECKED>
<TD CLASS="pldefault">Spring Semester 2019</TD>
</TR>
</SELECT>
<BR>
<TR>
<TD CLASS="pldefault"> </TD>
</TR>
<TR>
<TD CLASS="pldefault">Subject:</TD>
<TD CLASS="pldefault">
<SELECT NAME="subjcode">
<OPTION VALUE="ALL">All Subjects
<OPTION VALUE="ANTH">Anthropology
<OPTION VALUE="BEST">Bio Engin Small Scale Tech
<OPTION VALUE="BIOE">Bioengineering
<OPTION VALUE="BIO">Biological Sciences
<OPTION VALUE="CHEM">Chemistry
<OPTION VALUE="CCST">Chicano Chicana Studies
<OPTION VALUE="CHN">Chinese
<OPTION VALUE="COGS">Cognitive Science
<OPTION VALUE="CRS">Community Research and Service
<OPTION VALUE="CSE">Computer Science & Engineering
<OPTION VALUE="CORE">Core
<OPTION VALUE="CRES">Critical Race & Ethnic Studies
<OPTION VALUE="ESS">Earth Systems Science
<OPTION VALUE="ECON">Economics
<OPTION VALUE="EDUC">Education
<OPTION VALUE="EECS">Elect. Engr. & Comp. Sci.
<OPTION VALUE="ENGR">Engineering
<OPTION VALUE="ENG">English
<OPTION VALUE="ENVE">Environmental Engineering
<OPTION VALUE="ES">Environmental Systems (GR)
<OPTION VALUE="FRE">French
<OPTION VALUE="GEOG">Geography
<OPTION VALUE="GASP">Global Arts Studies Program
<OPTION VALUE="HIST">History
<OPTION VALUE="HBIO">Human Biology
<OPTION VALUE="IH">Interdisciplinary Humanities
<OPTION VALUE="JPN">Japanese
<OPTION VALUE="MGMT">Management
<OPTION VALUE="MBSE">Materials & BioMat Sci & Engr
<OPTION VALUE="MSE">Materials Science & Engr
<OPTION VALUE="MATH">Mathematics
<OPTION VALUE="ME">Mechanical Engineering
<OPTION VALUE="MIST">Mgmt of Innov, Sust, and Tech
<OPTION VALUE="NSUS">Nat Sciences Undergrad Studies
<OPTION VALUE="NSED">Natural Sciences Education
<OPTION VALUE="PHIL">Philosophy
<OPTION VALUE="PHYS">Physics
<OPTION VALUE="POLI">Political Science
<OPTION VALUE="PSY">Psychology
<OPTION VALUE="PH">Public Health
<OPTION VALUE="PUBP">Public Policy
<OPTION VALUE="QSB">Quantitative & Systems Biology
<OPTION VALUE="SCS">Social Sciences
<OPTION VALUE="SOC">Sociology
<OPTION VALUE="SPAN">Spanish
<OPTION VALUE="SPRK">Spark
<OPTION VALUE="USTU">Undergraduate Studies
<OPTION VALUE="WCH">World Cultures & History
<OPTION VALUE="WH">World Heritage
<OPTION VALUE="WRI">Writing
</SELECT>
</TR>
<TR>
<TD CLASS="pldefault">
<INPUT TYPE="radio" NAME="openclasses" VALUE="Y" CHECKED>
<TD CLASS="pldefault">Open Classes Only</TD>
</TR>
<TR>
<TD CLASS="pldefault">
<INPUT TYPE="radio" NAME="openclasses" VALUE="N">
<TD CLASS="pldefault">All Classes</TD>
</TR>
</TABLE>
<BR>
<BR>
<INPUT TYPE="submit" VALUE="View Class Schedule">
</FORM>
<!-- ** START OF twbkwbis.P_CloseDoc ** -->
<TABLE CLASS="plaintable" SUMMARY="This is table displays line separator at end of the page."
WIDTH="100%" cellSpacing=0 cellPadding=0 border=0><TR><TD class="bgtabon" width="100%" colSpan=2><IMG SRC="/wtlgifs/web_transparent.gif" ALT="Transparent Image" CLASS="headerImg" TITLE="Transparent Image" NAME="web_transparent" HSPACE=0 VSPACE=0 BORDER=0 HEIGHT=3 WIDTH=10></TD></TR></TABLE>
<A HREF="#top" onMouseover="window.status='Skip to top of page'; return true" onMouseout="window.status=''; return true" OnFocus="window.status='Skip to top of page'; return true" onBlur="window.status=''; return true" class="skiplinks">Skip to top of page</A>
</DIV>
<DIV class="footerbeforediv">
</DIV>
<DIV class="footerafterdiv">
</DIV>
<DIV class="globalafterdiv">
</DIV>
<DIV class="globalfooterdiv">
</DIV>
<DIV class="pagefooterdiv">
<SPAN class="releasetext">Release: 7.3 - Developed by UCM SIS</SPAN>
</DIV>
<DIV class="poweredbydiv">
</DIV>
<DIV class="div1"></DIV>
<DIV class="div2"></DIV>
<DIV class="div3"></DIV>
<DIV class="div4"></DIV>
<DIV class="div5"></DIV>
<DIV class="div6"></DIV>
<div class="banner_copyright"> <br><h5>© 2019 Ellucian Company L.P. and its affiliates.<br></h5></div>
</BODY>
</HTML>
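Sorry for the lengthy question; I hope I can give back to the community someday :)
Answer 0 (score: 2)
"I'm a bit out of sync with web programming, and getting into HtmlUnit has been more confusing than I expected."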
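If you want to automate web pages these days, you need a basic understanding of the underlying web technologies (HTML, JavaScript, and HTTP itself) to figure out what to do.
Let's start from the top, with your "lovely error".
First, open the page in a real browser and look at the web console. You will see the same error there; this means the page you are trying to automate contains a bug (at least one), and your browser simply ignores it. HtmlUnit was created as a testing tool, so it is pickier about errors and throws an exception on script errors by default. You have to turn that behavior off: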
webClient.getOptions().setThrowExceptionOnScriptError(false);
Next: you are trying to access the form on the page:
<FORM ACTION="xhwschedule.P_ViewSchedule" METHOD="post">
As the method name implies, 'getFormByName()' finds a form with a matching name attribute, but your form has no name attribute at all.
Next:
<INPUT TYPE="radio" NAME="validterm" VALUE="201910" CHECKED>
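As the method name implies, 'getElementById("201910")' finds an element with a matching id attribute, but your radio button has no id; "201910" is its value.
The same goes for the button: "View Class Schedule" is its value, not its name. Below you can find a quick hack that gets this working. It would probably help to read at least the "Getting Started with HtmlUnit" page on the HtmlUnit site. The Javadoc, with its detailed descriptions, is also available.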
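Before the full program, here is a small JDK-only sketch of the name/id mismatch described above. It runs XPath lookups against a trimmed, well-formed stand-in for the form; the `FORM` string and the `AttributeLookupDemo` class are made up for illustration, and the real page is HTML 4 that would not parse as XML. Lookups by `name` or `id` find nothing, while lookups by `action` or `value` succeed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class AttributeLookupDemo {

    // Trimmed, well-formed stand-in for the form on the page (illustration only).
    static final String FORM =
          "<form action='xhwschedule.P_ViewSchedule' method='post'>"
        + "<input type='radio' name='validterm' value='201910'/>"
        + "<input type='radio' name='openclasses' value='N'/>"
        + "<input type='submit' value='View Class Schedule'/>"
        + "</form>";

    // Parses the markup into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("stand-in markup did not parse", e);
        }
    }

    // Returns the first element matching the XPath expression, or null if none does.
    static Element first(Document doc, String expression) {
        try {
            return (Element) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(FORM);
        // The form has no name attribute and the radio button has no id: both are null.
        System.out.println(first(doc, "//form[@name='xhwschedule.P_ViewSchedule']"));
        System.out.println(first(doc, "//*[@id='201910']"));
        // Matching on the attributes that actually exist works.
        System.out.println(first(doc, "//form[@action='xhwschedule.P_ViewSchedule']") != null);
        System.out.println(first(doc, "//input[@value='201910']").getAttribute("name"));
    }
}
```

HtmlUnit's DomNode offers getByXPath and getFirstByXPath, so similar expressions should carry over, though I have not tried them against this page.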
Hope that helps.
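import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.DomElement;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlRadioButtonInput;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;
import java.io.IOException;

// The class name is arbitrary.
public class ViewSchedule {

    public static void main(String[] args) throws IOException {
        String url = "https://mystudentrecord.ucmerced.edu/pls/PROD/xhwschedule.p_selectsubject";

        try (final WebClient webClient = new WebClient()) {
            // The page contains a script error; do not fail on it.
            webClient.getOptions().setThrowExceptionOnScriptError(false);

            HtmlPage page = webClient.getPage(url);
            webClient.waitForBackgroundJavaScript(1000);
            page = (HtmlPage) webClient.getCurrentWindow().getEnclosedPage();

            // The form has no name attribute, so take the first (and only) form.
            final HtmlForm form = page.getForms().get(0);

            // The radio buttons have no id, so select them by their value attribute.
            for (DomElement elem : form.getElementsByTagName("INPUT")) {
                if (elem instanceof HtmlRadioButtonInput) {
                    HtmlRadioButtonInput radioButton = (HtmlRadioButtonInput) elem;
                    if ("201910".equals(radioButton.getValueAttribute())
                            || "N".equals(radioButton.getValueAttribute())) {
                        radioButton.setChecked(true);
                    }
                }
            }

            // The submit button has no name either; find it by its value and click it.
            for (DomElement elem : form.getElementsByTagName("INPUT")) {
                if (elem instanceof HtmlSubmitInput) {
                    if ("View Class Schedule".equals(elem.getAttribute("value"))) {
                        elem.click();
                        break; // the form is submitted; stop iterating over the old page
                    }
                }
            }

            // After the click, the current window holds the result page.
            webClient.waitForBackgroundJavaScript(1000);
            page = (HtmlPage) webClient.getCurrentWindow().getEnclosedPage();

            System.out.println("----------------");
            System.out.println(page.asXml());
        }
    }
}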
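As a side note on the HTTP level: the form above is a plain POST to xhwschedule.P_ViewSchedule with the fields validterm, subjcode, and openclasses, so the request it produces can also be sketched with the JDK alone (Java 11+). This is only a sketch under assumptions: I am assuming the server accepts the POST without cookies or JavaScript, and the class name `SchedulePost`, the helper names, and the output file argument are made up for illustration. It also shows one way to save the raw HTML of the response to a file.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class SchedulePost {

    // Form action resolved against the page URL from the question.
    static final String ACTION_URL =
        "https://mystudentrecord.ucmerced.edu/pls/PROD/xhwschedule.P_ViewSchedule";

    // Encodes the fields as an application/x-www-form-urlencoded body.
    static String formBody(Map<String, String> fields) {
        return fields.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    // Field values for "Spring Semester 2019", all subjects, "All Classes".
    static Map<String, String> springAllClasses() {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("validterm", "201910");
        fields.put("subjcode", "ALL");
        fields.put("openclasses", "N");
        return fields;
    }

    // Sends the POST and writes the raw response body to the given file.
    // Assumption: no session cookie or JavaScript is required by the server.
    static void postAndSave(Path target) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(ACTION_URL))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(formBody(springAllClasses())))
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        Files.writeString(target, response.body());
    }

    public static void main(String[] args) throws Exception {
        // Always show the body that would be sent.
        System.out.println(formBody(springAllClasses()));
        // Only touch the network when an output file is given.
        if (args.length > 0) {
            postAndSave(Path.of(args[0]));
        }
    }
}
```

Running it without arguments only prints the encoded body; passing a file name such as schedule.html sends the request and stores the response there. With HtmlUnit, the string returned by page.asXml() in the program above could be written to a file the same way with Files.writeString.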