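I am trying to get search results from a digital library by accessing the following URL:
This URL works fine from any web browser, but when I try to read it from my Java application, it returns the following HTML file, which appears to redirect the application to another page: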
<!-- filename: sso -->
<html>
<head>
<title>Login </title>
<!-- START filename: meta-tags.pds -->
<meta http-equiv="Cache-Control" content="no-cache" />
<meta http-equiv="Pragma" content="no-cache" />
<meta http-equiv="Expires" content="Sun, 06 Nov 1994 08:49:37 GMT" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<!-- END filename: meta-tags.pds -->
<link rel="stylesheet" href="http://monash-dc05.hosted.exlibrisgroup.com:8991/PDSMExlibris.css" type="text/css" />
</head>
<body onload="location = '/goto/http://search.lib.monash.edu:80/primo_library/libweb/action/login.do?afterPDS=true&vid=MON&vid=MON&dscnt=0&targetURL=http%3A%2F%2Fsearch.lib.monash.edu%2Fprimo_library%2Flibweb%2Faction%2Fsearch.do%3Fdscnt%3D0&frbg=&tab=default%5Ftab&dstmp=1397132076758&srt=rank&ct=search&mode=Basic&dum=true&indx=1&tb=&vl%28freeText0%29=java&fn=search&pds_handle=GUEST';">
<noscript>
<div id="header">
<div>
<img src="http://monash-dc05.hosted.exlibrisgroup.com:8991//exlibris/primo/p4_1/pds/html_form/icon/exlibrislogo.jpg" alt="Exlibris Logo" />
<p> </p>
</div>
</div>
<div id="connect">
<a href="/goto/http://search.lib.monash.edu:80/primo_library/libweb/action/login.do?afterPDS=true&vid=MON&vid=MON&dscnt=0&targetURL=http%3A%2F%2Fsearch.lib.monash.edu%2Fprimo_library%2Flibweb%2Faction%2Fsearch.do%3Fdscnt%3D0&frbg=&tab=default%5Ftab&dstmp=1397132076758&srt=rank&ct=search&mode=Basic&dum=true&indx=1&tb=&vl%28freeText0%29=java&fn=search&pds_handle=GUEST">Return from Check SSO </a>
</div>
</noscript>
</body>
</html>
I hardcoded the page that my application is redirected to. The code is simple:
String url="http://search.lib.monash.edu:80/primo_library/libweb/action/login.do?afterPDS=true&vid=MON&vid=MON&dscnt=0&targetURL=http%3A%2F%2Fsearch.lib.monash.edu%2Fprimo_library%2Flibweb%2Faction%2Fsearch.do%3Fdscnt%3D0&frbg=&tab=default%5Ftab&dstmp=1397132076758&srt=rank&ct=search&mode=Basic&dum=true&indx=1&tb=&vl%28freeText0%29=java&fn=search&pds_handle=GUEST";
Document d=Jsoup.connect(url).timeout(60000).get();
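The page the application is redirected to (defined in the body onload attribute) is not available.
My question is: how can I fetch the HTML file from the URL above in my Java application, the same way I get it in a browser?
This digital library has no API or any publicly exposed service; otherwise I would use them.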
Answer 0 (score: 0)
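In the last piece of code (String url = ".."), replace every &amp; with a plain &. The &amp; form is HTML escaping and must not appear in the URL string passed to Jsoup.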
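To make the suggestion concrete, here is a minimal sketch using only the Java standard library. It assumes the URL was copied out of HTML source, where & is written as &amp;, and that the redirect target sits in the body onload attribute behind a /goto/ prefix, as in the HTML the question shows. The class RedirectUrls and its method names are hypothetical helpers made up for illustration; they are not part of Jsoup or of the library's site. Dropping the /goto/ prefix only mirrors what the question's hardcoded URL already does and may not be what the server expects.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Jsoup or the library's site.
final class RedirectUrls {

    // Matches the single-quoted value in: location = '<target>'
    private static final Pattern ONLOAD_LOCATION =
            Pattern.compile("location\\s*=\\s*'([^']*)'");

    // Assumption taken from the HTML in the question: targets start with "/goto/".
    private static final String GOTO_PREFIX = "/goto/";

    // Turns the HTML-escaped "&amp;" back into the "&" that a URL needs.
    static String unescapeAmpersands(String url) {
        return url.replace("&amp;", "&");
    }

    // Returns the redirect target from an onload handler, or empty if there is none.
    static Optional<String> extractOnloadTarget(String html) {
        Matcher m = ONLOAD_LOCATION.matcher(html);
        if (!m.find()) {
            return Optional.empty();
        }
        String target = unescapeAmpersands(m.group(1));
        if (target.startsWith(GOTO_PREFIX)) {
            target = target.substring(GOTO_PREFIX.length());
        }
        return Optional.of(target);
    }

    public static void main(String[] args) {
        // Shortened stand-in for the page returned to the Java application.
        String html = "<body onload=\"location = '/goto/http://example.org/login.do?a=1&amp;b=2';\">";
        System.out.println(extractOnloadTarget(html).orElse("no redirect found"));
    }
}
```

The resulting string can then be passed to Jsoup.connect(url).timeout(60000).get() exactly as in the question. Whether the server then returns the search results may also depend on session state that a browser carries along, which this sketch does not handle.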