My problem is that I can't figure out how to use Java to get the generated HTML page from a link. This is the code I'm using:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class URLReader {
    public static void main(String[] args) throws Exception {
        URL oracle = new URL("http://www.whalesonggames.com/oldforums/printthread.php?t=7495&pp=20&page=1");
        BufferedReader in = new BufferedReader(new InputStreamReader(oracle.openStream()));
        String inputLine;
        // Print the response body line by line.
        while ((inputLine = in.readLine()) != null)
            System.out.println(inputLine);
        in.close();
    }
}
What I want it to print is:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en" id="vbulletin_html">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
<base href="http://www.whalesonggames.com/oldforums/" /><!--[if IE]></base><![endif]-->
<meta name="generator" content="vBulletin 4.2.2" />
<link rel="stylesheet" type="text/css" href="css.php?styleid=3&langid=1&d=1381351020&td=ltr&sheet=bbcode.css,popupmenu.css,printthread.css,vbulletin.css,vbulletin-chrome.css" />
<title> transfers</title>
<link rel="stylesheet" type="text/css" href="css.php?styleid=3&langid=1&d=1381351020&td=ltr&sheet=additional.css" />
</head>
<body>
<div class="above_body">
<div id="header" class="floatcontainer">
<div><a name="top" href="forum.php" class="logo-image"><img src="images/misc/vbulletin4_logo.png" alt="The Infinite Black Forums - Powered by vBulletin" /></a></div>
</div>
</div>
<div class="body_wrapper">
<div id="pagetitle">
<h1><a href="showthread.php?7495-transfers">transfers</a></h1>
<p class="description">Printable View</p>
</div>
<ul id="postlist">
<li class="postbit blockbody" id="post_1">
<div class="header">
<div class="datetime">04-10-2014, 06:59 AM</div>
<span class="username">CaNc3r</span>
</div>
<div class="title">transfers</div>
<div class="content">
<blockquote class="restore">just wondering if we get our garrisons transfered also now? thank you.</blockquote>
</div>
</li><li class="postbit blockbody" id="post_2">
<div class="header">
<div class="datetime">04-10-2014, 08:03 AM</div>
<span class="username">replicatorz</span>
</div>
<div class="content">
<blockquote class="restore">More at login says you can claim your grey corp with transfer.<br />
<br />
I am wondering what will happen now that I sold both sald corps in blue after claiming them on grey. I suppose for now I will leave them undeployed/empty.</blockquote>
</div>
</li><li class="postbit blockbody" id="post_3">
<div class="header">
<div class="datetime">04-10-2014, 08:07 AM</div>
<span class="username">scoutsniper</span>
</div>
<div class="content">
<blockquote class="restore">I'd like some clarification as well. When grey server opened GNG sent a lead at to grey to hold our spot. Since then we have tformed our red server garrison a full level. Does the mean our garrison on grey is 11 or 12?</blockquote>
</div>
</li><li class="postbit blockbody" id="post_4">
<div class="header">
<div class="datetime">04-10-2014, 08:09 AM</div>
<span class="username">CaNc3r</span>
</div>
<div class="content">
<blockquote class="restore">anyone having login issues after reset?</blockquote>
</div>
</li><li class="postbit blockbody" id="post_5">
<div class="header">
<div class="datetime">04-10-2014, 08:25 AM</div>
<span class="username">replicatorz</span>
</div>
<div class="content">
<blockquote class="restore">Never mind. I reread login screen. Question answered.</blockquote>
</div>
</li><li class="postbit blockbody" id="post_6">
<div class="header">
<div class="datetime">04-10-2014, 08:50 AM</div>
<span class="username">Ozymandias</span>
</div>
<div class="content">
<blockquote class="restore">If the original Feb 10th duplicate was PURGED (entirely deleted), or if it never exited (post Feb 10th), it was re-duplicated today.<br />
<br />
If it is being used on the new server, there was no re-duplication. It has always existed there.<br />
<br />
You can type :TRANSFER to see what corporation you would transfer into.</blockquote>
</div>
</li><li class="postbit blockbody" id="post_7">
<div class="header">
<div class="datetime">04-10-2014, 09:10 AM</div>
<span class="username">Kolpo</span>
</div>
<div class="content">
<blockquote class="restore">What if I tried to transfer a corp after feb 10th and it's dissapeared is there a way for me to get that back?</blockquote>
</div>
</li><li class="postbit blockbody" id="post_8">
<div class="header">
<div class="datetime">04-10-2014, 09:11 AM</div>
<span class="username">Ozymandias</span>
</div>
<div class="content">
<blockquote class="restore"><a href="http://www.whalesonggames.com/forums/showthread.php?7497-Red-Blue-Green-Corporations-copied-to-Grey" target="_blank">http://www.whalesonggames.com/forums...copied-to-Grey</a></blockquote>
</div>
</li><li class="postbit blockbody" id="post_9">
<div class="header">
<div class="datetime">04-10-2014, 09:12 AM</div>
<span class="username">Ozymandias</span>
</div>
<div class="content">
<blockquote class="restore"><div class="bbcode_container">
<div class="bbcode_description">Quote:</div>
<div class="bbcode_quote printable">
<hr />
<div>
Originally Posted by <strong>Kolpo</strong>
<a href="showthread.php?p=122005#post122005" rel="nofollow"><img class="inlineimg" src="images/buttons/viewpost.gif" alt="View Post" /></a>
</div>
<div class="message">What if I tried to transfer a corp after feb 10th and it's dissapeared is there a way for me to get that back?</div>
<hr />
</div>
</div>If it existed on the old servers still, it was duplicated today. Otherwise there's not much we can do.</blockquote>
</div>
</li>
</ul>
</div>
<div class="below_body">
<div id="footer_time" class="footer_time">All times are GMT -7. The time now is <span class="time">07:20 PM</span>.</div>
<div id="footer_copyright" class="footer_copyright">
<!-- Do not remove this copyright notice -->
Powered by <a href="https://www.vbulletin.com" id="vbulletinlink">vBulletin®</a> Version 4.2.2 <br />Copyright © 2014 vBulletin Solutions, Inc. All rights reserved.
<!-- Do not remove this copyright notice -->
</div>
<div id="footer_morecopyright" class="footer_morecopyright">
<!-- Do not remove cronimage or your scheduled tasks will cease to function -->
<!-- Do not remove cronimage or your scheduled tasks will cease to function -->
</div>
</div>
</body>
</html>
This is what Google Chrome shows when I go to View > Developer > View Source. However, when I run the Java code above, I get this instead:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en" id="vbulletin_html">
<head>
<meta charset="ISO-8859-1" />
<meta id="e_vb_meta_bburl" name="vb_meta_bburl" content="http://www.whalesonggames.com/oldforums" />
<base href="http://www.whalesonggames.com/oldforums/" />
<meta name="generator" content="vBulletin 4.2.2" />
<meta name="viewport" content="width=device-width, minimum-scale=1, maximum-scale=1">
<meta name="keywords" content="android,infinite black,mmo,whalesong" />
<meta name="description" content="Whalesong Games - Support, Wiki & Forums" />
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.6.4/jquery.min.js"></script>
<script type="text/javascript">
<!--
if (typeof jQuery === 'undefined') // Load jQuery Local
{
document.write('<script type="text/javascript" src="clientscript/jquery/jquery-1.6.4.min.js"><\/script>');
var remotejquery = false;
}
else // Load Rest of jquery remotely (where possible)
{
var remotejquery = true;
}
var SESSIONURL = "s=0f57ff6a3b879742a4f67d0cfea40613&";
var SECURITYTOKEN = "guest";
var IMGDIR_MISC = "images/misc";
var IMGDIR_BUTTON = "images/buttons";
var IMGDIR_MOBILE = "images/mobile";
var vb_disable_ajax = parseInt("0", 10);
var SIMPLEVERSION = "422";
var BBURL = "http://www.whalesonggames.com/oldforums";
var LOGGEDIN = 0 > 0 ? true : false;
var THIS_SCRIPT = "printthread";
var RELPATH = "printthread.php?t=7495&pp=20&page=1";
var USER_STYLEID = "1";
var MOBILE_STYLEID = "2";
var MOBILE_STYLEID_ADV = "2";
var USER_DEFAULT_STYLE_TYPE = "standard";
// -->
</script>
<script type="text/javascript" src="http://www.whalesonggames.com/oldforums/clientscript/vbulletin-mobile-init.js?v=422"></script>
<script type="text/javascript" src="http://www.whalesonggames.com/oldforums/clientscript/jquery/jquery.mobile-1.0.vb.js?v=422"></script>
<script type="text/javascript" src="http://www.whalesonggames.com/oldforums/clientscript/vbulletin-mobile.js?v=422"></script>
<link rel="stylesheet" href="clientscript/jquery/jquery.mobile-1.0.min.css?v=422" />
<link rel="stylesheet" type="text/css" href="css.php?styleid=2&langid=1&d=1381351020&td=ltr&sheet=bbcode.css,editor.css,popupmenu.css,reset-fonts.css,vbulletin.css,vbulletin-chrome.css,vbulletin-formcontrols.css," />
<title>The Infinite Black Forums</title>
</head>
<body>
<div data-role="page" data-theme="d" id="page-home">
<div id="header">
<div id="header-left">
<a href="forum.php?s=0f57ff6a3b879742a4f67d0cfea40613" class="logo-image" rel="external"><img src="images/mobile/vbulletin-logo.png" alt="The Infinite Black Forums - Powered by vBulletin" /></a>
</div>
<div id="header-right">
<a href="mobile.php?s=0f57ff6a3b879742a4f67d0cfea40613&do=login" class="headericon" rel="external"><img src="images/mobile/login.png" /></a>
<a href="mobile.php?s=0f57ff6a3b879742a4f67d0cfea40613&do=gridmenu" class="headericon"><img src="images/mobile/gridmenu.png" /></a>
<a href="search.php?s=0f57ff6a3b879742a4f67d0cfea40613&search_type=1&contenttype=vBForum_Post" class="headericon" rel="external"><img src="images/mobile/search.png" /></a> <a href="http://www.whalesonggames.com/community/tib/leaderboards/" class="headericon"><img src="images/mobile/merch.png" /></a>
<a href="https://www.theinfiniteblack.com/blackdollars/" class="headericon"><img src="images/mobile/bd.png" /></a>
</div>
</div>
<div id="pagetitle" class="pagetitle ui-bar-b">
<h1 class="pagetitle">vBulletin Message</h1>
</div>
<div data-role="content">
<div class="ui-body ui-body-e">We are sorry, this content is not supported via the mobile style. <br /><a href="forum.php?s=0f57ff6a3b879742a4f67d0cfea40613" rel="external">Click Here to go to the Forum Homepage</a>.</div>
</div>
<div id="footer">
<ul id="footer_links">
<li class="first"><a href="mobile.php?s=0f57ff6a3b879742a4f67d0cfea40613&do=login">Log in</a></li>
<li><a href="register.php?s=0f57ff6a3b879742a4f67d0cfea40613" rel="external">Register</a></li>
<li><a href="forum.php?styleid=1" class="fullsitelink" rel="external">Full Site</a></li>
<li class="last"><a href="#top" class="scrolltop" rel="external">Top</a></li>
</ul>
<div id="footer_copyright" class="shade footer_copyright">
<!-- Do not remove this copyright notice -->
Powered by <a href="https://www.vbulletin.com" id="vbulletinlink">vBulletin®</a> Version 4.2.2 <br />Copyright © 2014 vBulletin Solutions, Inc. All rights reserved.
<!-- Do not remove this copyright notice -->
</div>
<div id="footer_morecopyright" class="shade footer_morecopyright">
<!-- Do not remove cronimage or your scheduled tasks will cease to function -->
<img src="http://www.whalesonggames.com/oldforums/cron.php?s=0f57ff6a3b879742a4f67d0cfea40613&rand=1397183042" alt="" width="1" height="1" border="0" />
<!-- Do not remove cronimage or your scheduled tasks will cease to function -->
</div>
</div>
</div><!-- data-role="page" -->
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-36823542-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</body>
</html>
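This is not what I want. Bear in mind that I know very little about web languages and how they work, but I think I've worked out that when the browser loads the page, the second HTML snippet "generates" the first one. If that's wrong, please correct me. In any case, is there a way to retrieve the "final version" of the HTML before it is displayed to the user in the browser?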
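Answer 0 (score: 1)
It looks like the site you're trying to open doesn't recognize the default user agent, so it falls back to its mobile style (hence the "not supported via the mobile style" message in your output).
Try adding something like this before you construct the URL object: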
System.setProperty("http.agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0");
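If you would rather not change a JVM-wide setting, a per-connection variant is also possible. The following is a minimal sketch, not a tested fix for this particular site: it sets the `User-Agent` header on a single `URLConnection` before reading from it. The class name `UserAgentFetch` and the helpers `prepare` and `fetch` are made up for illustration, and decoding the body as ISO-8859-1 is an assumption based on the charset the page declares in its own markup.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
class UserAgentFetch {
    // Browser-like User-Agent string, copied from the answer above.
    public static final String USER_AGENT =
        "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0";

    // Creates a connection object with the User-Agent header set.
    // Nothing is sent over the network until the stream is opened.
    public static URLConnection prepare(String address) throws IOException {
        URLConnection conn = URI.create(address).toURL().openConnection();
        conn.setRequestProperty("User-Agent", USER_AGENT);
        return conn;
    }

    // Opens the connection and reads the whole response body as text.
    // Assumption: the server sends ISO-8859-1, as the page's meta tag says.
    public static String fetch(String address) throws IOException {
        URLConnection conn = prepare(address);
        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.ISO_8859_1))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            // No URL given: just show the header that would be sent.
            System.out.println("User-Agent: " + USER_AGENT);
            return;
        }
        System.out.print(fetch(args[0]));
    }
}
```

Run it with the thread URL from the question as the only argument. The header set this way applies to that one request only, whereas `System.setProperty("http.agent", ...)` affects HTTP connections opened anywhere in the JVM after it is set.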