Jsoup.connect() does not return the correct HTML

Time: 2017-11-22 21:35:22

Tags: java html jsoup

Document doc = Jsoup.connect("https://www.youtube.com/channel/UCAV_q5_FInIBBcOUgQSRHrA/videos").get();

FileWriter fl = new FileWriter(new File("html.txt"));
fl.write(doc.outerHtml());

Part of html.txt:

<!doctype html>
<html invert style="font-size: 10px;font-family: Roboto, Arial, sans-serif; background-color: #fafafa;">
 <head>
  <!-- Origin Trial Token, feature = Long Task Observer, origin = https://www.youtube.com, expires = 2017-04-17 -->
  <meta http-equiv="origin-trial" data-feature="Long Task Observer" data-expires="2017-04-17" content="AgXf9faUpH8YmYNhInb5nw8BxXZaT8pZlj3At6FUrcvdBzs0I8VxKDkfinT4bbXfPZX8lXKfjotQZrhFVnpzFwYAAABZeyJvcmlnaW4iOiJodHRwczovL3d3dy55b3V0dWJlLmNvbTo0NDMiLCJmZWF0dXJlIjoiTG9uZ1Rhc2tPYnNlcnZlciIsImV4cGlyeSI6MTQ5MjQ3MzYwMH0=">
  <script>var ytcfg = {d: function() {return (window.yt && yt.config_) || ytcfg.data_ || (ytcfg.data_ = {});},get: function(k, o) {return (k in ytcfg.d()) ? ytcfg.d()[k] : o;},set: function() {var a = arguments;if (a.length > 1) {ytcfg.d()[a[0]] = a[1];} else {for (var k in a[0]) {ytcfg.d()[k] = a[0][k];}}}};window.ytcfg.set('EMERGENCY_BASE_URL', "\/error_204?t=jserror\u0026level=ERROR\u0026client.version=2.20171121\u0026client.name=1");</script>
  <link rel="shortcut icon" href="/yts/img/favicon-vfl8qSV2F.ico" type="image/x-icon">
  <link rel="icon" href="/yts/img/favicon_32-vflOogEID.png" sizes="32x32">
  <link rel="icon" href="/yts/img/favicon_48-vflVjB_Qk.png" sizes="48x48">
  <link rel="icon" href="/yts/img/favicon_96-vflW9Ec0w.png" sizes="96x96">
  <link rel="icon" href="/yts/img/favicon_144-vfliLAfaB.png" sizes="144x144">
  ...

Only part of the HTML gets written: the file stops somewhere around lines 280-284. If I download the HTML myself and put it in a file, it works fine.
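Two things could plausibly cut the file short, and the snippet above does not show enough to tell which one applies. First, the `FileWriter` is never closed or flushed in the code shown, so characters still sitting in its buffer may never reach `html.txt`. Second, jsoup caps the size of the response body by default; `Connection.maxBodySize(0)` removes that cap. The sketch below only illustrates the first point with the standard library, so it runs without jsoup. The class name `SaveHtml`, the helper `save`, and the demo string are made up for illustration.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class SaveHtml {
    // Hypothetical helper: writes the whole string to the target file.
    // try-with-resources closes the writer, and closing flushes any
    // buffered characters to disk before the method returns.
    static void save(String html, Path target) throws IOException {
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            out.write(html);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: with jsoup on the classpath, the string would come from
        // something like
        //   Jsoup.connect(url).maxBodySize(0).get().outerHtml()
        // where maxBodySize(0) lifts jsoup's default limit on the body size.
        String html = "<!doctype html><html><head></head><body>demo</body></html>";
        Path target = Paths.get("html.txt");
        save(html, target);
        System.out.println("wrote " + Files.readAllBytes(target).length + " bytes");
    }
}
```

If the file is still truncated after the writer is closed properly, the body size limit is the next thing worth checking.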

0 answers:

No answers