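I am trying to parse some text from a website using jsoup, but unfortunately the <div> that holds it has no class name. I am just learning jsoup, and I don't know which jsoup function will help me get the text from that <div>.
Example: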
<div>
....
...
.....
</div>
Right now I can only get the text of a <div> by using its class name.
Code:
document= Jsoup.connect(url).get();
Elements element = document.select("div[class=pandora]");
openBox = element.text();
HTML from jsoup.org:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="Lyrics to "Nuh Ready Nuh Ready" song by Calvin Harris: Mi and di mandem We haffi run from half of di gyal dem So sweet, so sweet Don't want mi children and...">
<meta name="keywords" content="Nuh Ready Nuh Ready lyrics, Calvin Harris Nuh Ready Nuh Ready lyrics, Calvin Harris lyrics">
<meta name="robots" content="noarchive">
<meta property="og:image" content="//www.azlyrics.com/az_logo_tr.png">
<title>Calvin Harris Lyrics - Nuh Ready Nuh Ready</title>
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/css/bootstrap.min.css">
<link rel="stylesheet" href="//www.azlyrics.com/bsaz.css">
<!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
<script type="text/javascript">
ArtistName = "Calvin Harris";
SongName = "Nuh Ready Nuh Ready";
function submitCorrections(){
document.getElementById('corlyr').submit();
return false;
}
</script>
</head>
<body>
<!-- Begin comScore Tag -->
<script>
var _comscore = _comscore || [];
_comscore.push({ c1: "2", c2: "6772046" });
(function() {
var s = document.createElement("script"), el = document.getElementsByTagName("script")[0]; s.async = true;
s.src = (document.location.protocol == "https:" ? "https://sb" : "http://b") + ".scorecardresearch.com/beacon.js";
el.parentNode.insertBefore(s, el);
})();
</script>
<noscript>
<img src="https://sb.scorecardresearch.com/p?c1=2&c2=6772046&cv=2.0&cj=1" alt="">
</noscript>
<!-- End comScore Tag -->
<div id="fb-root"></div>
<script>(function(d, s, id) {
var js, fjs = d.getElementsByTagName(s)[0];
if (d.getElementById(id)) return;
js = d.createElement(s); js.id = id;
js.src = "//connect.facebook.net/en_US/sdk.js#xfbml=1&version=v2.3";
fjs.parentNode.insertBefore(js, fjs);
}(document, 'script', 'facebook-jssdk'));</script>
<nav class="navbar navbar-default navbar-static-top noprint">
<div class="container">
<!-- Brand and toggle get grouped for better mobile display -->
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#search-collapse">
<span class="glyphicon glyphicon-search"></span>
</button>
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#artists-collapse">
<span class="glyphicon glyphicon-th-list"></span>
</button>
<a class="navbar-brand" href="//www.azlyrics.com"><img alt="AZLyrics.com" class="pull-left" style="max-height:40px; margin-top:-10px;" src="//www.azlyrics.com/az_logo_tr.png"></a>
</div>
<ul class="collapse navbar-collapse nav navbar-nav" id="artists-collapse">
<li>
<div class="btn-group text-center" role="group">
<a class="btn btn-menu" href="//www.azlyrics.com/a.html">A</a>
<a class="btn btn-menu" href="//www.azlyrics.com/b.html">B</a>
<a class="btn btn-menu" href="//www.azlyrics.com/c.html">C</a>
<a class="btn btn-menu" href="//www.azlyrics.com/d.html">D</a>
<a class="btn btn-menu" href="//www.azlyrics.com/e.html">E</a>
<a class="btn btn-menu" href="//www.azlyrics.com/f.html">F</a>
<a class="btn btn-menu" href="//www.azlyrics.com/g.html">G</a>
<a class="btn btn-menu" href="//www.azlyrics.com/h.html">H</a>
<a class="btn btn-menu" href="//www.azlyrics.com/i.html">I</a>
<a class="btn btn-menu" href="//www.azlyrics.com/j.html">J</a>
<a class="btn btn-menu" href="//www.azlyrics.com/k.html">K</a>
<a class="btn btn-menu" href="//www.azlyrics.com/l.html">L</a>
<a class="btn btn-menu" href="//www.azlyrics.com/m.html">M</a>
<a class="btn btn-menu" href="//www.azlyrics.com/n.html">N</a>
<a class="btn btn-menu" href="//www.azlyrics.com/o.html">O</a>
<a class="btn btn-menu" href="//www.azlyrics.com/p.html">P</a>
<a class="btn btn-menu" href="//www.azlyrics.com/q.html">Q</a>
<a class="btn btn-menu" href="//www.azlyrics.com/r.html">R</a>
<a class="btn btn-menu" href="//www.azlyrics.com/s.html">S</a>
<a class="btn btn-menu" href="//www.azlyrics.com/t.html">T</a>
<a class="btn btn-menu" href="//www.azlyrics.com/u.html">U</a>
<a class="btn btn-menu" href="//www.azlyrics.com/v.html">V</a>
<a class="btn btn-menu" href="//www.azlyrics.com/w.html">W</a>
<a class="btn btn-menu" href="//www.azlyrics.com/x.html">X</a>
<a class="btn btn-menu" href="//www.azlyrics.com/y.html">Y</a>
<a class="btn btn-menu" href="//www.azlyrics.com/z.html">Z</a>
<a class="btn btn-menu" href="//www.azlyrics.com/19.html">#</a>
</div>
</li>
</ul>
<div class="collapse navbar-collapse" id="search-collapse">
<form class="navbar-form navbar-right search" method="get" action="//search.azlyrics.com/search.php" role="search">
<div class="input-group">
<input type="text" class="form-control" placeholder="" name="q" id="q">
<span class="input-group-btn">
<button class="btn btn-primary" type="submit"><span class="glyphicon glyphicon-search"></span> Search</button>
</span>
</div>
</form>
</div><!-- /.navbar-collapse -->
</div><!-- /.container -->
</nav>
<!-- top ban -->
<div class="lboard-wrap noprint">
<div class="container">
<div class="row">
<div class="col-xs-12 top-ad text-center">
<span id="cf_banner_top_nofc"></span>
</div>
</div>
</div>
</div>
<!-- main -->
<div class="container main-page">
<div class="row">
<div class="col-lg-2 text-center hidden-md hidden-sm hidden-xs noprint">
<div class="sky-ad"></div>
</div>
<!-- content -->
<div class="col-xs-12 col-lg-8 text-center">
<div class="div-share noprint">
<div class="fb-like" style="float:left;" data-href="https://www.azlyrics.com/lyrics/calvinharris/nuhreadynuhready.html" data-layout="button_count" data-action="like" data-show-faces="false" data-share="false"></div>
<!-- AddThis Button BEGIN -->
<script type="text/javascript" src="https://s7.addthis.com/js/300/addthis_widget.js#username=azlyrics"></script>
<div class="addthis_toolbox addthis_default_style" style="float:right;">
<a class="btn btn-xs btn-share addthis_button_email">
<span class="playblk"><img src="//www.azlyrics.com/images/email.svg" width="56" height="18" class="playblk" alt="Email"></span>
</a>
<a class="btn btn-xs btn-share addthis_button_print" style="margin-right: 0px !important;">
<span class="playblk"><img src="//www.azlyrics.com/images/print.svg" width="56" height="18" class="playblk" alt="Print"></span>
</a>
</div>
</div>
<!-- AddThis Button END -->
<div class="div-share"><h1>"Nuh Ready Nuh Ready" lyrics</h1></div>
<div class="lyricsh">
<h2><b>Calvin Harris Lyrics</b></h2>
</div>
<div class="ringtone">
<span id="cf_text_top"></span>
</div>
<b>"Nuh Ready Nuh Ready"</b><br>
<span class="feat">(feat. PARTYNEXTDOOR)</span><br>
<br>
<div>
<!-- Usage of azlyrics.com content by any third-party lyrics provider is prohibited by our licensing agreement. Sorry about that. -->
Mi and di mandem<br>
We haffi run from half of di gyal dem<br>
So sweet, so sweet<br>
Don't want mi children and ting'<br>
Mi nuh ready fi all dem tings<br>
So sweet, you're so sweet, yeah<br>
Yeah, mi nuh ready fi all dem things yet<br>
So sweet, so sweet, yeah<br>
Yeah, I'm not ready fi all dem tings yet<br>
I'm not ready fi all dem tings yet<br>
<br>
She call me kid, kid, kid<br>
My mama kiss her kid<br>
She say mi tooth-tooth sweet<br>
She say mi tooth-tooth sweet<br>
Don't make me feel like I love you<br>
Just 'cause I thought you was special<br>
Won't make me feel like I love you<br>
Baby, girl, I won't settle<br>
I had dreams of fuckin' the baddest bitch<br>
Last night I awoke up and I fucked the baddest bitch<br>
I thought I would be ready when I seen her<br>
When I was in the disco<br>
I gotta keep it honest<br>
Keep it real with you<br>
<br>
Mi and di mandem<br>
We haffi run from half of di gyal dem<br>
So sweet, so sweet<br>
Don't want mi children and tings<br>
Mi nuh ready fi all dem tings<br>
So sweet, you're so sweet<br>
Mi nuh ready fi all dem tings yet<br>
So sweet, so sweet<br>
Mi and di mandem<br>
We haffi run from half of di gyal dem<br>
So sweet, you're so sweet<br>
Don't want mi children and tings<br>
Mi nuh ready fi all dem tings<br>
So sweet, you're so sweet<br>
Mi nuh ready fi all dem tings<br>
So sweet, so sweet<br>
<br>
I strapped up 'cause they mapped up<br>
'Cause I need to know where you are<br>
Can't keep following these signs<br>
'Cause you're lookin' for a sign, and I can't give you one<br>
Start to feel like it's mad love<br>
That's givin' your attraction, to me<br>
Yeah, I just want you, nobody else, baby<br>
I don't wanna get too far<br>
It's just you that I want<br>
<br>
When it's mi and di mandem<br>
We haffi run from half of di gyal dem<br>
So sweet, so sweet<br>
Don't want mi children and tings<br>
Mi nuh ready fi all dem tings<br>
So sweet, you're so sweet<br>
Mi nuh ready fi all dem tings yet<br>
So sweet, so sweet<br>
Mi and di mandem<br>
We haffi run from half of di gyal dem<br>
So sweet, so sweet<br>
Don't want mi children and tings<br>
Mi nuh ready fi all dem tings<br>
So sweet, you're so sweet<br>
Mi nuh ready fi all dem tings
</div>
<br><br>
<!-- MxM banner -->
<div class="noprint">
<script>
if ( /Android|webOS|iPhone|iPod|iPad|BlackBerry|IEMobile|Opera Mini/i.test(navigator.userAgent) )
{
document.write('<div style="margin-left: auto; margin-right: auto;">'+
'<iframe scrolling="no" style="border: 0px none; overflow:hidden;" src="//adv.mxmcdn.net/br/t1.0/m_js/e_0/sn_0/l_17494554/su_0/tr_3vUCAOZlq_zEKGGqiwqgUipktnY4AJ8vdMlDERwd-IQW1fCzlbIik50-scymuRv_pi3wUAIxUI2AiwodRggYSWyWKe5520YE8tdDBkiBtPeafB1eU4jsrx-cHUKKrQnbpH1kEJ6cxCXNRK21S-URGe9hKl3IVQsjUfAjAGzo670kV-_NZoBHp8gEZ5eOQESUhj_qd_IMSEvXm2euf-p8Ih6vduevXpBlMcIEAKI3kCxKguw10zJEFpaF8yFsaYWxPJ04Xubjxi6nlSUBsg_Tr8m9oMC4dgrbSjSYIrAWyJz1IIVbLSkQUGxPFTsbNsL_-bnudnLQaUE_eaP3nAsOaQdHURbAr7wki_hHoAjXgZpE4VF7MLao4sJEJ4jJaHu9IhQphsYTZfU6HCHDQhcz3lF_zned3kiL-MhHIP8j0K_ktF3poJHjI5u9L-cJHNywsz-sadxqsZMdqBf1jMraRS68zUYcTR9L15oyvk54l_erv80gD-ns/" width="290px" height="50px"></iframe>'+
'</div>');
}
</script>
<br><br>
</div>
<form id="addsong" style="visible:hidden; margin:0;" action="../../add.php" method="post">
<input type="hidden" name="what" value="add_song">
<input type="hidden" name="artist" value="Calvin Harris">
</form>
<form action="../../add.php" method="post" id="corlyr">
<input type="hidden" name="what" value="correct_lyrics">
<input type="hidden" name="song_id" value="613870">
</form>
<div class="smt noprint">
<a class="btn btn-share" href="#" onclick="submitCorrections()"><span class="glyphicon glyphicon-pencil"></span> Submit Corrections</a>
</div>
<div class="smt"></div>
<div class="noprint" style="padding: 15px 0">
<span id="cf_text_bottom"></span>
</div>
<!-- credits -->
<div class="smt"></div>
<!-- song facts -->
<!-- artist link -->
<ol class="breadcrumb noprint" itemscope itemtype="https://schema.org/BreadcrumbList">
<li itemprop="itemListElement" itemscope itemtype="https://schema.org/ListItem"><a itemprop="item" href="//www.azlyrics.com"><span itemprop="name">AZLyrics</span></a></li>
<li itemprop="itemListElement" itemscope itemtype="https://schema.org/ListItem"><a itemprop="item" href="//www.azlyrics.com/c.html"><span itemprop="name">C</span></a></li>
<li itemprop="itemListElement" itemscope itemtype="https://schema.org/ListItem"><a itemprop="item" href="//www.azlyrics.com/c/calvinharris.html"><span itemprop="name">Calvin Harris Lyrics</span></a></li>
</ol>
<!-- album songlists -->
<!-- album songlists end -->
<form class="search noprint" method="get" action="//search.azlyrics.com/search.php" role="search">
<div style="margin-bottom:15px" class="input-group">
<input type="text" class="form-control" placeholder="" name="q">
<span class="input-group-btn">
<button class="btn btn-primary" type="submit"><span class="glyphicon glyphicon-search"></span> Search</button>
</span>
</div>
</form>
<div class="noprint visible-xs-block" style="margin-top:5px;margin-bottom:5px">
<span id="cf_rect_bottom"></span>
</div>
</div> <!-- content -->
<div class="col-lg-2 text-center hidden-md hidden-sm hidden-xs noprint">
<div class="sky-ad"></div>
</div>
</div>
</div> <!-- container main-page -->
<!-- nav bottom -->
<nav class="navbar navbar-default navbar-bottom">
<div class="container text-center">
<ul class="nav navbar-nav navbar-center">
<li><a href="//www.azlyrics.com/add.php" onclick="document.forms['addsong'].submit();return false;">Submit Lyrics</a></li>
<li><a href="//www.stlyrics.com">Soundtracks</a></li>
<li><a href="//www.facebook.com/pages/AZLyricscom/154139197951223">Facebook</a></li>
<li><a href="//www.azlyrics.com/contact.html">Contact Us</a></li>
</ul>
</div>
</nav>
<!-- bot ban -->
<div class="lboard-wrap noprint">
<div class="container">
<div class="row">
<div class="col-xs-12 top-ad text-center">
<span id="cf_banner_bottom"></span>
</div>
</div>
</div>
</div>
<!-- footer -->
<nav class="navbar navbar-footer noprint">
<div class="container text-center">
<ul class="nav navbar-nav navbar-center">
<li><a href="//www.azlyrics.com/adv.html">Advertise Here</a></li>
<li><a href="//www.azlyrics.com/privacy.html">Privacy Policy</a></li>
<li><a href="//www.azlyrics.com/copyright.html">DMCA Policy</a></li>
</ul>
</div>
</nav>
<div class="footer-wrap">
<div class="container">
<div class="noprint"><span style="font-weight:bold;line-height:54px;vertical-align:top;">Powered by </span><img src="//www.azlyrics.com/images/mxm.png" width="184" height="54" alt="MusixMatch"></div>
<small>
Calvin Harris lyrics are property and copyright of their owners. "Nuh Ready Nuh Ready" lyrics provided for educational purposes and personal use only.<br>
<script type="text/javascript">
curdate=new Date();
document.write("<strong>Copyright © 2000-"+curdate.getFullYear()+" AZLyrics.com<\/strong>");
</script>
</small>
</div>
</div>
<script>
cf_page_artist = ArtistName;
cf_page_song = SongName;
cf_page_genre = "pop";
</script>
<script src="//cdn.clickfuse.com/publishers/azlyrics/single.min.js"></script>
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-4309237-1']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
<div id="CssFailCheck" class="hidden" style="height:1px;"></div>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.2/jquery.min.js"></script>
<script>window.jQuery || document.write('<script src="//www.azlyrics.com/local/jquery.min.js"><\/script>')</script>
<script>
$(function () {
if ($('#CssFailCheck').is(':visible') === true) {
$('<link rel="stylesheet" type="text/css" href="//www.azlyrics.com/bs/css/bootstrap.min.css"><link rel="stylesheet" href="//www.azlyrics.com/bsaz.css">').appendTo('head');
}
});
</script>
<script src="//www.azlyrics.com/collapse.js"></script>
<script type="text/javascript" src="https://tracking.musixmatch.com/t1.0/m_js/e_0/sn_0/l_17494554/su_0/tr_3vUCAOZlq_zEKGGqiwqgUipktnY4AJ8vdMlDERwd-IQW1fCzlbIik50-scymuRv_pi3wUAIxUI2AiwodRggYSWyWKe5520YE8tdDBkiBtPeafB1eU4jsrx-cHUKKrQnbpH1kEJ6cxCXNRK21S-URGe9hKl3IVQsjUfAjAGzo670kV-_NZoBHp8gEZ5eOQESUhj_qd_IMSEvXm2euf-p8Ih6vduevXpBlMcIEAKI3kCxKguw10zJEFpaF8yFsaYWxPJ04Xubjxi6nlSUBsg_Tr8m9oMC4dgrbSjSYIrAWyJz1IIVbLSkQUGxPFTsbNsL_-bnudnLQaUE_eaP3nAsOaQdHURbAr7wki_hHoAjXgZpE4VF7MLao4sJEJ4jJaHu9IhQphsYTZfU6HCHDQhcz3lF_zned3kiL-MhHIP8j0K_ktF3poJHjI5u9L-cJHNywsz-sadxqsZMdqBf1jMraRS68zUYcTR9L15oyvk54l_erv80gD-ns/"></script>
</body>
</html>
What should I change to achieve this? Thanks.
Answer 0 (score: 1)
The following code should give you the lyrics in the format you need:
// Get the lyrics div element
Element lyricsDiv = document.select("div.main-page > div.row > div.col-xs-12").select("div").get(7);
// Get the html of the element and replace <br> and comments
String lyrics = lyricsDiv.html().replaceAll("<br>", "").replaceAll("<!--(.*?)-->", "");
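The two `replaceAll` calls above depend on the exact markup that `html()` returns: a literal `<br>` and comments that fit on one line. As a rough sketch only, the same cleanup can be made a little more tolerant with plain `java.util.regex`. The class name `LyricsCleaner` and the method `clean` are hypothetical names introduced for illustration, and the sample string is made up rather than taken from real output.

```java
import java.util.regex.Pattern;

class LyricsCleaner {
    // (?s) lets '.' match newlines, so comments spanning several lines are removed too.
    private static final Pattern COMMENT = Pattern.compile("(?s)<!--.*?-->");
    // Matches <br>, <br/> and <br />, ignoring case.
    private static final Pattern BR = Pattern.compile("(?i)<br\\s*/?>");

    // Hypothetical helper: strips comments and <br> tags from an inner-HTML string.
    static String clean(String innerHtml) {
        String noComments = COMMENT.matcher(innerHtml).replaceAll("");
        String noBreaks = BR.matcher(noComments).replaceAll("");
        return noBreaks.trim();
    }

    public static void main(String[] args) {
        // Made-up sample resembling the lyrics block.
        String sample = "<!-- licensing\nnotice -->\nMi and di mandem<br>\nSo sweet, so sweet<br />";
        System.out.println(clean(sample));
    }
}
```

With this helper, the last line of the answer would become something like `String lyrics = LyricsCleaner.clean(lyricsDiv.html());`.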
Answer 1 (score: 0)
Try this:
Elements main = doc.select("div[class=container main-page]");
Elements row = main.select("div[class=row]");
Elements col = row.select("div[class=col-xs-12 col-lg-8 text-center]");
songMetaDataTextView.setText(Html.fromHtml(col.select("div").get(7).toString()));
You have nested tags:
<div class="container main-page">
<div class="row">
<div class="col-lg-2 text-center hidden-md hidden-sm hidden-xs noprint">
<div class="sky-ad"></div>
</div>
<!-- content -->
<div class="col-xs-12 col-lg-8 text-center">
<div class="div-share noprint">
<div class="fb-like" style="float:left;" data-href="https://www.azlyrics.com/lyrics/calvinharris/nuhreadynuhready.html" data-layout="button_count" data-action="like" data-show-faces="false" data-share="false"></div>
<!-- AddThis Button BEGIN -->
<script type="text/javascript" src="https://s7.addthis.com/js/300/addthis_widget.js#username=azlyrics"></script>
<div class="addthis_toolbox addthis_default_style" style="float:right;">
<a class="btn btn-xs btn-share addthis_button_email">
<span class="playblk"><img src="//www.azlyrics.com/images/email.svg" width="56" height="18" class="playblk" alt="Email"></span>
</a>
<a class="btn btn-xs btn-share addthis_button_print" style="margin-right: 0px !important;">
<span class="playblk"><img src="//www.azlyrics.com/images/print.svg" width="56" height="18" class="playblk" alt="Print"></span>
</a>
</div>
</div>
<!-- AddThis Button END -->
<div class="div-share"><h1>"Nuh Ready Nuh Ready" lyrics</h1></div>
<div class="lyricsh">
<h2><b>Calvin Harris Lyrics</b></h2>
</div>
<div class="ringtone">
<span id="cf_text_top"></span>
</div>
<b>"Nuh Ready Nuh Ready"</b><br>
<span class="feat">(feat. PARTYNEXTDOOR)</span><br>
<br>
<div>
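<!-- your lyrics here -->
First you select the container main-page, then the row, then col-xs-12 col-lg-8 text-center, and finally you use index 7 to get the <div> that holds the text.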
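Index 7 works only as long as the page keeps exactly the same number of <div> elements before the lyrics. As an alternative sketch, and assuming the lyrics block is the only <div> without a class attribute directly under the content column (which holds for the HTML posted in the question, but is not guaranteed for other pages), jsoup's `:not([class])` selector can target it directly. The class `ClasslessDivExample`, the method `lyricLines`, and the sample HTML are hypothetical and exist only for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

import java.util.ArrayList;
import java.util.List;

class ClasslessDivExample {
    // Hypothetical helper: returns the non-empty text lines of the first class-less
    // <div> that is a direct child of the content column, or an empty list if none.
    static List<String> lyricLines(Document document) {
        Element lyricsDiv = document.select("div.col-xs-12 > div:not([class])").first();
        List<String> lines = new ArrayList<>();
        if (lyricsDiv == null) {
            return lines;
        }
        // textNodes() yields only direct text children, so comments and <br> are skipped.
        for (TextNode node : lyricsDiv.textNodes()) {
            String line = node.text().trim();
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up, shortened stand-in for the page in the question.
        String html = "<div class=\"col-xs-12 col-lg-8 text-center\">"
                + "<div class=\"ringtone\"></div>"
                + "<div><!-- notice -->First line<br>Second line<br></div>"
                + "</div>";
        System.out.println(lyricLines(Jsoup.parse(html)));
    }
}
```

For a live page, the document would come from `Jsoup.connect(url).get()` as in the question, and the lines can be joined with `String.join("\n", lines)`.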