I'm trying to extract some data from this link: https://myanimelist.net/anime/season
Specifically, I want the text of each link-image anchor, i.e. from a section like this:
<div class="seasonal-anime js-seasonal-anime"
data-genre="7,42,37"><div>
<div class="title"><a href="https://myanimelist.net/anime/32867/Bungou_Stray_Dogs_2nd_Season/video" class="icon-watch-pv fl-r" title="Watch Promotional Video">Watch Promotional Video</a><p class="title-text">
<a href="https://myanimelist.net/anime/32867/Bungou_Stray_Dogs_2nd_Season" class="link-title">Bungou Stray Dogs 2nd Season</a>
</p>
</div>
<div class="prodsrc">
<span class="producer"><a href="/anime/producer/4/Bones" title="Bones">Bones</a></span>
<div class="eps">
<span id="32867" data-eps="12" class="fl-l icon-add-episode js-btn-add-episode" title="Click to increase your watched ep number by one"></span> <a href="https://myanimelist.net/anime/32867/Bungou_Stray_Dogs_2nd_Season/episode"><span class="js-episode-num">6</span>/<span>12 eps</span>
</a>
</div>
<span class="source">Manga</span>
<a href="https://myanimelist.net/ownlist/anime/32867/edit?hideLayout=1" title="Watching" class="Lightbox_AddEdit button_edit btn-anime-watch-status js-anime-watch-status watching">CW</a>
</div>
<div class="genres js-genre" id="32867">
<div class="genres-inner js-genre-inner"><span class="genre">
<a href="/anime/genre/7/Mystery" title="Mystery">Mystery</a>
</span><span class="genre">
<a href="/anime/genre/42/Seinen" title="Seinen">Seinen</a>
</span><span class="genre">
<a href="/anime/genre/37/Supernatural" title="Supernatural">Supernatural</a>
</span></div>
</div>
</div>
<div class="image lazyload" data-bg="https://myanimelist.cdn-dena.com/images/anime/4/82293.webp">
<a href="https://myanimelist.net/anime/32867/Bungou_Stray_Dogs_2nd_Season" class="link-image">Bungou Stray Dogs 2nd Season</a>
</div>
<div class="synopsis js-synopsis">
<span class="preline">Nakajima Atsushi was kicked out of his orphanage, and now he has no place to go and no food. While he is standing by a river, on the brink of starvation, he rescues a man whimsically attempting suicide. That man is Dazai Osamu, and he and his partner Kunikida are members of a very special detective agency. They have supernatural powers and deal with cases that are too dangerous for the police or the military. They're tracking down a tiger that has appeared in the area recently, around the time Atsushi came to the area. The tiger seems to have a connection to Atsushi, and by the time the case is solved, it is clear that Atsushi's future will involve much more of Dazai and the rest of the detectives!
(Source: MangaHelpers)</span>
<p class="licensors" data-licensors=""></p>
</div>
<div class="information">
<div class="info">
TV -
<span class="remain-time">
Oct 6, 2016, 22:30 (JST) </span>
</div>
<div class="scormem">
<span class="member fl-r" title="Members">
72,011
</span>
<span class="score" title="Score">
8.34
</span>
</div>
</div>
</div>
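I want to get the text "Bungou Stray Dogs 2nd Season". I'd also like to get the data-bg value (https://myanimelist.cdn-dena.com/images/anime/4/82293.webp) and the score (8.34), and I need all three for every show on the page.
I'm not sure what query to run with Jsoup, since I'm still quite new to HTML and don't understand it very well.
Running this code doesn't return anything useful: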
Document doc = Jsoup.connect("https://myanimelist.net/anime/season").get();
Elements shows = doc.select("div:contains(image.lazyload)");
int i = 0;
for(Element show : shows){
System.out.println(i+". "+show.text());
i++;
}
Answer 0 (score: 1)
You just need to follow the selector syntax that Jsoup documents for its select method.
For example, to get the link-image text:
Elements imageLinks = doc.select("a.link-image");
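The other two are similar: a class selector finds the element, and attr() or text() reads the value. I'm sure you can figure them out.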
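For reference, here is a minimal sketch of how all three values could be collected per show. It assumes the page keeps the markup shown in the question (div.seasonal-anime wrapping a.link-image, div.image with a data-bg attribute, and span.score) and a Jsoup version that has selectFirst (1.11.1 or later). The class name SeasonalAnimeScraper and the helper extractShows are made up for illustration, and the inline HTML is a trimmed copy of the question's snippet so the example runs without a network call. Note that the query in the question matched nothing useful because :contains(...) searches an element's text, not its class names; classes are matched with the dot syntax instead.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class SeasonalAnimeScraper {

    // Hypothetical helper (not from the original post): returns one
    // {title, imageUrl, score} triple for each show block in the document.
    static List<String[]> extractShows(Document doc) {
        List<String[]> rows = new ArrayList<>();
        // Each show lives in its own div.seasonal-anime container,
        // so selecting inside it keeps the three values paired up.
        for (Element show : doc.select("div.seasonal-anime")) {
            Element link = show.selectFirst("a.link-image");
            Element image = show.selectFirst("div.image");
            Element score = show.selectFirst("span.score");
            rows.add(new String[] {
                link == null ? "" : link.text(),            // link-image text
                image == null ? "" : image.attr("data-bg"), // image URL
                score == null ? "" : score.text()           // score, whitespace trimmed
            });
        }
        return rows;
    }

    public static void main(String[] args) {
        // Trimmed copy of the markup from the question. To scrape the live
        // page, replace this with Jsoup.connect(url).get() as in the question.
        String html =
            "<div class=\"seasonal-anime js-seasonal-anime\" data-genre=\"7,42,37\">"
          + "<div class=\"image lazyload\" data-bg=\"https://myanimelist.cdn-dena.com/images/anime/4/82293.webp\">"
          + "<a href=\"https://myanimelist.net/anime/32867/Bungou_Stray_Dogs_2nd_Season\" class=\"link-image\">"
          + "Bungou Stray Dogs 2nd Season</a></div>"
          + "<div class=\"information\"><div class=\"scormem\">"
          + "<span class=\"score\" title=\"Score\">\n8.34\n</span>"
          + "</div></div></div>";

        Document doc = Jsoup.parse(html);
        int i = 0;
        for (String[] row : extractShows(doc)) {
            System.out.println(i + ". " + row[0] + " | " + row[1] + " | " + row[2]);
            i++;
        }
    }
}
```

Looping over the container and selecting inside each one, rather than running three separate page-wide queries, means a show with a missing score or image cannot shift the values of the shows after it.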