I am writing a program to extract the data between paragraph tags.
<h2>User Reviews</h2>
<div class="user-comments">
<div class="tinystarbar" title="2/10">
<div style="width: 20px;"> </div>
</div>
<span itemprop="review" itemscope="" itemtype="http://schema.org/Review">
<strong itemprop="name">Terrible movie</strong>
<span itemprop="reviewRating" itemscope="" itemtype="http://schema.org/Rating">
<meta itemprop="worstRating" content="1">
<meta itemprop="ratingValue" content="2">
<meta itemprop="bestRating" content="10">
</span>
<div class="comment-meta">
22 December 2013 | by <a href="/user/ur49033470/?ref_=tt_urv"><span itemprop="author">sarconus</span></a>
<meta itemprop="datePublished" content="2013-12-22">
– <a href="/user/ur49033470/comments?ref_=tt_urv">See all my reviews</a>
</div>
<div>
<p itemprop="reviewBody">This was one of the worst movies I have watched in quite sometime.The fist movie was fantastic and I still quote it to this day...<br><br>Sadly they played the dumb card the entire movie. Only funny parts were raciest. They couldn't make up their mind on what they wanted to do with this movie and brought in elements from the first that shouldn't have been touched.<br><br>Sorry this was a waste of time and money. The first movie will forever live in glory but this one will pass away.<br><br>If you loved the fist movie I would recommend waiting for DVD or just pass this one.</p>
</div>
</span>
<hr>
<div class="yn" id="ynd_2926802">
37 of 66 people found this review helpful.
Was this review helpful to you?
<button class="btn small" value="Yes" name="ynb_2926802_yes" onclick="CS.TMD.user_review_vote(2926802, 'tt1229340', 'yes');">Yes</button>
<button class="btn small" value="No" name="ynb_2926802_no" onclick="CS.TMD.user_review_vote(2926802, 'tt1229340', 'no');">No</button>
</div>
<div class="see-more">
<a href="/title/tt1229340/reviews-enter?ref_=tt_urv" rel="login" class="cboxElement">Review this title</a>
<span>|</span>
<a href="/title/tt1229340/reviews?ref_=tt_urv">See all 212 user reviews</a> »
</div>
</div>
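When I execute JavaScript on the HTML above, all I get back is [object list]. How can I get the data into a variable, using the Awesomium plugin as the web browser?
I am using document.getElementsByTagName("P"); for the extraction.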
Answer 0: (score: 0)
var paragraphs = document.getElementsByTagName("p");
for (var i = 0; i < paragraphs.length; i++) {
    var p = paragraphs[i];   // this is the DOM element
    var text = p.innerText;  // this is the text inside the <p></p> tags
    console.log(text);
}
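The [object list] result in the question is most likely the element collection itself being converted to a string, since getElementsByTagName returns a collection of elements rather than their text. One way around that is to have the script return a plain string. The following is only a minimal sketch: the function name collectReviewText is hypothetical, the selector is taken from the itemprop="reviewBody" attribute in the sample HTML, and it assumes the host application can receive a string result from the script. The Awesomium-side call is not shown.

```javascript
// Hypothetical helper: gather the text of every review paragraph and
// return it as a single JSON string instead of a collection of elements.
// `doc` is any object exposing querySelectorAll, normally the page's `document`.
function collectReviewText(doc) {
    var nodes = doc.querySelectorAll('p[itemprop="reviewBody"]');
    var texts = [];
    for (var i = 0; i < nodes.length; i++) {
        // textContent returns the text only; <br> tags are dropped.
        texts.push(nodes[i].textContent.trim());
    }
    // A string survives the trip back to the host; a DOM collection may not.
    return JSON.stringify(texts);
}
```

In the page you would call collectReviewText(document) and parse the returned JSON on the host side. Note that textContent joins the text on either side of a <br> without a separator, so use innerText instead if the line breaks matter.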
Answer 1: (score: 0)
If you are using Dojo, you can do this:
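// Assumes domConstruct is the loaded "dojo/dom-construct" module.
function stripTags(str) {
    // Parse the markup into a detached <div> and read back only its text.
    return domConstruct.create("div", { innerHTML: str }).textContent;
}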
You can do something similar with other frameworks, if they provide an equivalent helper.