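I want to extract content from a <dd> tag, specifically the contents of the p tag and the ul tag. In PHP I tried using preg_match_all to fetch everything inside the <dd> on this HTML page, but it did not return anything. This is my HTML code: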
<dd style="display: block;">
<p>Lightweight, comfy and cool - the dressy shirt he won\'t mind wearing!</p>
<ul>
<li>Made of 100% cotton</li>
<li>Specially treated for a soft feel</li>
<li>Classically styled with a pointed collar and button front</li>
<li>Chest pocket; curved shirttail hem</li>
<li>Canvas taping at inner neck</li>
<li>Imported</li>
</ul>
<div id="BVSecondaryCustomerRatings" style="display:none;margin-left: 15px" class="BVBrowserWebkit"> <div class="BVRRRootElement">
<div class="BVRRRatingSummary BVRRSecondaryRatingSummary">
<div class="BVRRRatingSummary BVRRPrimaryRatingSummary"><div class="BVRRRatingSummaryStyle2"><div class="BVRRRatingSummaryNoReviews"> <div id="BVRRRatingSummaryNoReviewsWriteImageLinkID" class="BVRRRatingSummaryLink BVRRRatingSummaryNoReviewsWriteImageLink">
<a name="BV_TrackingTag_Rating_Summary_2_WriteReview_I2613L0022" target="BVFrame" href="http://reviews.childrensplace.com/4154/I2613L0022/writereview.htm?format=embedded&campaignid=BV_RATING_SUMMARY_ZERO_REVIEWS&sessionparams=__BVSESSIONPARAMS__&return=http%3A%2F%2Fwww.childrensplace.com%2Fwebapp%2Fwcs%2Fstores%2Fservlet%2Fproduct_10001_10001_-1_1005476_827676_26601%257C72469%257C813599_boy%257Coutfits%257Cplaid%2520patrol_boy&innerreturn=http%3A%2F%2Freviews.childrensplace.com%2F4154%2FI2613L0022%2Freviews.htm%3Fformat%3Dembedded&user=__USERID__&authsourcetype=__AUTHTYPE__&submissionparams=__BVSUBMISSIONPARAMETERS__&submissionurl=http%3A%2F%2Fwww.childrensplace.com%2Fwebapp%2Fwcs%2Fstores%2Fservlet%2FTCPCheckUserAuthenticationCmd%3FlangId%3D-1%26catalogId%3D10001%26storeId%3D10001"> <img src="http://reviews.childrensplace.com/static/4154/translucent.gif" alt="Write a review">
</a> </div>
<div id="BVRRRatingSummaryLinkWriteFirstID" class="BVRRRatingSummaryLink BVRRRatingSummaryLinkWriteFirst">
<span class="BVRRRatingSummaryLinkWriteFirstPrefix">Be the first to review this item.</span>
<a name="BV_TrackingTag_Rating_Summary_2_SocialBookmarkKaboodle_I2613L0022" target="_blank" class="BVRRSocialBookmarkingSharingLink BVRRSocialBookmarkingSharingLinkKaboodle" onclick="this.href=bvReplaceTokensInSocialURL(this.href);window.open(this.href,'','left=0,top=0,width=795,height=700,toolbar=1,location=0,resizable=1,scrollbars=1'); return false;" onfocus="this.href=bvReplaceTokensInSocialURL(this.href);" rel="nofollow" href="http://reviews.childrensplace.com/4154/share.htm?site=Kaboodle&url=http%3A%2F%2Fwww.childrensplace.com%2Fwebapp%2Fwcs%2Fstores%2Fservlet%2Fproduct_10001_10001_-1_1005476&title=__TITLE__&robot=__ROBOT__&image=http%3A%2F%2Fcontent.childrensplace.com%2Fwww%2Fb%2FTCP%2Fimages%2Fstyles%2F188410_m.jpg" onmouseover="this.href=bvReplaceTokensInSocialURL(this.href);"><img width="16" height="16" class="BVRRSocialBookmarkLinkImage" src="http://reviews.childrensplace.com/static/4154/link-kaboodle.gif" alt="Kaboodle" title="Add To Kaboodle"></a>
</div></div></div></div> </div>
</div>
<p class="TCP-Phrase">Big Fashion, Little Prices</p>
<div id="product_social_icons" style="height: 20px;">
<div class="social_icon current_social">
<div class="twitter"><iframe scrolling="no" frameborder="0" allowtransparency="true" src="http://platform.twitter.com/widgets/tweet_button.1336551279.html#_=1336767195241&count=horizontal&id=twitter-widget-0&lang=en&original_referer=http://www.childrensplace.com/webapp/wcs/stores/servlet/product_10001_10001_-1_1005476&size=m&text=The Childrens Place - plaid shirt&url=http://www.childrensplace.com/webapp/wcs/stores/servlet/product_10001_10001_-1_1005476" class="twitter-share-button twitter-count-horizontal" style="height: 20px; width: 90px;" title="Twitter Tweet Button"></iframe></div>
<div class="pinterest" id="pin_it">
<iframe scrolling="no" frameborder="0" src="http://pinit-cdn.pinterest.com/pinit.html?url=http://www.childrensplace.com/webapp/wcs/stores/servlet/product_10001_10001_-1_1005476&media=//content.childrensplace.com/www/b/TCP/images/cloudzoom/p/188410_p.jpg&description=plaid shirt&layout=horizontal" style="border: medium none; width: 90px; height: 20px;"></iframe>
</div>
<div class="fb-like-btn" id="fb-root">
<script src="//connect.facebook.net/en_US/all.js#xfbml=1"></script>
<fb:like layout="button_count" show_faces="false" width="90" action="like" font="arial" colorscheme="light" fb-xfbml-state="rendered" class="fb_edge_widget_with_comment fb_iframe_widget"><span style="height: 20px; width: 76px;"><iframe id="f111d3371c" name="f5f7b234c" scrolling="no" style="border: none; overflow: hidden; height: 20px; width: 76px;" title="Like this content on Facebook." class="fb_ltr" src="http://www.facebook.com/plugins/like.php?api_key=&locale=en_US&sdk=joey&channel_url=http%3A%2F%2Fstatic.ak.facebook.com%2Fconnect%2Fxd_arbiter.php%3Fversion%3D23%23cb%3Df11898a314%26origin%3Dhttp%253A%252F%252Fwww.childrensplace.com%252Ff210aed7%26domain%3Dwww.childrensplace.com%26relation%3Dparent.parent&href=http%3A%2F%2Fwww.childrensplace.com%2Fwebapp%2Fwcs%2Fstores%2Fservlet%2Fproduct_10001_10001_-1_1005476_827676_26601%257C72469%257C813599_boy%257Coutfits%257Cplaid%2520patrol_boy&node_type=link&width=90&font=arial&layout=button_count&colorscheme=light&action=like&show_faces=false&extended_social_context=false"></iframe></span></fb:like></div>
</div>
</div>
</dd>
I have googled a lot trying to solve this. I tried DOM parsing, but the client requires regular-expression parsing instead.
Answer 0 (score: 1)
Don't parse HTML with regular expressions; it is the wrong tool for the job. Try SimpleXML, and if that is too much for you, try QueryPath: http://querypath.org/
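For comparison, here is a minimal sketch of the parser-based route. It uses PHP's built-in DOMDocument and DOMXPath rather than the SimpleXML or QueryPath libraries named above, because DOMDocument tolerates markup that is not well-formed XML. The `$html` sample is a trimmed, hypothetical stand-in for the block in the question, and the XPath expressions assume the `<p>` and `<ul>` are direct children of the `<dd>`.

```php
<?php
// Hypothetical sample input: a trimmed-down version of the <dd> block from the question.
$html = <<<'HTML'
<dd style="display: block;">
<p>Lightweight, comfy and cool</p>
<ul>
<li>Made of 100% cotton</li>
<li>Imported</li>
</ul>
<p class="TCP-Phrase">Big Fashion, Little Prices</p>
</dd>
HTML;

$doc = new DOMDocument();
// Real-world markup is rarely valid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// First <p> that is a direct child of a <dd>.
$pNode = $xpath->query('//dd/p[1]')->item(0);
$paragraph = $pNode ? trim($pNode->textContent) : null;

// Text of every <li> inside the first <ul> child of the <dd>.
$items = [];
foreach ($xpath->query('//dd/ul[1]/li') as $li) {
    $items[] = trim($li->textContent);
}

echo $paragraph, "\n";
print_r($items);
```

If the page contains several `<dd>` elements, the queries would probably need to be narrowed, for example by a class or id on a surrounding element.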
Answer 1 (score: 1)
Here is an answer that doesn't tell you that what you are doing is morally wrong:
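// The s modifier lets . match newlines, so the pattern can span several lines inside <dd>.
$pattern = "/<dd.*?>.*?<p>(.*?)<\/p>.*?<ul>(.*?)<\/ul>/s";
// $html is assumed to hold the page markup as a string.
if (preg_match($pattern, $html, $matches)) {
    echo "P-tag content: ".$matches[1];   // text of the first <p> after <dd>
    echo "<br>";
    echo "UL-tag content: ".$matches[2];  // raw markup between <ul> and </ul>
}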
I tested it with the HTML you posted, and it works.
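Since the second capture group is still raw `<li>` markup, a follow-up preg_match_all can split it into individual items. This is only a sketch: `$ulContent` is a hypothetical stand-in for `$matches[2]`, and the pattern assumes the list items are not nested.

```php
<?php
// Hypothetical stand-in for $matches[2] from the pattern above.
$ulContent = "\n<li>Made of 100% cotton</li>\n<li>Imported</li>\n";

// Capture the inner text of each <li>...</li> pair; the s modifier lets . span newlines.
preg_match_all('/<li\b[^>]*>(.*?)<\/li>/s', $ulContent, $liMatches);

// $liMatches[1] holds the first capture group of every match.
$items = array_map('trim', $liMatches[1]);

foreach ($items as $item) {
    echo "- ", $item, "\n";
}
```

preg_match_all fits here because the number of `<li>` elements is not known in advance, whereas preg_match stops at the first match.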