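I'm trying to extract the users who asked questions on a classified ads site (http://trademe.co.nz/Trade-Me-Motors/Cars/Toyota/Hiace/auction-300294634.htm). For some reason the pattern I'm using doesn't always work, so I'd appreciate any help with a more reliable regular expression. Here is my current code: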
// get the member id of the question asker
$pattern = "//m";
preg_match_all($pattern, $htmlContent, $member_match);
$no_a = count($member_match[1]);
$inc = 0;
echo "number of askers is $no_a";
// loop to get all the members
while($inc ";
// get the member user name based on the member_id
$pattern2 = "/(.*)/";
preg_match_all($pattern2, $htmlContent, $member_user_match);
$bid_user_q = $member_user_match[1][0];
// store the askers
mysql_query("INSERT INTO askers (id, item_number, bid_user_q, bid_member_id_q, sub_cat) VALUES('', '$item_number', '$bid_user_q', '$bid_member_id_q', '$sub_cat')");
echo "INSERT INTO askers (id, item_number, bid_user_q, bid_member_id_q) VALUES('', '$item_number', '$bid_user_q', '$bid_member_id_q', '$sub_cat')
";
mysql_error();
$inc++;
}
The code doesn't seem to display properly because of the HTML tags in the patterns, so you can view it here: http://pastebin.com/iPxizy5X
Answer 0: (score: 0)
I doubt it's "perfect", but this one works for me:
/<small>\s*<a href=\"\/Members\/Listings\.aspx\?member=(\d+)\">\s*<b>(.*?)<\/b>/
If you use:
$pattern = "/<small>\s*<a href=\"\/Members\/Listings\.aspx\?member=(\d+)\">\s*<b>(.*?)<\/b>/";
preg_match_all($pattern, $htmlContent, $member_match, PREG_SET_ORDER);
you get:
$member_match[0][1] = member ID
$member_match[0][2] = member nickname
$member_match[1][1] = member ID
$member_match[1][2] = member nickname
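As a rough sketch of how that layout can be consumed, the snippet below loops over the PREG_SET_ORDER result and collects each asker's ID and nickname. The sample markup, member IDs, and nicknames are made up for illustration and only assume the page structure implied by the pattern; the real page may differ, and the `$askers` array is a hypothetical helper, not part of the original code.

```php
<?php
// Hypothetical sample markup shaped to fit the pattern; the real page may differ.
$htmlContent = <<<HTML
<small>
  <a href="/Members/Listings.aspx?member=1234567"><b>alice_nz</b></a>
</small>
<small><a href="/Members/Listings.aspx?member=7654321">
  <b>bob99</b></a></small>
HTML;

$pattern = "/<small>\s*<a href=\"\/Members\/Listings\.aspx\?member=(\d+)\">\s*<b>(.*?)<\/b>/";

// With PREG_SET_ORDER each element of $member_match is one complete match:
// [0] = full matched text, [1] = member ID, [2] = member nickname.
$count = preg_match_all($pattern, $htmlContent, $member_match, PREG_SET_ORDER);

// Collect the askers into a plain array (illustrative structure only).
$askers = [];
foreach ($member_match as $match) {
    $askers[] = ['member_id' => $match[1], 'nickname' => $match[2]];
}

echo "number of askers is $count\n";
foreach ($askers as $asker) {
    echo $asker['member_id'] . ' ' . $asker['nickname'] . "\n";
}
```

Because the ID and the nickname come out of the same match, there is no need for a second pattern to look up the user name for each member ID.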