This is the HTML I want to parse:
<h2 class="offer-header">
    <a class="offer-title" href="http://address.com/id/2">Item name</a>
</h2>
<div class="offer-price">
    <span class="offer-buy-now buy-now">
        <span class="statement">
            1 999,00 $
            <span class="label">buy now</span>
        </span>
    </span>
</div>
// many more elements like this one
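Parsing the href and the link text works fine, but I have a problem parsing the price: the output contains a lot of whitespace and \n characters. I would like the price printed cleanly, on the same line as "buy now".
My sample price output: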
2 497,00 $
buy now
2 379,00 $
buy now
Code:
request(task.url, function(err, resp, body){
    if(body) {
        $ = cheerio.load(body);
        links = $('a.offer-title');
        $(links).each(function (i, link) {
            //console.log($(link).attr('href'));
            var price = $('span.offer-buy-now').text();
            console.log(price);
            //items[k] = items[k] || [];
            //items[k] = new itemParam($(link).text(), 12, k);
            k++;
        });
    }
    callback();
});
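How can I fix this?
Edit:
I corrected the each loop and it now works properly. But I have another problem: I don't always get data back. Only the 3rd, 4th, or 5th call returns results. Maybe something is wrong with my request?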
router.route('/send')
    .post(function(req, res){
        var url = req.body.url;
        var items = [];
        var k=0;
        var q = async.queue(function(task, callback){
            console.log(task.url);
            if(task.url.length>=1) {
                if (isURL(task.url)) {
                    console.log('OK');
                    request(task.url, function(err, resp, body){
                        if(body) {
                            $ = cheerio.load(body);
                            links = $('div.offer-info');
                            $(links).each(function (i, link) {
                                console.log($(link).find('a.offer-title').attr('href'));
                                var price = $(link).find('span.offer-buy-now').text().replace(/[^0-9.]/g, "");
                                console.log(price);
                                items[k] = items[k] || [];
                                items[k] = new itemParam($(link).find('a.offer-title').text(),
                                    price,$(link).find('a.offer-title').attr('href'), k);
                                k++;
                            });
                        }
                        callback();
                    });
                } else {
                    errorHandling(res, 401,"Invalid url");
                }
            }else{
                errorHandling(res, 401,"Invalid url");
            }
        }
        );
        q.push({url: url+'&p=1'});
        q.drain = function(errr, p) {
            console.log('all items have been processed' + items.length);
            for (var i=0; i<items.length; i++) {
                console.log(items[i].name + ' | ' + items[i].id + ' | ' + items[i].price);
            }
            res.sendStatus(200);
        };
    });
Answer 0 (score: 1)
You can strip everything except the digits with:
var price = $('span.offer-buy-now').text().replace(/[^0-9.]/g, "");
Sample:
var str = "2 497,00 $ buy now";
var strreplaced = str.replace(/[^0-9.]/g, "");
alert(strreplaced);
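Note that the pattern above also strips the comma, so "2 497,00 $ buy now" becomes "249700" and the decimal position is lost. If a numeric value is wanted, the following is a minimal sketch, assuming prices always use whitespace as the thousands separator and a comma as the decimal separator (as in the sample output). `parsePrice` is a hypothetical helper name, not part of cheerio:

```javascript
// Hypothetical helper: turn text such as "\n 2 497,00 $\n buy now\n" into a number.
// Assumption: comma is the decimal separator, whitespace separates thousands.
function parsePrice(rawText) {
  var cleaned = rawText
    .replace(/[^0-9,]/g, '') // keep only digits and the decimal comma
    .replace(',', '.');      // convert the decimal comma to a dot
  return parseFloat(cleaned);
}

console.log(parsePrice('\n 2 497,00 $\n buy now\n'));
```

The string passed to `parsePrice` would be whatever `.text()` returned for a single offer element.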
Answer 1 (score: 1)
Just use the replace method to remove "buy now", then use trim() to strip the surrounding whitespace:
links = $('a.offer-title');
$(links).each(function(i, link) {
    //console.log($(link).attr('href'));
    var price = $('span.offer-buy-now').text().replace('buy now', '').trim();
    console.log(price);
    //items[k] = items[k] || [];
    //items[k] = new itemParam($(link).text(), 12, k);
    k++;
});
Other solution
Or you can remove all the elements inside span.statement with $('span.statement *').remove(); and then read the remaining text:
links = $('a.offer-title');
$(links).each(function(i, link) {
    //console.log($(link).attr('href'));
    $('span.statement *').remove();
    var price = $('span.statement').text().trim();
    console.log(price);
    //items[k] = items[k] || [];
    //items[k] = new itemParam($(link).text(), 12, k);
    k++;
});
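If the "buy now" label should stay on the same line as the price instead of being removed, collapsing runs of whitespace is another option. This is a minimal sketch in plain JavaScript, assuming the input is the raw string returned by `.text()`. `collapseWhitespace` is a hypothetical helper name:

```javascript
// Hypothetical helper: replace every run of whitespace (spaces, tabs, newlines)
// with a single space, then trim the ends.
function collapseWhitespace(rawText) {
  return rawText.replace(/\s+/g, ' ').trim();
}

var raw = '\n        2 497,00 $\n        buy now\n';
console.log(collapseWhitespace(raw));
```

For the sample string this yields the price and the label separated by single spaces, with no line breaks.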