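I'm trying to scrape a website, and I need the number of comments and the date of the earliest comment. However, when there are no comments, the earliest date comes back as the date and time at which I ran the scrape, and that value also ends up in the database. What I want is for the earliest-date field to be empty when there are no comments. What is wrong with my code? I'd really appreciate your help; I've been trying for a week. Thanks!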
// go to comment area
await page.waitForSelector("div.ivu-table-wrapper");
await page.waitFor(3000);
// get number of comments
const noOfComments = await page.$eval("#app-comment > div.comment-details > div:nth-child(4) > p", p => p.innerText.slice(16,-3));
// get the number of li
let len = await page.$$eval(".comment-details .ivu-page-item", e => {
    return e.length;
});
// if there is only 2 pages
if (len == 2) {
    len -= 1;
}
else if (len == 1) {
    len -= 1;
}
// click on the last page (eg: len = 4, 4+2)
await page.click(".comment-details .ivu-page-item:nth-child(" + (len + 2) + ") > a").catch(async (err) => {
    await page.click(".comment-details .ivu-page-item:nth-child(" + (len + 1) + ") > a");
});
await page.waitFor(7000);
// get the earliest comment date
const dates = await page.$eval("div.ivu-table-body > table > tbody > tr:last-child > td:nth-child(3)", td => td.innerText.trim()).catch(async (err) => {
    console.log("");
});
const eDate = moment(dates).format('YYYY-MM-DD HH:mm:ss');
console.log("Rank: ", count); //int
console.log("Name: ", name); //string
console.log("Release Date: ", relDate); //date
console.log("Developer: ", developer); //string
console.log("Rating: ", rating); //float
console.log("Size: ", storage); //float
console.log("No. of Comments: ", noOfComments); //int
console.log("Earliest date: ", eDate); //datetime
console.log("Scrape date: ", today); //date
console.log("\n");
const data_values = [count, name, relDate, developer, rating, storage, noOfComments, eDate, today];
console.log(data_values);
connection.connect(function(err) {
    var sql = "INSERT INTO Qimai_BS (ranking, name, release_date, developer, rating, storage_size, no_of_comments, earliest_date, scrape_date) VALUES ?";
    var values = [data_values];
    connection.query(sql, [values], function (err, result) {
        if (err) throw err;
        console.log("Rows Inserted: " + result.affectedRows);
    });
});
Answer 0 (score: 0)
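When there are no comments, dates will be undefined: if the selector matches no table row, page.$eval throws, and the catch handler only logs an empty string and returns nothing. The problem is therefore in the following line:
const eDate = moment(dates).format('YYYY-MM-DD HH:mm:ss');
When moment() is called without an argument (or with undefined as its argument), it uses the current date and time. So what you need to do is first check whether dates is set, for example:
const eDate = (!dates) ? null : moment(dates).format('YYYY-MM-DD HH:mm:ss');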
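If the same check is needed in more than one place, it could be pulled into a small helper. The following is only a sketch: earliestDateOrNull and its formatDate parameter are hypothetical names, not part of the original code, and the formatter is passed in so that the helper itself does not depend on moment.

```javascript
// Hypothetical helper (not from the original code): returns null when no
// date text was scraped, otherwise hands the trimmed text to the formatter.
function earliestDateOrNull(raw, formatDate) {
    // Treat undefined, null, non-strings and blank strings as "no comment date".
    if (typeof raw !== "string" || raw.trim() === "") {
        return null;
    }
    return formatDate(raw.trim());
}

// Possible usage with the variables from the question (assumes moment is loaded):
// const eDate = earliestDateOrNull(dates, s => moment(s).format('YYYY-MM-DD HH:mm:ss'));
```

The mysql package for Node.js escapes null as SQL NULL, so, assuming that is the driver in use and the earliest_date column allows NULL, the field should then be stored as empty rather than as the scrape time.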