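I wrote a script in Node using Puppeteer to scrape the names of different institutions from a website, traversing multiple pages.
The script below parses the institution names from the landing page, then clicks through a few more pages and parses the names there, but at some point during execution it fails with this error: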
the error: TypeError: Cannot read property 'click' of undefined
at main (c:\Users\WCS\Desktop\Node vault\comments.js:18:25)
at <anonymous>
at process._tickCallback (internal/process/next_tick.js:118:7)
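I used a hardcoded for loop because I have no idea how to make the script keep clicking the next-page button until none is left. I would like logic where the script first looks for the next-page button; if it finds one, it clicks it and repeats the process.
What I have tried: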
const puppeteer = require('puppeteer');
const link = "https://www.incometaxindia.gov.in/Pages/utilities/exempted-institutions.aspx";

(async function main() {
  try {
    const browser = await puppeteer.launch({headless:false});
    const [page] = await browser.pages();
    await page.goto(link);
    await page.waitForSelector("h1.faqsno-heading");
    for(let i = 1; i < 20; i++){
      const sections = await page.$$("h1.faqsno-heading");
      for (const section of sections) {
        const itemName = await section.$eval("div[id^='arrowex']", el => el.innerText);
        console.log(itemName);
      }
      const nextPage = await page.$$(".ms-paging > a");
      await nextPage[i].click();
      await page.waitForNavigation({waituntil:'networkidle0'});
    }
    await browser.close();
  } catch (e) {
    console.log('the error: ', e);
  }
})();
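By the way, to keep this question from being closed as a duplicate: I did come across this post, but I could not work out how to implement its logic in my own script.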
Answer 0 (score: 1)
Have you tried a simple if condition?
const nextPage = await page.$$(".ms-paging > a");
if (nextPage && nextPage[i]) {
  await nextPage[i].click();
  // The option name is case-sensitive: waitUntil, not waituntil.
  await page.waitForNavigation({waitUntil:'networkidle0'});
}
That way it only clicks when the button actually exists.
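The same guard can drive the whole loop, so the script stops by itself instead of relying on a hardcoded page count. The following is only a sketch under assumptions: `scrapeAllPages`, `scrapePage` and `maxPages` are hypothetical names, the `[title='Next Page']` selector is borrowed from the other answer, and it assumes the link is absent from the DOM on the last page (if the site only disables it, the stop condition needs adjusting).

```javascript
// Sketch: follow the "next page" link until page.$ no longer finds it.
// `page` is expected to behave like a Puppeteer Page;
// `scrapePage` is a hypothetical callback that returns an array of names.
async function scrapeAllPages(page, scrapePage,
                              nextSelector = "[title='Next Page']",
                              maxPages = 100) {
  const results = [];
  // maxPages is a safety cap so a pager that never ends cannot loop forever.
  for (let visited = 0; visited < maxPages; visited++) {
    results.push(...await scrapePage(page));

    // page.$ resolves to null when nothing matches the selector.
    const next = await page.$(nextSelector);
    if (!next) break;

    // Start waiting for the navigation before clicking, so it is not missed.
    await Promise.all([
      page.waitForNavigation({ waitUntil: 'networkidle0' }),
      next.click(),
    ]);
  }
  return results;
}
```

In the script from the question, `scrapePage` could be a function that runs the existing `page.$$("h1.faqsno-heading")` loop and returns the collected `itemName` values rather than logging them.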
Answer 1 (score: 1)
Replace this code
const nextPage = await page.$$(".ms-paging > a");
await nextPage[i].click();
await page.waitForNavigation({waituntil:'networkidle0'});
with this
await page.click("[title='Next Page']");
await page.waitForNavigation({waitUntil:'networkidle0'});
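One caveat with clicking first and waiting afterwards: if the navigation finishes before `waitForNavigation` starts listening, the wait can hang until it times out. A common way around this is to start both promises together. The helper below is a minimal sketch; `clickAndWait` is a hypothetical name, and `page` is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper: click `selector` and wait for the navigation it triggers.
async function clickAndWait(page, selector) {
  const [response] = await Promise.all([
    // Registered first so the navigation cannot slip past unnoticed.
    page.waitForNavigation({ waitUntil: 'networkidle0' }),
    page.click(selector),
  ]);
  // Whatever waitForNavigation resolved with is handed back to the caller.
  return response;
}
```

With this in place, the two replacement lines become a single call: `await clickAndWait(page, "[title='Next Page']");`.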
// Full script. NoOfPage is hard-coded, so adjust it to the number of pages to visit.
const puppeteer = require('puppeteer');
const link = "https://www.incometaxindia.gov.in/Pages/utilities/exempted-institutions.aspx";

(async function main() {
  try {
    const browser = await puppeteer.launch({headless: false});
    const [page] = await browser.pages();
    await page.goto(link);
    await page.waitForSelector("h1.faqsno-heading");
    let j = 0;
    let NoOfPage = 9; // adjust here to get number of pages
    for (let i = 0; j < NoOfPage + 1; i++, j++) {
      // Once j passes 4, pin i to 4 so the same pager link index is reused.
      if (j > 4) {
        i = 4;
      }
      if (i > 0) {
        await page.waitForSelector("h1.faqsno-heading", {visible: true});
        const sections = await page.$$("h1.faqsno-heading");
        for (const section of sections) {
          const itemName = await section.$eval("div[id^='arrowex']", el => el.innerText);
          console.log(itemName);
        }
      }
      const nextPage = await page.$$(".ms-paging > a");
      // Pass both promises to Promise.all un-awaited so the navigation
      // listener is already active when the click happens.
      await Promise.all([
        page.waitForNavigation({waitUntil: 'networkidle0'}),
        nextPage[i].click(),
      ]);
    }
    await browser.close();
  } catch (e) {
    console.log('the error: ', e);
  }
})();
C:\NodeJS\PuppeteerTest\Pup>node stack56652523.js
....
....
HAPPY PUBLIC SCHOOL SAMITI
AABAH3894H
SAGRADA FAMILIA SOCIETY, SOUTH GOA
AAWAS5165K
K V DEVADIGA CHARITABLE TRUST, DAKSHINA KANNADA
AADTK1517B
SHRINE OF INFANT JESUS, CHICKMAGLUR
AAVTS1925P
SRI NANDI VEDACURU CHARITABLE, TRUST
AATTS1842D
SHREE SUBRAHMANYA VANGMAYEE PARISHAD, GOA
AAPTS2410M
SHREE SUBRAHMANYA VANGMAYEE PARISHAD, GOA
AAPTS2410M
WORD FOR THE WORLD FELLOWSHIP
AAAAW6295Q
JANA SEVA TRUST
AACTJ0594Q
VAGDEVI VILAS EDUCATIONAL AND CHARITABLE TRUST
AABTV8264G