Puppeteer: Session closed. Most likely the page has been closed

Time: 2020-10-21 21:44:28

Tags: node.js puppeteer

Every so often the page acts up and throws the following error:

UnhandledPromiseRejectionWarning: Error: Protocol error (Runtime.callFunctionOn): Session closed. Most likely the page has been closed.
    at CDPSession.send (/Users/lancepollard/start/lancejpollard/data/node_modules/puppeteer/lib/cjs/puppeteer/common/Connection.js:195:35)
    at ExecutionContext._evaluateInternal (/Users/lancepollard/start/lancejpollard/data/node_modules/puppeteer/lib/cjs/puppeteer/common/ExecutionContext.js:200:50)
    at ExecutionContext.evaluate (/Users/lancepollard/start/lancejpollard/data/node_modules/puppeteer/lib/cjs/puppeteer/common/ExecutionContext.js:106:27)
    at DOMWorld.evaluate (/Users/lancepollard/start/lancejpollard/data/node_modules/puppeteer/lib/cjs/puppeteer/common/DOMWorld.js:79:24)
    at emitUnhandledRejectionWarning (internal/process/promises.js:149:15)
    at processPromiseRejections (internal/process/promises.js:211:11)
    at processTicksAndRejections (internal/process/task_queues.js:98:32)
(node:38857) Error: Protocol error (Runtime.callFunctionOn): Session closed. Most likely the page has been closed.
    at CDPSession.send (/Users/lancepollard/start/lancejpollard/data/node_modules/puppeteer/lib/cjs/puppeteer/common/Connection.js:195:35)
    at ExecutionContext._evaluateInternal (/Users/lancepollard/start/lancejpollard/data/node_modules/puppeteer/lib/cjs/puppeteer/common/ExecutionContext.js:200:50)
    at ExecutionContext.evaluate (/Users/lancepollard/start/lancejpollard/data/node_modules/puppeteer/lib/cjs/puppeteer/common/ExecutionContext.js:106:27)
    at DOMWorld.evaluate (/Users/lancepollard/start/lancejpollard/data/node_modules/puppeteer/lib/cjs/puppeteer/common/DOMWorld.js:79:24)
(node:38857) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

Why does this happen, and how can it be fixed?

1 answer:

Answer 0 (score: 0)

I don't know why you are getting this error; you should add your code to the question.
For my part, this is how I handle browser/page errors:

// Assumed imports at the top of the file (package names inferred from the identifiers used):
// const puppeteerExtra = require('puppeteer-extra');
// const pluginStealth = require('puppeteer-extra-plugin-stealth');

async initiate() {
    this.pageOptions = {
        waitUntil: 'networkidle2',
        timeout: 60000
    };
    puppeteerExtra.use(pluginStealth());
    // launch() already returns a connected browser, so no separate connect() call is needed.
    this.browser = await puppeteerExtra.launch({ headless: false });
    this.page = await this.browser.newPage();
    // Block heavy resources to speed up crawling.
    await this.page.setRequestInterception(true);
    this.page.on('request', (request) => {
        if (['image', 'stylesheet', 'font', 'script'].includes(request.resourceType())) {
            request.abort();
        } else {
            request.continue();
        }
    });
    // Dismiss any alert/confirm/prompt dialog so it cannot block the page.
    this.page.on('dialog', async dialog => {
        await dialog.dismiss();
    });
}

wait = (ms) => new Promise(resolve => setTimeout(resolve, ms))

async restart() {
    await this.close();
    await this.wait(1000);
    // Await the relaunch so callers do not continue before the new page exists.
    await this.initiate();
}

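async close() {
    if (this.browser) {
        // Clear the references first so no other call picks up a page that is being closed.
        const browser = this.browser;
        const page = this.page;
        this.browser = null;
        this.page = null;
        this.pageOptions = null;
        // Closing an already closed page throws, so check first.
        if (page && !page.isClosed()) {
            await page.close();
        }
        await browser.close();
    }
}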
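
The error in the question is raised when an evaluate call reaches a page whose session has already ended. One defensive option, shown here only as a minimal sketch, is to check `page.isClosed()` before the call and to treat a "closed" protocol error as an empty result. The helper name `safeEvaluate` and the decision to return `null` are assumptions made for illustration; they are not part of Puppeteer.

```javascript
// Hypothetical helper (not part of Puppeteer): run page.evaluate only while
// the page is still open, and map a "closed" protocol error to null.
async function safeEvaluate(page, pageFunction, ...args) {
    // page.isClosed() is Puppeteer's own check for a closed page.
    if (!page || page.isClosed()) {
        return null;
    }
    try {
        return await page.evaluate(pageFunction, ...args);
    } catch (error) {
        // The page can still close between the check above and the call.
        if (/Session closed|Target closed/.test(error.message)) {
            return null;
        }
        // Anything else is a real failure and should surface to the caller.
        throw error;
    }
}
```

It could be called as `const title = await safeEvaluate(this.page, () => document.title);`, with a `null` result meaning the page was gone.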

The crawl process:

crawl(link, userAgent) {
    return new Promise(async (resolve) => {
        // Limit the runtime of this function in case the crawl of a URL gets stuck.
        const timer = setTimeout(async () => {
            await this.restart();
            resolve(null);
        }, 60000);
        if (!userAgent) {
            userAgent = crawlUtils.getRandomUserAgent();
        }
        const crawlResults = { isValidPage: true, pageSource: null };
        if (!this.page) {
            clearTimeout(timer);
            await this.wait(1000);
            resolve(null);
            return;
        }
        try {
            await this.page.setUserAgent(userAgent);
            await this.page.goto(link, this.pageOptions);
            await this.page.waitForFunction(this.waitForFunction);
            crawlResults.pageSource = await this.page.content();
        }
        catch (error) {
            // Navigation failed, timed out, or the page was closed mid-crawl.
            crawlResults.isValidPage = false;
        }
        // Cancel the watchdog; otherwise it fires later and restarts a browser that is still in use.
        clearTimeout(timer);
        if (this.isLinkCrawlTest) {
            await this.close();
        }
        resolve(crawlResults);
    });
}
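
A runtime limit like the 60-second one in `crawl` is easier to reason about when the timer is cancelled as soon as the work settles, so it cannot fire later and close a page that another call is using. The following is a rough sketch of that idea using `Promise.race`. The name `withTimeout` and its `onTimeout` callback are hypothetical, introduced only for this example.

```javascript
// Hypothetical helper: resolve with the task's result, or with null once
// `ms` milliseconds pass. The timer is always cleared when the race settles.
function withTimeout(task, ms, onTimeout) {
    let timer;
    const timeout = new Promise((resolve) => {
        timer = setTimeout(async () => {
            try {
                // Optional cleanup, e.g. restarting the browser.
                if (onTimeout) await onTimeout();
            } catch (error) {
                // A failed cleanup should not leave the caller hanging.
            }
            resolve(null);
        }, ms);
    });
    // Whichever settles first wins; clearing the timer afterwards is harmless.
    return Promise.race([task, timeout]).finally(() => clearTimeout(timer));
}
```

Under these assumptions, the body of `crawl` could become something like `await withTimeout(this.page.goto(link, this.pageOptions), 60000, () => this.restart())`, where a `null` result signals that the limit was hit.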

I hope this helps you solve the problem.