Puppeteer runs slowly on Google Cloud Functions

Time: 2019-04-04 16:46:26

Tags: javascript web-scraping google-cloud-functions chromium puppeteer

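I am using Puppeteer on Google Cloud Functions.

After a few tests, I noticed that my code takes about 56 seconds on average when deployed on the Google Cloud Functions infrastructure, while the same function takes only 13 seconds when tested locally.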

index.js

const chromium = require('chrome-aws-lambda');
const puppeteer = require('puppeteer-core');
const functions = require('firebase-functions');

exports.check = functions.https.onRequest(async (req, res) => {
    const License = req.query.License;

    const browser = await puppeteer.launch({
        args: chromium.args,
        defaultViewport: chromium.defaultViewport,
        executablePath: await chromium.executablePath,
        headless: chromium.headless,
    });
    const page = await browser.newPage();

    await page.goto('http://www.example.com', {waitUntil: 'networkidle2'});
    await page.focus('#txtUserName');
    await page.keyboard.type('testUsername');
    await page.focus('#txtPassword');
    await page.keyboard.type('123123');
    await page.click('#btnLogin');
    await page.waitForSelector('#ctl00_400_header_400')
    //console.log("[✓]login successfully.")
    await page.evaluate(() => document.querySelector('#ctl00_400_header_400').click());
    await page.waitForSelector('#__tab_ctl00_ContentPlaceHolder1_tabQuickSearch_vehicleSerachClaim')
    //console.log("[✓]Enquiry page loaded successfully")
    await page.evaluate(() => document.querySelector('#__tab_ctl00_ContentPlaceHolder1_tabQuickSearch_vehicleSerachClaim').click());
    await page.waitForSelector('#ctl00_ContentPlaceHolder1_tabQuickSearch_vehicleSerachClaim_rdvehicleSearchLicense')
    //console.log("[✓]Claim section loaded successfully")
    await page.evaluate(() => document.querySelector('#ctl00_ContentPlaceHolder1_tabQuickSearch_vehicleSerachClaim_rdvehicleSearchLicense').click());
    //console.log("[✓]License tab loaded successfully")
    await page.waitForSelector('#ctl00_ContentPlaceHolder1_tabQuickSearch_vehicleSerachClaim_txtclaimSearchPersonLicNo');
    await page.focus('#ctl00_ContentPlaceHolder1_tabQuickSearch_vehicleSerachClaim_txtclaimSearchPersonLicNo');
    await page.keyboard.type(License);
    await page.evaluate(() => document.querySelector('#ctl00_ContentPlaceHolder1_tabQuickSearch_vehicleSerachClaim_btnVheicleSearchButtonClaim').click());    

    try {
        await page.waitForSelector('#ctl00_ContentPlaceHolder1_lblErrMessage')
        const textContent = await page.evaluate(() => document.querySelector('#ctl00_ContentPlaceHolder1_lblErrMessage').textContent);
        res.status(200).send( 'Result => ' + textContent );
        await browser.close();
        return; // a response has been sent; skip the second lookup
    } catch (error) {
        //console.log("The element didn't appear.")
    }    

    try {
        await page.waitForSelector('#ctl00_ContentPlaceHolder1_tabQuickSearch_vehicleSerachClaim_grdClaimDraftSp > tbody > tr:nth-child(3) > td')
        const textContent = await page.evaluate(() => document.querySelector('#ctl00_ContentPlaceHolder1_tabQuickSearch_vehicleSerachClaim_grdClaimDraftSp > tbody > tr:nth-child(3) > td').textContent);
        res.status(200).send( 'Result => ' + textContent );
        await browser.close();
    } catch (error) {
        //console.log("The element didn't appear.")
    }   

});
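
A side note on the two `try` blocks above: `page.waitForSelector` waits up to 30 seconds by default, so when the error label never appears, the first block may hold the request for the full timeout before the second block even starts. A minimal sketch of waiting for both selectors at once follows. `waitForFirst` and `readResult` are hypothetical helper names, not part of Puppeteer, and this has not been measured against the deployed function.

```javascript
// Selectors copied from the handler above.
const ERROR_LABEL = '#ctl00_ContentPlaceHolder1_lblErrMessage';
const RESULT_CELL = '#ctl00_ContentPlaceHolder1_tabQuickSearch_vehicleSerachClaim_grdClaimDraftSp > tbody > tr:nth-child(3) > td';

// Hypothetical helper: resolves with whichever selector appears first.
// It rejects as soon as any of the waits fails; because all waits share
// the same timeout, that normally means none of the selectors showed up.
function waitForFirst(page, selectors, timeout = 30000) {
    return Promise.race(
        selectors.map((selector) =>
            page.waitForSelector(selector, { timeout }).then(() => selector)
        )
    );
}

// Hypothetical helper: reads the text of whichever element appeared.
async function readResult(page) {
    const found = await waitForFirst(page, [ERROR_LABEL, RESULT_CELL]);
    return page.evaluate((sel) => document.querySelector(sel).textContent, found);
}
```

Inside the handler this would replace both `try` blocks with a single `const textContent = await readResult(page);` followed by one `res.status(200).send(...)` and one `browser.close()`.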

package.json

{
    "name": "functions",
    "version": "0.0.1",
    "description": "Cloud Functions for Firebase",
    "dependencies": {
        "chrome-aws-lambda": "1.14.0",
        "firebase-functions": "2.2.0",
        "iltorb": "2.4.2",
        "puppeteer-core": "1.14.0",
        "firebase-admin": "7.2.0"
    },
    "engines": {
        "node": "8"
    },
    "private": true
}

Deployed with Firebase Functions, using Node.js 8 and 2 GB of allocated memory.

How can I improve the code to speed up its execution time?

1 answer:

Answer 0 (score: 0)

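I would not expect any given code to run as fast in Cloud Functions as it does on a modern desktop, especially not something as complex as Puppeteer (which is essentially running Chrome).

GCF allocates only a single CPU to any given server instance, and it has no GPU. GCF is meant for simple pieces of work that don't require heavy computation. Desktops typically have 4-8 cores (or more) and a GPU that helps Chrome run fast. There is really no comparison between the two situations.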
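
With so little CPU available, starting Chrome on every request is likely one of the more expensive steps. Cloud Functions keeps module-scope variables alive between invocations that land on the same warm instance, so the browser could be launched once and reused. The following is only a sketch under that assumption: `getBrowser` is a hypothetical helper, and it does not handle a browser that crashes or disconnects between requests. It will not close the gap with a desktop.

```javascript
// Module scope: survives across invocations on the same warm instance.
let browserPromise = null;

// Hypothetical helper. `launch` is any function that returns a Promise for a
// browser, for example () => puppeteer.launch({ ...options from the question }).
function getBrowser(launch) {
    if (!browserPromise) {
        browserPromise = launch().catch((err) => {
            // Forget a failed launch so that the next request can retry.
            browserPromise = null;
            throw err;
        });
    }
    return browserPromise;
}
```

A handler using this would call `const browser = await getBrowser(...)`, open a page with `browser.newPage()`, and close only the page (`page.close()`) at the end instead of the whole browser. Note that cookies and login state would then carry over between requests served by the same instance.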

Bottom line: there is not much you can do with this code to speed it up to match the desktop experience.
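
To see how the 56 seconds are actually split between launching Chrome, loading the page, and waiting for selectors, each step can be timed separately. A minimal sketch, where `timed` is a hypothetical helper and the labels are arbitrary:

```javascript
// Hypothetical helper: runs an async step, logs its duration in milliseconds,
// and passes the step's result (or error) through unchanged.
async function timed(label, step) {
    const start = Date.now();
    try {
        return await step();
    } finally {
        console.log(`${label}: ${Date.now() - start} ms`);
    }
}

// Example use inside the handler from the question:
//   const browser = await timed('launch', () => puppeteer.launch({ /* options */ }));
//   const page = await timed('newPage', () => browser.newPage());
//   await timed('goto', () => page.goto('http://www.example.com', { waitUntil: 'networkidle2' }));
```

The logged durations appear in the function's logs, which makes it possible to compare the same steps between the local run and the deployed one.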