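Puppeteer waitForSelector on multiple selectors

Time: 2018-04-20 17:15:38

Tags: screen-scraping puppeteer

I have Puppeteer controlling a website with a lookup form that can return either a result or a "No records found" message. How can I tell which one was returned? waitForSelector seems to wait for only one selector at a time, and waitForNavigation doesn't seem to work because the response comes back via Ajax. I am using try/catch, but it is tricky to get right and it slows everything down.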

9 answers:

Answer 0 (score: 7)

How about using Promise.race() as in the snippet below? Don't forget the { visible: true } option of page.waitForSelector().

public async enterUsername(username: string): Promise<void> {
    // Resolves with the handle of whichever selector becomes visible first;
    // rejects if neither shows up within 4 seconds.
    const un = await Promise.race([
        this.page.waitForSelector(selector_1, { timeout: 4000, visible: true }),
        this.page.waitForSelector(selector_2, { timeout: 4000, visible: true }),
    ]);

    await un.focus();
    await un.type(username);
}

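Answer 1 (score: 4)

Waiting for any of the elements to exist

You can combine querySelectorAll with waitForFunction (waitFor in older Puppeteer releases) to solve this. Passing all the selectors as one comma-separated list returns every node that matches any of them.

await page.waitForFunction(() =>
  document.querySelectorAll('Selector1, Selector2, Selector3').length
);

This only resolves once at least one element exists; it does not tell you which selector matched which element.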
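
To also learn which selector matched, one option is to test the returned element against each candidate with Element.matches(). The following is only a minimal sketch: waitForFirstMatch is a hypothetical helper name, not part of Puppeteer, and it assumes page.waitForSelector resolves with the handle of the matched element.

```javascript
// Hypothetical helper (not part of Puppeteer): waits for the first element
// matching any selector, then reports which selector that element satisfies.
async function waitForFirstMatch(page, selectors, options = {}) {
  // A comma-separated selector list matches an element that satisfies any entry.
  const handle = await page.waitForSelector(selectors.join(', '), options);
  // Element.matches() runs in the page context and tests one selector at a time.
  const selector = await page.evaluate(
    (el, candidates) => candidates.find((candidate) => el.matches(candidate)),
    handle,
    selectors
  );
  return { selector, handle };
}
```

It could be called as waitForFirstMatch(page, ['#results', '#no-records'], { timeout: 10000 }), where both selectors are placeholders for the ones on your page. If the element satisfies several candidates, the first one in the array is reported.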

Answer 2 (score: 3)

Using Md. Abu Taher's suggestion, I ended up with this:

// One of these SELECTORs should appear, we don't know which
await page.waitForFunction((sel) => { 
    return document.querySelectorAll(sel).length;
},{timeout:10000},SELECTOR1 + ", " + SELECTOR2); 

// Now see which one appeared:
try {
    await page.waitForSelector(SELECTOR1,{timeout:10});
}
catch(err) {
    //check for "not found" 
    let ErrMsg = await page.evaluate((sel) => {
        let element = document.querySelector(sel);
        return element? element.innerHTML: null;
    },SELECTOR2);
    if(ErrMsg){
        //SELECTOR2 found
    }else{
        //Neither found, try adjusting timeouts until you never get this...
    }
};
//SELECTOR1 found

Answer 3 (score: 2)

In Puppeteer you can simply pass multiple selectors separated by commas, like this:

const foundElement = await page.waitForSelector('.class_1, .class_2');

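The returned value is the ElementHandle of the first matching element found in the page.

Next, if you want to know which element was found, you can read its class name like this: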

const className = await page.evaluate(el => el.className, foundElement);

In your case, code similar to this should work:

const foundElement = await page.waitForSelector([SELECTOR1,SELECTOR2].join(','));
const responseMsg = await page.evaluate(el => el.innerText, foundElement);
if (responseMsg == "No records found") {
    // Your code here
}

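Answer 4 (score: 1)

Another simple solution is to approach this from a CSS perspective. waitForSelector seems to follow the CSS selector list rules, so you can target several elements simply by separating their selectors with commas.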

try {    
    await page.waitForSelector('.selector1, .selector2',{timeout:1000})
} catch (error) {
    // handle error
}

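Answer 5 (score: 0)

Puppeteer methods might throw errors if they are unable to fulfill a request. For example, page.waitForSelector(selector[, options]) might fail if the selector doesn't match any nodes during the given timeframe.

For certain types of errors Puppeteer uses specific error classes. These classes are available via require('puppeteer/Errors').

List of supported classes:

TimeoutError

An example of handling a timeout error: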

const {TimeoutError} = require('puppeteer/Errors');

// ...

try {
  await page.waitForSelector('.foo');
} catch (e) {
  if (e instanceof TimeoutError) {
    // Do something if this is a timeout.
  }
}
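
The same check can be folded into a small helper that treats a timeout as "nothing matched" and lets every other error surface. This is a minimal sketch under stated assumptions: waitOrNull is a hypothetical name, and it identifies the timeout by err.name being 'TimeoutError' rather than by instanceof, so that it does not depend on the import path, which may differ between Puppeteer versions.

```javascript
// Hypothetical helper: resolves to the matched handle, or to null on timeout.
async function waitOrNull(page, selector, timeout = 5000) {
  try {
    return await page.waitForSelector(selector, { timeout });
  } catch (err) {
    // Assumption: Puppeteer's timeout errors carry the name 'TimeoutError'.
    if (err && err.name === 'TimeoutError') return null;
    throw err; // anything else is unexpected and should not be hidden
  }
}
```

Combined with a comma-separated selector list, a null result then means that none of the expected elements appeared within the timeout.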

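Answer 6 (score: 0)

Combining some elements from the answers above into a helper method, I built a command that lets me declare several possible selector outcomes and handle whichever one resolves first.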

/**
 * @typedef {import('puppeteer').ElementHandle} PuppeteerElementHandle
 * @typedef {import('puppeteer').Page} PuppeteerPage
 */

/** Description of the function
  @callback OutcomeHandler
  @async
  @param {PuppeteerElementHandle} element matched element
  @returns {Promise<*>} can return anything, will be sent to handlePossibleOutcomes
*/

/**
 * @typedef {Object} PossibleOutcome
 * @property {string} selector The selector to trigger this outcome
 * @property {OutcomeHandler} handler handler will be called if selector is present
 */

/**
 * Waits for a number of selectors (Outcomes) on a Puppeteer page, and calls the handler on first to appear,
 * Outcome Handlers should be ordered by preference, as if multiple are present, only the first occurring handler
 * will be called.
 * @param {PuppeteerPage} page Puppeteer page object
 * @param {PossibleOutcome[]} outcomes each possible selector, and the handler you'd like called.
 * @returns {Promise<*>} returns the result from outcome handler
 */
async function handlePossibleOutcomes(page, outcomes)
{
  var outcomeSelectors = outcomes.map(outcome => {
    return outcome.selector;
  }).join(', ');
  return page.waitForSelector(outcomeSelectors)
  .then(_ => {
    let awaitables = [];
    outcomes.forEach(outcome => {
      // Look up each selector individually to learn which outcomes are present.
      let lookup = page.$(outcome.selector)
      .then(element => {
        if (element) {
          return [outcome, element];
        }
        return null;
      });
      awaitables.push(lookup);
    });
    return Promise.all(awaitables);
  })
  .then(checked => {
    let found = null;
    checked.forEach(check => {
      if(!check) return;
      if(found) return;
      let outcome = check[0];
      let element = check[1];
      let p = outcome.handler(element);
      found = p;
    });
    return found;
  });
}

To use it, just call it with an array of possible outcomes, each with its selector and handler:

 await handlePossibleOutcomes(page, [
    {
      selector: '#headerNavUserButton',
      handler: element => {
        console.log('Logged in',element);
        loggedIn = true;
        return true;
      }
    },
    {
      selector: '#email-login-password_error',
      handler: element => {
        console.log('password error',element);
        return false;
      }
    }
  ]).then(result => {
    if (result) {
      console.log('Logged in!',result);
    } else {
      console.log('Failed :(');
    }
  })

Answer 7 (score: 0)

I had a similar problem and went for this simple solution:

helpers.waitForAnySelector = (page, selectors) => new Promise((resolve, reject) => {
  let hasFound = false
  selectors.forEach(selector => {
    page.waitForSelector(selector)
      .then(() => {
        if (!hasFound) {
          hasFound = true
          resolve(selector)
        }
      })
      .catch((error) => {
        // console.log('Error while looking up selector ' + selector, error.message)
      })
  })
})
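
Note that the helper above never settles if none of the selectors appears, because every rejection is swallowed. A minimal sketch of a variant that rejects in that case, assuming a runtime that provides Promise.any (Node.js 15 or later); waitForAnySelectorOrFail is a hypothetical name used only for illustration:

```javascript
// Hypothetical variant: resolves with the first selector that appears and
// rejects with an AggregateError once every individual wait has failed.
function waitForAnySelectorOrFail(page, selectors, options = {}) {
  return Promise.any(
    selectors.map((selector) =>
      // Map each successful wait to the selector string that produced it.
      page.waitForSelector(selector, options).then(() => selector)
    )
  );
}
```

A call such as waitForAnySelectorOrFail(page, ['#results', '#no-records'], { timeout: 10000 }) would resolve to the selector that showed up first; the two selectors here are placeholders.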

Answer 8 (score: 0)

Taking Promise.race() a step further by wrapping it, then just checking the returned index for further logic:

// Typescript
export async function racePromises(promises: Promise<any>[]): Promise<number> {
  // Forward rejections so a losing wait that later times out is not left unhandled.
  const indexedPromises: Array<Promise<number>> = promises.map((promise, index) =>
    new Promise<number>((resolve, reject) => promise.then(() => resolve(index), reject)));
  return Promise.race(indexedPromises);
}

// Javascript
export async function racePromises(promises) {
  // Forward rejections so a losing wait that later times out is not left unhandled.
  const indexedPromises = promises.map((promise, index) =>
    new Promise((resolve, reject) => promise.then(() => resolve(index), reject)));
  return Promise.race(indexedPromises);
}

Usage:

const navOutcome = await racePromises([
  page.waitForSelector('SELECTOR1'),
  page.waitForSelector('SELECTOR2')
]);
if (navOutcome === 0) {
  //logic for 'SELECTOR1'
} else if (navOutcome === 1) {
  //logic for 'SELECTOR2'
}