Scraping lyrics from Genius with Puppeteer

Time: 2020-05-30 19:55:23

Tags: javascript puppeteer

Hey, I'm trying to scrape some lyrics from Genius with Puppeteer. I can search for a specific song and navigate to its lyrics page.

But when I try to get the P tag that contains the lyrics, I get this error:

UnhandledPromiseRejectionWarning: TimeoutError:
waiting for selector ".lyrics p" failed:
timeout 30000ms exceeded

Code

  const puppeteer = require('puppeteer');

  async function scrapelyrics() {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // Fill in the search box on the home page and submit the search.
    await page.goto('https://genius.com/');
    await page.waitForSelector('#application > div > div.PageHeaderdesktop__Container-bhx5ui-0.dmNhEr > form > input');
    await page.$eval('#application > div > div.PageHeaderdesktop__Container-bhx5ui-0.dmNhEr > form > input', el => el.value = 'delali');
    await page.click('#application > div > div.PageHeaderdesktop__Container-bhx5ui-0.dmNhEr > form > div');
    // await page.screenshot({path: 'buddy-screenshot.png'});
    // Wait for the first song in the search results, then open it.
    await page.waitForSelector('body > routable-page > ng-outlet > search-results-page > div > div.column_layout > div.column_layout-column_span.column_layout-column_span--primary > div:nth-child(1) > search-result-section > div > div:nth-child(2) > search-result-items > div > search-result-item > div > mini-song-card > a');
    await page.click('body > routable-page > ng-outlet > search-results-page > div > div.column_layout > div.column_layout-column_span.column_layout-column_span--primary > div:nth-child(1) > search-result-section > div > div:nth-child(2) > search-result-items > div > search-result-item > div > mini-song-card > a');
    // This is the wait that times out.
    await page.waitForSelector('.lyrics p');
    await page.screenshot({
      path: 'buddy-screenshot.png'
    });
    // The $eval callback runs inside the page, so return the text and log it from Node.
    const text = await page.$eval('.lyrics p', (el) => el.textContent);
    console.log(text);

    await browser.close();
  }

1 answer:

Answer 0 (score: 0)

I solved it.

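The problem was that Chromium was somehow opening the mobile version of the page, so the `.lyrics p` selector never matched. I tracked it down with Puppeteer's debugging tools, which are very useful whenever you run into problems scraping a page.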
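
For anyone hitting the same thing, here is a rough sketch of that kind of debugging setup. It is not the exact code I used. It assumes the mobile layout is triggered by the default viewport or user agent, which may not be the cause in every case. `debugLaunchOptions`, `openDesktopPage` and `DESKTOP_USER_AGENT` are hypothetical names made up for this example, and the user-agent string and viewport size are only illustrative values.

```javascript
// Desktop user-agent string (illustrative value, an assumption for this sketch).
const DESKTOP_USER_AGENT =
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
  '(KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36';

// Launch options that make the browser visible while debugging.
function debugLaunchOptions() {
  return {
    headless: false, // show the browser window instead of running headless
    slowMo: 100,     // slow each Puppeteer operation down by 100 ms
    devtools: true,  // open DevTools for each tab
  };
}

// Hypothetical helper: `puppeteer` is the module returned by require('puppeteer').
async function openDesktopPage(puppeteer, url) {
  const browser = await puppeteer.launch(debugLaunchOptions());
  const page = await browser.newPage();

  // Forward console messages from the page to the Node terminal.
  page.on('console', (msg) => console.log('PAGE LOG:', msg.text()));

  // Ask for a desktop-sized viewport and a desktop user agent before navigating.
  await page.setViewport({ width: 1280, height: 800 });
  await page.setUserAgent(DESKTOP_USER_AGENT);

  await page.goto(url);
  return { browser, page };
}
```

You would call it as `const { browser, page } = await openDesktopPage(require('puppeteer'), 'https://genius.com/');` and then watch the visible browser window to see which version of the page actually loads before relying on any selector.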