How do I access an iframe's #document with puppeteer?

Date: 2019-06-19 13:57:00

Tags: javascript typescript web-scraping puppeteer

I'm trying to scrape an anime video page [jkanime], but I've run into a problem because the mp4 video source sits inside an iframe's #document.

In Chrome DevTools I entered the following: $('#jkvideo_html5_api source').src

That shows me the mp4's src. But I don't know how to run the query $('#jkvideo_html5_api source').src with puppeteer.

Now... what I'd like to work out is how to get the value of _navigationURL, then make a request to it and pull out the mp4 video source.

Any help would be greatly appreciated!!

[Image: DevTools source code section]

  const getAnimeVideo = async (id: string, chapter: number) => {
    const BASE_URL = `${url}${id}/${chapter}/`  // => https://jkanime.net/tokyo-ghoul/1/
    const browser = await puppeteer.launch() 
    const page = await browser.newPage()
    await page.goto(BASE_URL);
    const elementHandle = await page.$('.player_conte')
    const frame = await elementHandle.contentFrame();
    const $ = cheerio.load(`${frame}`);
    console.log(frame)
 }

Partial output I get:

....
OMWorld {
     _frameManager:
      FrameManager {
        _events: [Object],
        _eventsCount: 3,
        _maxListeners: undefined,
        _client: [CDPSession],
        _page: [Page],
        _networkManager: [NetworkManager],
        _timeoutSettings: [TimeoutSettings],
        _frames: [Map],
        _contextIdToContext: [Map],
        _isolatedWorlds: [Set],
        _mainFrame: [Frame] },
     _frame: [Circular],
     _timeoutSettings:
      TimeoutSettings { _defaultTimeout: null, _defaultNavigationTimeout: null },
     _documentPromise: null,
     _contextResolveCallback: null,
     _contextPromise: Promise { [ExecutionContext] },
     _waitTasks: Set {},
     _detached: false },
  _childFrames: Set {},
  _name: '',
  _navigationURL:
   'https://jkanime.net/um.php?e=Q0VxeUQ2MmZRRlNWeUdHKzdoWlJQOGFLNjFRUnljVkFTaEtFMElZUjFmTlRPQnhnUUtqbnRodjhEVHlGYnVleWJsdnNnRy9wNzVLd0MrMURuRVBKV0tQZjVuT0tIblc3cUNmZDNzdFVFaEE9OjrIf8cc_60GOGTTN7Th9Q_a' }
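A note on reading that value: `_navigationURL` in the dump above is an underscore-prefixed internal field. Puppeteer's public API exposes a frame's address through `frame.url()`, which normally reports the same URL once the frame has loaded, and `page.frames()` lists every frame attached to the page. Below is a minimal sketch of looking a frame up by URL. `FrameLike`, `findFrameUrl` and `demoFrames` are illustrative names, not part of puppeteer, and the shortened `um.php` token is made up for the demo.

```typescript
// Minimal structural type: only the part of puppeteer's Frame used here.
// A real puppeteer Frame satisfies it, because Frame exposes url(): string.
interface FrameLike {
  url(): string;
}

// Return the URL of the first frame whose address contains `fragment`,
// or null when no frame matches.
function findFrameUrl(frames: FrameLike[], fragment: string): string | null {
  for (const frame of frames) {
    const frameUrl = frame.url();
    if (frameUrl.indexOf(fragment) !== -1) {
      return frameUrl;
    }
  }
  return null;
}

// Stand-in frames for illustration only (hypothetical, shortened token).
const demoFrames: FrameLike[] = [
  { url: () => "https://jkanime.net/tokyo-ghoul/1/" },
  { url: () => "https://jkanime.net/um.php?e=abc" },
];

console.log(findFrameUrl(demoFrames, "/um.php"));
```

With a real page this would be called as `findFrameUrl(page.frames(), '/um.php')` after `page.goto(...)` has resolved. If you already hold the frame returned by `elementHandle.contentFrame()`, calling `frame.url()` on it directly should be enough.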

The output I want to get:

   {
     "src": [
       "https://storage.googleapis.com/markesito.appspot.com/tokgho/01.mp4"
     ]
   }

Problem solved: 11:34 AM

  const getAnimeVideo = async (id: string, chapter: number) => {
    const BASE_URL = `${url}${id}/${chapter}/`  // => https://jkanime.net/tokyo-ghoul/1/
    const browser = await puppeteer.launch()
    const page = await browser.newPage()
    await page.goto(BASE_URL);
    const elementHandle = await page.$('.player_conte')
    const frame = await elementHandle.contentFrame();
    const video = await frame.$eval('#jkvideo_html5_api', el =>
      Array.from(el.getElementsByTagName('source')).map(e => e.getAttribute("src")));
    return video;
  }

1 answer:

Answer 0 (score: 0)

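// `url` is the site's base URL, defined elsewhere (e.g. 'https://jkanime.net/').
const getAnimeVideo = async (id: string, chapter: number) => {
  const BASE_URL = `${url}${id}/${chapter}/`  // => https://jkanime.net/tokyo-ghoul/1/
  const browser = await puppeteer.launch()
  try {
    const page = await browser.newPage()
    await page.goto(BASE_URL)
    // Element that hosts the player iframe
    const elementHandle = await page.$('.player_conte')
    if (!elementHandle) throw new Error('.player_conte not found')
    // Switch from the element handle to the iframe's own document
    const frame = await elementHandle.contentFrame()
    if (!frame) throw new Error('.player_conte has no content frame')
    // Collect the src attribute of every <source> inside the video element
    const video = await frame.$eval('#jkvideo_html5_api', el =>
      Array.from(el.getElementsByTagName('source')).map(e => e.getAttribute('src')))
    return video
  } finally {
    await browser.close()  // always release the browser, even on failure
  }
}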
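
The function above returns a bare array, and `getAttribute` yields `null` for any `<source>` without a `src`. The question asked for an object of the form `{ "src": [...] }`, so here is a minimal sketch of a wrapper that produces that shape. `VideoSources` and `toVideoSources` are illustrative names I'm assuming here, not something from puppeteer.

```typescript
// Shape of the desired output shown in the question.
interface VideoSources {
  src: string[];
}

// Wrap raw attribute values in the { src: [...] } shape, dropping
// null or empty entries (getAttribute returns null when the attribute is absent).
function toVideoSources(rawSrcs: Array<string | null>): VideoSources {
  const src: string[] = [];
  for (const value of rawSrcs) {
    if (value !== null && value !== "") {
      src.push(value);
    }
  }
  return { src };
}

// Demo input: one real-looking entry from the question plus a missing attribute.
const demo = toVideoSources([
  "https://storage.googleapis.com/markesito.appspot.com/tokgho/01.mp4",
  null,
]);
console.log(JSON.stringify(demo, null, 2));
```

It could be applied as `toVideoSources(await getAnimeVideo('tokyo-ghoul', 1))`, assuming `getAnimeVideo` resolves to the array of `src` values as written above.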