pyppeteer browser never closes and a TimeoutError is raised

Time: 2020-05-17 06:02:25

Tags: python browser web-crawler pyppeteer

I am trying to get XHR responses using Python pyppeteer. Here is my code.

library(shiny)
library(DT)
library(plotly)
library(dplyr)

ui <- navbarPage(title = "Plotly Restyle",
                 id = "nav",
                 tabPanel(title = "Main Tab", 
                          DT::dataTableOutput("tableLT", width = "100%", height = "100%"),
                          plotlyOutput("plotly_graph", width = "100%", height = "100%")
                 )
)

server <- function(input, output, session) {

  set.seed(12345)
  scatter.data <- data.frame(ID = rep(as.character(1:95), each = 10), 
                             Year = rep(1991:2000, 95), 
                             Value = rep(rnorm(n = 95, mean = 5, sd = 1.5), 10), 
                             stringsAsFactors = FALSE
  )

  output$tableLT <- DT::renderDataTable({

    datatable(scatter.data, 
              colnames = c("ID", "Year", "Value"),
              class = "cell-border compact hover", 
              selection = "single"
    )
  })

  output$plotly_graph <- renderPlotly({

    layout(
      scatter.data %>%
        group_by(ID) %>%
        plot_ly(x = ~Year, 
                y = ~Value, 
                type = "scatter", 
                line = list(color = "black"),
                opacity = 0.1,
                mode = "lines",
                name = ~ID,
                hoverinfo = "text",
                hovertext = ~ID, 
                showlegend = FALSE
        ),
      xaxis = list(title = "Year", zeroline = FALSE),
      yaxis = list(title = "Value", zeroline = FALSE)
    )
  })

  observe({  

    clickDT <- input$tableLT_row_last_clicked     # Capturing the click of the table row

    if (is.null(clickDT)) {
      return()
    } else if (!is.null(clickDT)) {

      # Even though only one row in the table is selected, all lines in plot_ly are modified
      plotlyProxy("plotly_graph") %>%
        plotlyProxyInvoke("restyle",
                          line = list(x = list(clickDT),   # Not sure what goes wrong here 
                                      y = list(clickDT),
                                      color = "red", 
                                      width = 2), 
                          mode = "lines", 
                          type = "scatter",
                          list(clickDT)
        )
    }
  })
}

shinyApp(ui = ui, server = server)

However, when I run it, the browser never closes, and after 30 seconds it gives me a TimeoutError. The code should return the URL of the XHR response, but it does not.

1 answer:

Answer 0 (score: 0)

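The problem arises because the event emitter used by this version of pyppeteer does not support async event subscribers. The next version of the library, in development at the time of writing, will allow this.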

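import asyncio
import json

def intercept_response(res):
    # The event emitter calls handlers synchronously, so the coroutine is
    # scheduled as a task on the already-running loop. Calling
    # run_until_complete() here would raise RuntimeError, because the loop
    # is already running while the page is being driven.
    async def handle(res):
        if res.request.resourceType == 'xhr':
            body = await res.text()
            try:
                json.loads(body)  # only report responses whose body is JSON
                print(res.request.url)
            except ValueError:
                pass
    asyncio.ensure_future(handle(res))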

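Second, your code is not actually set up to "return the URL of the XHR response". Your function main implicitly returns None. Registering an event handler does not mean that the handler's return value is magically returned from the function in which you attached it. Still, here is one way to accomplish what I think you are trying to do: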

from pyppeteer import launch

async def main():
    browser = await launch(headless=False)
    page = await browser.newPage()
    # make_interceptor() supplies a future and the handler that resolves it
    resp_fut, interceptor = make_interceptor()
    page.on('response', interceptor)
    await page.goto('https://www.iesdouyin.com/share/user/70015326114')
    await page.waitForSelector('li.item goWork')
    # Waits until the handler sets the future's result
    resp = await resp_fut
    await browser.close()
    return resp
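
The main function above calls make_interceptor(), whose body does not appear in this answer. A minimal sketch of what it might look like, assuming it returns a future together with a plain (non-async) handler that resolves that future with the URL of the first XHR response whose body parses as JSON. The inner names inspect and interceptor are made up for illustration:

```python
import asyncio
import json


def make_interceptor():
    """Return (future, handler). Must be called from inside a running coroutine."""
    resp_fut = asyncio.get_running_loop().create_future()

    async def inspect(res):
        # Ignore non-XHR traffic and anything arriving after the first match.
        if res.request.resourceType != 'xhr' or resp_fut.done():
            return
        try:
            json.loads(await res.text())
        except ValueError:
            return  # body is not JSON, keep waiting
        if not resp_fut.done():
            resp_fut.set_result(res.request.url)

    def interceptor(res):
        # Plain function, so a synchronous event emitter can call it;
        # the async work is scheduled as a task on the running loop.
        asyncio.ensure_future(inspect(res))

    return resp_fut, interceptor
```

The handler only touches res.request.resourceType, res.request.url and res.text(), which are the same attributes used in the code earlier in this answer.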

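This solution is not the best one, because it hangs indefinitely if the future's result is never set. You may want to look at asyncio.wait_for or, better yet, just use the built-in Page.waitForRequest method ;)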
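
To illustrate the asyncio.wait_for suggestion, here is a small sketch that puts an upper bound on the wait. The helper name await_with_timeout and the choice to return None on timeout are assumptions made for this example, not part of pyppeteer:

```python
import asyncio


async def await_with_timeout(fut, timeout):
    """Await fut for at most `timeout` seconds; return None if it never resolves."""
    try:
        return await asyncio.wait_for(fut, timeout)
    except asyncio.TimeoutError:
        # wait_for has already cancelled fut at this point.
        return None
```

In main, `resp = await resp_fut` could then become `resp = await await_with_timeout(resp_fut, 30)`, so that browser.close() is still reached when no matching response arrives.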