Question

I am working on a project that needs to scrape the IEEE website, and I am using BeautifulSoup to do it. Here is my code:

import bs4 as bs
import sys
from PyQt5.QtWidgets import QApplication
from PyQt5.QtCore import QUrl
from PyQt5.QtWebKitWidgets import QWebPage

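class Client(QWebPage):
    def __init__(self, url):
        # A QApplication must exist before any QWebPage is created.
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        # Quit the event loop once the page reports that loading finished.
        self.loadFinished.connect(self.on_page_load)
        self.mainFrame().load(QUrl(url))
        # Blocks here until on_page_load() calls quit().
        self.app.exec_()

    def on_page_load(self):
        self.app.quit()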

url = 'https://ieeexplore.ieee.org/Xplore/home.jsp'
client_response = Client(url)
source = client_response.mainFrame().toHtml()  # capture the HTML as rendered by the browser
soup = bs.BeautifulSoup(source, 'lxml')
# find_all() returns every <div>; find() would return only the first one.
for e in soup.find_all('div'):
    print(e.text)
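
A note on the parsing step: in BeautifulSoup, `find` returns only the first matching tag, and iterating over a tag yields its direct children, whereas `find_all` returns every match. A minimal sketch of the difference follows; the HTML string and variable names are made up for illustration, and it assumes the built-in `html.parser` so that it does not depend on lxml:

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only to illustrate the two calls.
html = "<div><p>first</p><p>second</p></div><div><p>third</p></div>"
soup = BeautifulSoup(html, "html.parser")

# find() returns the first <div>; iterating over it yields its children.
first_div_children = [child.get_text() for child in soup.find("div")]

# find_all() returns every <div> in the document.
all_div_texts = [div.get_text() for div in soup.find_all("div")]

print(first_div_children)  # ['first', 'second']
print(all_div_texts)       # ['firstsecond', 'third']
```

This only concerns how the parsed HTML is walked; it does not touch the page-loading part of the script.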

However, after I run this code, the console shows that it is running but never prints anything (it is still running as I write this).

Can you help me solve this problem?

Thanks in advance.

How to scrape IEEE

0 answers: