Qt - Get the source code (HTML) of a web page hosted on the Internet

Date: 2014-07-25 23:34:05

Tags: html qt qnetworkaccessmanager qnetworkrequest qnetworkreply

I want to get the source (HTML) of a web page, for example the StackOverflow home page.

This is what I have coded so far:

QNetworkAccessManager manager;
QNetworkReply *response = manager.get(QNetworkRequest(QUrl(url)));

QString html = response->readAll(); // Source should be stored here

But nothing happens! When I read the value of the html string, it is empty ("").

So, what should I do? I am using Qt 5.3.1.

3 answers:

Answer 0 (score: 5)

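QNetworkAccessManager works asynchronously. You call readAll() immediately after get(), but at that point the request has not completed yet, so there is nothing to read. You need to use the QNetworkAccessManager::finished signal, as shown in the documentation, and move the readAll() call into a slot connected to that signal.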

Answer 1 (score: 3)

You need to code it asynchronously. C++11 and Qt to the rescue. Keep in mind that the body of the lambda executes later, from the event loop.

// https://github.com/KubaO/stackoverflown/tree/master/questions/html-get-24965972
#include <QtNetwork>
#include <functional>

void htmlGet(const QUrl &url, const std::function<void(const QString&)> &fun) {
   QScopedPointer<QNetworkAccessManager> manager(new QNetworkAccessManager);
   QNetworkReply *response = manager->get(QNetworkRequest(QUrl(url)));
   QObject::connect(response, &QNetworkReply::finished, [response, fun]{
      response->deleteLater();
      response->manager()->deleteLater();
      if (response->error() != QNetworkReply::NoError) return;
      auto const contentType =
            response->header(QNetworkRequest::ContentTypeHeader).toString();
      static QRegularExpression re("charset=([!-~]+)");
      auto const match = re.match(contentType);
      if (!match.hasMatch() || 0 != match.captured(1).compare("utf-8", Qt::CaseInsensitive)) {
         qWarning() << "Content charsets other than utf-8 are not implemented yet:" << contentType;
         return;
      }
      auto const html = QString::fromUtf8(response->readAll());
      fun(html); // do something with the data
   }) && manager.take();
}

int main(int argc, char *argv[])
{
   QCoreApplication app(argc, argv);
   htmlGet({"http://www.google.com"}, [](const QString &body){ qDebug() << body; qApp->quit(); });
   return app.exec();
}

Unless you use this code only once, you should make the QNetworkAccessManager instance a member of your controller class, or keep it in main, etc.

Answer 2 (score: 2)

You have to add a QEventLoop in between, so that the code waits for the reply to finish before calling readAll().

QNetworkAccessManager manager;
QNetworkReply *response = manager.get(QNetworkRequest(QUrl(url)));
QEventLoop event;
connect(response,SIGNAL(finished()),&event,SLOT(quit()));
event.exec();
QString html = response->readAll(); // Source should be stored here