How to write all <p> tags to text files

Time: 2013-07-13 17:28:39

Tags: c++ qt qt-creator

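I need to write Qt/C++ code that extracts all the p tags from an HTML page and writes each one to its own .txt file. For example, given the following HTML page:

        <!DOCTYPE html>
        <html>
        <body>

        <h1>My First Heading</h1>

        <p>My first paragraph.</p>
        <p>My second paragraph.</p>

        </body>
        </html>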

I need the code to create two .txt files: the first containing "My first paragraph." and the second containing "My second paragraph."

My question is how to parse the HTML and get the text between the tags. Here is my code so far:

        int main(int argc, char *argv[])
        {
            QCoreApplication a(argc, argv);

            QEventLoop loop;

            QNetworkRequest request;
            request.setUrl(QUrl("http://en.wikipedia.org/wiki/Cars"));
            QNetworkAccessManager* networkMgr = new QNetworkAccessManager();
            QNetworkReply* reply = networkMgr->get(request);

            QObject::connect(reply, SIGNAL(finished()), &loop, SLOT(quit()));

            loop.exec();

            QFile file("/Users/David/Desktop/text123.txt");
            file.open(QIODevice::WriteOnly);
            file.write(reply->readAll());

            delete reply;

            return a.exec();
        }

Thank you very much for your help.

1 Answer:

Answer 0 (score: 1)

You can use QRegularExpression; see the example below.

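        // `reply` is the finished QNetworkReply from the question; the page is assumed to be UTF-8.
        const QString txt = QString::fromUtf8(reply->readAll());
        // (.*?) is non-greedy, so each <p>...</p> pair is matched on its own;
        // [^>]* allows attributes, and \b keeps tags such as <pre> from matching.
        QRegularExpression regex("<p\\b[^>]*>(.*?)</p\\s*>",
                                 QRegularExpression::CaseInsensitiveOption
                                 | QRegularExpression::DotMatchesEverythingOption);
        QRegularExpressionMatchIterator it = regex.globalMatch(txt);
        int i = 0;
        while (it.hasNext())
        {
            QRegularExpressionMatch match = it.next();
            // One output file per paragraph: file0.txt, file1.txt, ...
            QString filename = QString("e:/folder/file%1.txt").arg(i);
            QFile file(filename);
            if (file.open(QIODevice::WriteOnly))
            {
                file.write(match.captured(1).toUtf8());
                file.close();
            }
            ++i;
        }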
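
For experimenting with the pattern outside a Qt project, the same idea can be sketched with `std::regex` from the C++ standard library. This is a swapped-in technique, not the QRegularExpression code from the answer, and the helper names `extractParagraphs`, `stripTags`, and `writeParagraphs` are hypothetical. It assumes the paragraphs are not nested and that a regex approximation is acceptable; for large or messy pages a real HTML parser is likely a better fit.

```cpp
#include <cstddef>
#include <fstream>
#include <regex>
#include <string>
#include <vector>

// Returns the inner text of every <p>...</p> element found in html.
// Regex-based, so it only approximates real HTML parsing.
std::vector<std::string> extractParagraphs(const std::string& html)
{
    // <p\b[^>]*>  opening tag with optional attributes (\b keeps <pre> out)
    // ([\s\S]*?)  shortest possible body, newlines included
    // </p\s*>     closing tag
    static const std::regex paragraph(R"(<p\b[^>]*>([\s\S]*?)</p\s*>)",
                                      std::regex::ECMAScript | std::regex::icase);
    std::vector<std::string> result;
    for (std::sregex_iterator it(html.begin(), html.end(), paragraph), end; it != end; ++it)
        result.push_back((*it)[1].str());
    return result;
}

// Removes remaining inline tags such as <b> or <a href="..."> from text.
std::string stripTags(const std::string& text)
{
    static const std::regex tag(R"(<[^>]*>)");
    return std::regex_replace(text, tag, "");
}

// Writes each paragraph to its own file: <prefix>0.txt, <prefix>1.txt, ...
// Returns the number of files that could be opened and written.
std::size_t writeParagraphs(const std::vector<std::string>& paragraphs,
                            const std::string& prefix)
{
    std::size_t written = 0;
    for (std::size_t i = 0; i < paragraphs.size(); ++i) {
        std::ofstream out(prefix + std::to_string(i) + ".txt", std::ios::binary);
        if (!out)
            continue; // skip files that cannot be opened
        out << paragraphs[i];
        if (out)
            ++written;
    }
    return written;
}
```

Calling `writeParagraphs(extractParagraphs(html), "file")` on the sample page from the question would create `file0.txt` and `file1.txt` in the working directory, provided both files can be opened. Pages such as the Wikipedia article usually have links and formatting inside their paragraphs, so passing each captured string through `stripTags` first may be closer to "the text between the tags".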