Fetching a web page with boost asio

Asked: 2017-12-08 02:33:34

Tags: c++ boost boost-asio

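I am trying to build a program that takes a stock ticker, runs a Google search for it, and outputs the data (current price, high, low, percent change, etc.). I am trying to use boost asio, but it does not return any data from the server.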

#include "stdafx.h"
#include <iostream>
#include <istream>
#include <ostream>
#include <string>
#include <boost/asio.hpp>

std::string getStockPage(std::string ticker) {
    boost::asio::ip::tcp::iostream stream;

    stream.connect("www.google.com", "http");
    std::cout << "connected\n";
    stream << "GET /search?q=" << ticker << " HTTP/1.1\r\n";
    stream << "Host: www.google.com\r\n";
    stream << "Cache-Control: no-cache\r\n";
    //stream << "Content-Type: application/x-www-form-urlencoded\r\n\r\n";
    stream << "Connection: close\r\n\r\n";
    std::cout << "sent\n";

    std::ostringstream os;
    //os << stream.rdbuf();
    char buffer[100];
    os << stream.readsome(buffer, 100);
    return std::string(buffer, 100);
}

int main() {
    std::cout << getStockPage("$tsla");
    std::cout << "done\n";
    std::string temp;
    std::getline(std::cin, temp);
    return 0;


}

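I tried reading only the first 100 characters to see whether the problem was in printing the response, but it only outputs null characters. I would like it to output the entire Google page for "www.google.com/search?q=$tsla".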

Any help would be greatly appreciated!

1 answer:

Answer 0 (score: 1)

std::istream::readsome is allowed to always return 0 bytes. It then looks as if you received NUL bytes, because you did

return std::string(buffer, 100);

instead of

return std::string(buffer, stream.gcount());
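The effect can be reproduced without a network. The following is only a minimal sketch: `LazyBuf`, `readWithReadsome`, and `readBlocking` are hypothetical names, and `LazyBuf` is a stand-in for a buffer that has nothing ready yet, not Asio's actual stream buffer. On such a buffer `readsome` extracts nothing, while a blocking `read` followed by `gcount()` returns exactly the bytes that arrived.

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical stand-in for a socket buffer: bytes only appear in the get
// area once underflow() runs, and showmanyc() keeps its default (returns 0).
class LazyBuf : public std::streambuf {
public:
    explicit LazyBuf(std::string data) : data_(std::move(data)) {}

protected:
    int_type underflow() override {
        if (gptr() < egptr())              // data is already buffered
            return traits_type::to_int_type(*gptr());
        if (served_ || data_.empty())      // nothing (more) to deliver
            return traits_type::eof();
        served_ = true;
        setg(&data_[0], &data_[0], &data_[0] + data_.size());
        return traits_type::to_int_type(*gptr());
    }

private:
    std::string data_;
    bool served_ = false;
};

// Mirrors the question's read, but sizes the result with gcount().
// readsome() never blocks, so it only sees what is already buffered.
std::string readWithReadsome(std::istream& in) {
    char buffer[100] = {};
    in.readsome(buffer, sizeof buffer);
    return std::string(buffer, static_cast<std::size_t>(in.gcount()));
}

// Blocking variant: read() waits for data or end of stream, and gcount()
// reports how many bytes were actually stored in the buffer.
std::string readBlocking(std::istream& in) {
    char buffer[100] = {};
    in.read(buffer, sizeof buffer);
    return std::string(buffer, static_cast<std::size_t>(in.gcount()));
}
```

Wrapping a `LazyBuf` holding "hello" in a `std::istream`, `readWithReadsome` yields an empty string and `readBlocking` yields the five bytes. Note that `read` sets `eofbit` and `failbit` when fewer than 100 bytes are available, so `gcount()` is the value to trust.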

Really, just use the other approach

std::ostringstream os;
os << stream.rdbuf();
return os.str();

This worked for me when testing. Note that you may want to add a flush so the buffered request is actually sent:

stream << "Connection: close\r\n\r\n" << std::flush;

The resulting program

#include <boost/asio.hpp>
#include <iostream>
#include <sstream>
#include <string>

std::string getStockPage(std::string const& ticker) {
    boost::asio::ip::tcp::iostream stream;

    stream.connect("www.google.com", "http");
    stream    << "GET /search?q=" << ticker << " HTTP/1.1\r\n";
    stream    << "Host: www.google.com\r\n";
    stream    << "Cache-Control: no-cache\r\n";
    // stream << "Content-Type: application/x-www-form-urlencoded\r\n\r\n";
    stream    << "Connection: close\r\n\r\n" << std::flush;

    std::ostringstream os;
    os << stream.rdbuf();
    return os.str();
}

int main() {
    std::cout << getStockPage("$tsla");
}

prints

HTTP/1.1 302 Found
Location: http://www.google.nl/search?q=%24tsla&gws_rd=cr&dcr=0&ei=3EMqWrKxCILUwAKv9LqICg
Cache-Control: private
Content-Type: text/html; charset=UTF-8
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Date: Fri, 08 Dec 2017 07:48:44 GMT
Server: gws
Content-Length: 288
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Set-Cookie: NID=118=MsVZZpoZFEz4mQDqDuuWFRViB8v8yEQju7FPdOw8Rr7ViQ1cJtF6ZeN9u-dSRhGMT4x8F8yDilk9FqsoTkO8IsoQX-YvHXRcCoHcOLk0p4VOTn8AZoldKeh84Ryl0bM0; expires=Sat, 09-Jun-2018 07:48:44 GMT; path=/; domain=.google.com; HttpOnly
Connection: close

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.nl/search?q=%24tsla&amp;gws_rd=cr&amp;dcr=0&amp;ei=3EMqWrKxCILUwAKv9LqICg">here</A>.
</BODY></HTML>
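
The response above is a 302 redirect rather than the search page, so a client that wants the page itself would have to read the status code and the Location header and issue a second request. A minimal sketch of that parsing step follows. It assumes the whole response is already in a string, as returned by getStockPage; `statusCode` and `headerValue` are hypothetical helper names, not part of Boost.Asio.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: extract the numeric status code from the status line,
// e.g. "HTTP/1.1 302 Found" -> 302. Returns -1 if the line is malformed.
int statusCode(const std::string& response) {
    std::istringstream in(response);
    std::string httpVersion;
    int code = -1;
    if (!(in >> httpVersion >> code) || httpVersion.compare(0, 5, "HTTP/") != 0)
        return -1;
    return code;
}

// Hypothetical helper: return the value of the first header whose name
// matches `name` case-insensitively, or an empty string if it is absent.
std::string headerValue(const std::string& response, const std::string& name) {
    std::istringstream in(response);
    std::string line;
    std::getline(in, line);                       // skip the status line
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') // strip the CR of CRLF
            line.pop_back();
        if (line.empty())                         // blank line ends the headers
            break;
        const std::size_t colon = line.find(':');
        if (colon == std::string::npos || colon != name.size())
            continue;
        bool same = true;
        for (std::size_t i = 0; i < colon && same; ++i)
            same = std::tolower(static_cast<unsigned char>(line[i])) ==
                   std::tolower(static_cast<unsigned char>(name[i]));
        if (!same)
            continue;
        const std::size_t start = line.find_first_not_of(" \t", colon + 1);
        return start == std::string::npos ? std::string() : line.substr(start);
    }
    return std::string();
}
```

With the output shown above, `statusCode` would give 302 and `headerValue(response, "Location")` would give the google.nl URL. The sketch ignores folded headers and repeated header names, and following the redirect may need more than a plain `tcp::iostream` if the target only serves HTTPS.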