How do I split binary/POST data/HTML from an HTTP response?

Asked: 2012-03-23 18:18:45

Tags: c http-headers

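I'm writing a simple function that splits the binary/POST data/HTML out of an HTTP response. The HTTP headers are terminated by \r\n\r\n, and everything after that is the message body.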

I wrote this:

    #define MAX_BUFFER_SIZE 256
    //...
    int size = 0;
    int buf_size = MAX_BUFFER_SIZE;
    char * headers = malloc(MAX_BUFFER_SIZE);
    char * newbuf;

    while(httpresponse[size]) {
        if(httpresponse[size]     == '\r' &&
           httpresponse[size + 1] == '\n' &&
           httpresponse[size + 2] == '\r' &&
           httpresponse[size + 3] == '\n') {
            break;
        }

        headers[size] = httpresponse[size];

        if(size >= buf_size) {
            buf_size += MAX_BUFFER_SIZE;
            newbuf = realloc(headers, buf_size);

            if(NULL == newbuf) exit(1);
            headers = newbuf;
        }

        size++;
    }

    printf("%s\n", headers);

The httpresponse variable holds a value like this:

HTTP/1.1 200 OK
Date: Fri, 23 Mar 2012 15:28:17 GMT
Expires: Sat, 23 Mar 2013 15:28:17 GMT
Cache-Control: public, max-age=31536000
Last-Modified: Thu, 14 Apr 2011 15:46:35 GMT
Content-Type: image/jpeg
Content-Length: 12745
X-XSS-Protection: 1; mode=block
Connection: close

���������I1��} �g������'�B�f�p���ohd]sft�����J�������1����瘿ٱ����$3�G�8��4=�E�i����ܼG����H��nbi�"�1��b[Ǘl��++���OPt�W��>�����i�]t�QT�N/,Q�Qz������0�`    N7���M��f��S�Š�x9k��$*

//more binary... 

But the C program above prints the following text:

HTTP/1.1 200 OK
Date: Fri, 23 Mar 2012 17:12:09 GMT
Expires: Sat, 23 Mar 2013 17:12:09 GMT
Last-Modified: Thu, 14 Apr 2011 15:46:35 GMT
Content-Type: image/jpeg
Content-Length: 12745
X-XSS-Protection: 1; mode=block
Cache-Control: public, max-age=31536000
Age: 3746
�2�/���ms���|ނ����LQr2K3�v��J.�,�z��^Oy����s(ct���X`iA����I����U�{

instead of:

    HTTP/1.1 200 OK
    Date: Fri, 23 Mar 2012 15:28:17 GMT
    Expires: Sat, 23 Mar 2013 15:28:17 GMT
    Cache-Control: public, max-age=31536000
    Last-Modified: Thu, 14 Apr 2011 15:46:35 GMT
    Content-Type: image/jpeg
    Content-Length: 12745
    X-XSS-Protection: 1; mode=block
    Connection: close

How can I fix this? Thanks in advance.

3 Answers:

Answer 0: (score: 1)

I use the following code:

const QByteArray& data = socket->readAll();
int index = data.indexOf("\r\n\r\n");
QString sHeader;
QString sBody;
if (index < 0)
    sHeader = QString::fromUtf8(data);
else
{
    sHeader = QString::fromUtf8(data.left(index));
    sBody = QString::fromUtf8(data.mid(index + 4));
}

QIssHttpRequestHeader requestHeader(sHeader); // QIssHttpRequestHeader is a copy of QHttpRequestHeader from Qt4

if (requestHeader.method() != "GET")
{
    send501Error(socket);
    return;
}

QUrl url(requestHeader.path());
...
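
For a plain C program like the one in the question, the same idea (find the first \r\n\r\n, then take everything after it) can be sketched without Qt. This is only a minimal sketch under assumptions: find_header_end and split_response are hypothetical names, not part of any library, and the whole response is assumed to be in memory already with its length known. Passing the length explicitly matters for binary bodies such as the JPEG in the question, because they may contain 0 bytes that would cut a C string short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the answer above): returns the offset of
 * the first "\r\n\r\n" in buf[0..len), or -1 if it is not present.
 * The length is passed explicitly, so 0 bytes in the body are harmless. */
static long find_header_end(const char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return (long)i;
    }
    return -1;
}

/* Hypothetical helper: points *body at the first byte after the blank
 * line and stores the number of remaining bytes in *body_len.
 * Returns 0 on success, -1 if the header terminator was not found. */
static int split_response(const char *buf, size_t len,
                          const char **body, size_t *body_len)
{
    long idx = find_header_end(buf, len);

    if (idx < 0)
        return -1;
    *body = buf + idx + 4;                 /* skip "\r\n\r\n" */
    *body_len = len - (size_t)idx - 4;
    return 0;
}
```

With the response in buf and its length in len, split_response(buf, len, &body, &body_len) leaves body pointing at the image bytes. Those bytes are better written with fwrite(body, 1, body_len, fp), where fp is an open FILE *, than printed with printf("%s"), which stops at the first 0 byte.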

Answer 1: (score: 0)

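I think your code is fine, except that malloc does not initialize the memory it allocates. That means that when you break out of the while loop, the "headers" string you captured has no null byte terminating it.

Two ways to fix it: either use "calloc" to initialize the memory to binary zeros, or insert one line of code before the call to printf: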

    headers[size] = '\0';
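
Putting that line into the question's loop could look like the sketch below. It is not the asker's final code: copy_headers is a hypothetical wrapper name, and the sketch also changes two things the answer does not mention. First, the original loop writes headers[size] before checking the capacity, so it writes one byte past the buffer once size reaches buf_size; here the buffer grows before the write and always keeps one byte free for the terminator. Second, calloc alone would not be enough once the buffer grows, because realloc does not zero the bytes it adds, so the explicit terminator is used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_BUFFER_SIZE 256

/* Hypothetical wrapper around the loop from the question: returns a
 * malloc'd, NUL-terminated copy of the header block (without the final
 * "\r\n\r\n"), or NULL if allocation fails. The caller frees the result. */
static char *copy_headers(const char *httpresponse)
{
    size_t size = 0;
    size_t buf_size = MAX_BUFFER_SIZE;
    char *headers = malloc(buf_size);
    char *newbuf;

    assert(httpresponse != NULL);
    if (headers == NULL)
        return NULL;

    while (httpresponse[size]) {
        /* strncmp stops at a NUL, so it never reads past the end. */
        if (strncmp(httpresponse + size, "\r\n\r\n", 4) == 0)
            break;

        /* Grow before writing, keeping one byte free for the terminator. */
        if (size + 1 >= buf_size) {
            buf_size += MAX_BUFFER_SIZE;
            newbuf = realloc(headers, buf_size);
            if (newbuf == NULL) {
                free(headers);
                return NULL;
            }
            headers = newbuf;
        }

        headers[size] = httpresponse[size];
        size++;
    }

    headers[size] = '\0';   /* the line suggested in the answer above */
    return headers;
}
```

The returned string can then be passed to printf("%s\n", ...) safely and released with free when it is no longer needed.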

Cheers

Answer 2: (score: 0)

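Are you sure about the output? You say that httpresponse, in this case, ends with "X-XSS-Protection: 1; mode=block" followed by "Connection: close", but your output ends with "X-XSS-Protection: 1; mode=block" followed by "Cache-Control: public, max-age=31536000" ?? !!?

If you are sure you copied the output correctly, then it looks like your result is incorrect.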