The server returns the HTTP headers followed by a binary file, something like this:
HTTP/1.1 200 OK
Date: Thu, 28 Jun 2012 22:11:14 GMT
Server: Apache/2.2.3 (Red Hat)
Set-Cookie: JSESSIONID=blabla; Path=/
Pragma: no-cache
Cache-Control: must-revalidate, no-store
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-disposition: inline; filename="foo.pdf"
Content-Length: 6231119
Connection: close
Content-Type: application/pdf
%PDF-1.6
%âãÏÓ
5989 0 obj
<</Linearized 1/L 6231119/O 5992/E 371504/N 1498/T 6111290/H [ 55176 6052]>>
endobj
xref
5989 2744
0000000016 00000 n
0000061228 00000 n
0000061378 00000 n
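I only want to copy the binary file. But how do I know where the header section ends? I tried checking whether the buffer contains \r\n\r\n, but it looks like that rule applies only to client requests, not to server responses. This is what I get: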
Content-disposition: inline; filename="foo.pdf"
Content-Length: 6231119
Connection: close
Content-Type: application/pdf
%PDF-1.6
%âãÏÓ
5989 0 obj
<</Linearized 1/L 6231119/O 5992/E 371504/N 1498/T 6111290/H [ 55176 6052]>>
endobj
xref
5989 2744
0000000016 00000 n
Here is the C code:
while((readed = recv(sock, buffer, 128, 0)) > 0) {
    if(isnheader == 0 && strstr(buffer, "\r\n\r\n") != NULL)
        isnheader = 1;
    if(isnheader)
        fwrite(buffer, 1, readed, fp);
}
Update
I put a continue statement inside my if statement:
if(isnheader == 0 && strstr(buffer, "\r\n\r\n") != NULL) {
    isnheader = 1;
    continue;
}
Well, it works as expected. But as @Alnitak mentioned, it is not safe.
Answer 0 (score: 17)
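The headers and the body should be separated by \r\n\r\n (RFC 2616, section 4.1).
However, some servers may omit the \r and send only \n as the line ending, particularly if they fail to sanitize CGI-supplied headers to make sure they contain \r.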
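To illustrate the point about tolerating bare \n line endings, here is a minimal sketch of a search that accepts both forms. The function name find_header_end is made up for this example, and it assumes the whole header block is already in one buffer:

```c
#include <stddef.h>

/* Hypothetical helper: returns the offset of the first body byte in
 * buf[0..len), or 0 if no blank line (end of headers) was found.
 * Accepts "\r\n\r\n", "\n\n" and the mixed form "\n\r\n". */
size_t find_header_end(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != '\n')
            continue;
        /* "\n\n": the next line is empty */
        if (i + 1 < len && buf[i + 1] == '\n')
            return i + 2;
        /* "\n\r\n": the next line holds only a carriage return */
        if (i + 2 < len && buf[i + 1] == '\r' && buf[i + 2] == '\n')
            return i + 3;
    }
    return 0;
}
```

Because it works on a pointer and a length instead of a NUL-terminated string, it does not depend on the buffer being terminated, which the strstr call in the question silently assumes.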
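You also need to think about how your reads are chunked: the separator may straddle the boundary between two of your 128-byte blocks, in which case the strstr call will never find it.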
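One possible way to cope with a separator that is split across reads is to carry a small amount of state from one chunk to the next instead of searching each chunk on its own. The sketch below is only an illustration; struct header_scanner and scan_chunk are names invented here, not part of any library:

```c
#include <stddef.h>

/* Hypothetical scanner state, kept between calls:
 * 0 = inside a line, 1 = just saw '\n', 2 = saw '\n' followed by '\r'. */
struct header_scanner {
    int state;
};

/* Feeds one chunk to the scanner. Returns the offset inside this chunk
 * at which the body starts, or -1 if the headers have not ended yet.
 * A return value equal to n means the body starts with the next chunk. */
long scan_chunk(struct header_scanner *sc, const char *chunk, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        char c = chunk[i];
        if (c == '\n') {
            /* a second line ending right after the first: blank line */
            if (sc->state == 1 || sc->state == 2)
                return (long)(i + 1);
            sc->state = 1;
        } else if (c == '\r' && sc->state == 1) {
            sc->state = 2;
        } else {
            sc->state = 0;
        }
    }
    return -1;
}
```

In the recv loop from the question, you would call scan_chunk on each block until it returns a non-negative offset off, write the n - off bytes starting at chunk + off, and from then on write every block in full.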
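Answer 1 (score: 2)
You are not parsing the input correctly. Here are some of the things you are doing wrong:
I wrote a quick function that should find the end of the HTTP headers and write the rest of the server's response to a FILE stream: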
#include <stdio.h>
#include <string.h>
#include <unistd.h>

void parse_http_headers(int s, FILE * fp)
{
   int       isnheader;
   ssize_t   readed;
   size_t    len;
   size_t    offset;
   char      buffer[1024];
   char    * eol; // end of line
   char    * bol; // beginning of line

   isnheader = 0;
   len       = 0;

   // read next chunk from socket, appending to any unprocessed data
   while((readed = read(s, &buffer[len], (sizeof(buffer)-len))) > 0)
   {
      // headers are done: write the data straight to the FILE stream
      if (isnheader != 0)
      {
         fwrite(buffer, 1, readed, fp);
         continue;
      }

      // calculate combined length of unprocessed data and new data
      len += readed;

      // process each complete line in buffer looking for header break
      bol = buffer;
      while((eol = memchr(bol, '\n', len - (size_t)(bol - buffer))) != NULL)
      {
         // an empty line ("\n" or "\r\n") marks the end of the headers
         if ((eol == bol) || ((eol == bol + 1) && (bol[0] == '\r')))
            isnheader = 1;

         // move bol to the beginning of the next line
         bol = eol + 1;

         if (isnheader != 0)
            break;
      }

      // calculate the amount of data remaining after the processed lines
      offset = (size_t)(bol - buffer);
      len   -= offset;

      if (isnheader != 0)
      {
         // write body data that arrived together with the headers
         if (len > 0)
            fwrite(bol, 1, len, fp);

         // nothing is left over; later reads start at the buffer's beginning
         len = 0;
         continue;
      }

      // give up if a single header line does not fit in the buffer
      if (len == sizeof(buffer))
         return;

      // shift the incomplete line to the beginning of the buffer
      memmove(buffer, bol, len);
   }

   return;
}
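I have not tested the code, so it may contain simple errors, but it should point you in the right direction for parsing string data from a buffer in C.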