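I am currently trying to write C++ code on a Mac that downloads a large file (~1 GB) from a website. I think I have a bug where I convert the socket buffer to a string: my resulting file (a movie file) has small blocks of NUL characters scattered throughout it, and I need to somehow remove them from the string obtained from the socket buffer.
This is the part that handles the HTTP connection and the part that saves the data to a file. Some pieces may be missing from this example, such as error handling or the complete socket setup.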
//I have error handling in here but stripped out from this example
char buffer[512];
portno = atoi("8080");
sockfd = socket(AF_INET, SOCK_STREAM, 0);
server = gethostbyname(address);
bzero((char *) &serv_addr, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
bcopy((char *)server->h_addr,
      (char *)&serv_addr.sin_addr.s_addr,
      server->h_length);
serv_addr.sin_port = htons(portno);
bzero(buffer,512);
header.copy(buffer,512);
n = write(sockfd,buffer,strlen(buffer));
std::string str_buff;
while((n = read(sockfd,buffer,511)) > 0){
    std::string temp(buffer,511);
    //Is this the error^^^^^^^^^?
    write_chunk_to_file(temp);
}
//cut
void write_chunk_to_file(std::string chunk){
    write.open(path+fname, std::ios::out | std::ios::app);
    write << remove_header(chunk);
    write.close();
}
//cut
std::string remove_header(std::string chunk){
    if(chunk.find("")){
        chunk = chunk.substr(chunk.find(""),chunk.length());
    }
    return chunk;
}
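When I compare the file my code downloads with the file wget downloads, mine contains small blocks consisting only of NUL characters, and it also contains some extra bytes.
Does anyone have a clue?
Answer 0 (score: 0)
Yes, the line you pointed out is the error: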
std::string temp(buffer,511);
//Is this the error^^^^^^^^^?
read() returns the number of bytes actually read into the buffer. You need to take that into account:
std::string temp(buffer,n);
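To make the difference concrete, here is a minimal sketch that simulates a short read without a socket. The function names `short_read_wrong` and `short_read_right` are made up for this illustration, and the "pretend read() returned 3" step is an assumption standing in for a real call.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Simulates a read() that delivered only 3 fresh bytes into a 512-byte
// buffer. The rest of the buffer holds stale content (zeros here).
static void fake_short_read(char (&buffer)[512], std::size_t& n) {
    std::memset(buffer, 0, sizeof buffer);
    std::memcpy(buffer, "abc", 3);
    n = 3;  // what read() would have returned
}

// The version from the question: always copies 511 bytes.
std::string short_read_wrong() {
    char buffer[512];
    std::size_t n = 0;
    fake_short_read(buffer, n);
    return std::string(buffer, 511);  // 3 real bytes + 508 stale NULs
}

// The corrected version: copies exactly the n bytes that were read.
std::string short_read_right() {
    char buffer[512];
    std::size_t n = 0;
    fake_short_read(buffer, n);
    return std::string(buffer, n);
}
```

With a real socket the stale part of the buffer is usually left over from the previous read rather than zeros, which would explain both the NUL blocks and the extra bytes in the downloaded file.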
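Also, you are reading raw data, so remove_header() does not belong in write_chunk_to_file(). A single buffer can contain parts of several headers and/or part of the body. You need to implement a proper HTTP parser so that you can detect where each header ends, where the body begins, where the body ends, and how the body is encoded. Then you can write only the body data to your file.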
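As a rough sketch of the first part of that advice, the class below accumulates incoming bytes until the blank line that ends the header block has been seen, and only then starts handing back body bytes. `HeaderSkipper` and its member names are hypothetical, and the sketch assumes a single response whose body is not chunked; it is not a full HTTP parser.

```cpp
#include <string>

// Hypothetical helper: feed it every chunk received from the socket and it
// returns only the bytes that belong to the message body.
class HeaderSkipper {
public:
    std::string feed(const std::string& data) {
        if (headers_done_) {
            return data;  // already past the headers: everything is body
        }
        pending_ += data;
        // The header block ends at the first empty line (CRLF CRLF). It may
        // be split across reads, which is why we search the accumulated data.
        const std::string::size_type end = pending_.find("\r\n\r\n");
        if (end == std::string::npos) {
            return std::string();  // headers incomplete, nothing to write yet
        }
        headers_done_ = true;
        headers_ = pending_.substr(0, end);
        std::string body = pending_.substr(end + 4);
        pending_.clear();
        return body;
    }

    // Raw header text, valid once the end of the headers has been seen.
    const std::string& headers() const { return headers_; }

private:
    bool headers_done_ = false;
    std::string pending_;
    std::string headers_;
};
```

In the read loop this would be used as `out << skipper.feed(std::string(buffer, n));`, with one `skipper` object living for the whole response rather than being recreated per chunk.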
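This code does not even read the HTTP response correctly. You need to implement logic more like this (I leave it as an exercise for you to implement in C++):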
send request

while true:
    read line
    if not successful:
        throw error
    if line is blank:
        break while loop
    add line to headers list

parse headers list

if response can contain message body:
    if HTTP version is 1.1+, and Transfer-Encoding header is present and not "identity":
        while true:
            read line, extract delimited ASCII hexadecimal for the chunk size
            if not successful:
                throw error
            if chunk size is 0:
                break while loop
            read chunk size number of bytes
        while true:
            read line
            if not successful:
                throw error
            if line is blank:
                break while loop
            add line to headers list, replace existing header if needed
        parse headers list again
    else if Content-Length header is specified:
        read Content-Length number of bytes
    else if Content-Type header is "multipart/byteranges":
        read and parse MIME-encoded chunks until terminating MIME boundary is reached
    else:
        read until connection is closed
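The chunked branch of the pseudocode above could look roughly like the following in C++. This is only a sketch: `decode_chunked` is a made-up name, it assumes the whole chunked body is already in memory (a 1 GB download would need a streaming variant), and it stops at the zero-size chunk without parsing trailer headers. It also consumes the CRLF that follows each chunk's data, which the chunked format requires.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes a complete "Transfer-Encoding: chunked" body.
std::string decode_chunked(const std::string& raw) {
    std::string body;
    std::size_t pos = 0;
    while (true) {
        // Each chunk starts with a line holding its size in hexadecimal.
        const std::size_t eol = raw.find("\r\n", pos);
        if (eol == std::string::npos) {
            throw std::runtime_error("missing chunk-size line");
        }
        // std::stoul stops at the first non-hex character, so an optional
        // chunk extension (";name=value") after the size is ignored.
        const std::size_t size =
            std::stoul(raw.substr(pos, eol - pos), nullptr, 16);
        pos = eol + 2;
        if (size == 0) {
            break;  // last chunk; trailers (if any) are not parsed here
        }
        const std::size_t remaining = raw.size() - pos;
        if (size > remaining || remaining - size < 2) {
            throw std::runtime_error("truncated chunk");
        }
        if (raw.compare(pos + size, 2, "\r\n") != 0) {
            throw std::runtime_error("chunk data not followed by CRLF");
        }
        body.append(raw, pos, size);  // binary-safe: NUL bytes are kept
        pos += size + 2;              // skip the data and its trailing CRLF
    }
    return body;
}
```

For example, `decode_chunked("5\r\nhello\r\n0\r\n\r\n")` yields the five bytes `hello`.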
Answer 1 (score: 0)
OK, changing the following line fixed it:
std::string temp(buffer,511);
//changed to:
std::string temp(buffer,n);
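By copying 511 bytes every time I really was getting "more" than I should; I only need to copy the n bytes that read() actually read from the socket. Thanks for the hints, guys :D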