C sockets: recv(...) is not returning the correct bytes

Time: 2011-12-06 10:44:54

Tags: c http sockets

If I connect with telnet www.xlhi.com 80 and send the following GET request:

GET http://www.xlhi.com/ HTTP/1.1
Host: www.xlhi.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Proxy-Connection: keep-alive
Cookie: CG=IE:04:Cork
Cache-Control: max-age=0

I receive the following response:

HTTP/1.1 200 OK
Date: Tue, 06 Dec 2011 10:35:08 GMT
Server: Apache/2.2.14 (Ubuntu)
X-Powered-By: PHP/5.3.2-1ubuntu4.9
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 48
Content-Type: text/html

��(�ͱ���I�O����H�����ч��
                          �4�@�

All of this is fine and as expected. What I'm interested in is the gzipped binary data that is returned ("hello").

Now, I have this C function that sends a GET request to a server (in this case www.xlhi.com):

char* applyGetReq(char* url,char* data,int len){
        int sockfd, numbytes;
        struct addrinfo hints, *servinfo, *p;
        int rv;
        char s[INET6_ADDRSTRLEN];

        memset(&hints, 0, sizeof hints);
        hints.ai_family = AF_UNSPEC;
        hints.ai_socktype = SOCK_STREAM;
        printf("Server name: %s\n\n",url);
        if ((rv = getaddrinfo(url,"80", &hints, &servinfo)) != 0) {
                fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rv));
                exit(1);
        }

        // loop through all the results and connect to the first we can
        for(p = servinfo; p != NULL; p = p->ai_next) {
                if ((sockfd = socket(p->ai_family, p->ai_socktype,p->ai_protocol)) == -1) {
                        perror("client: socket");
                        continue;
                }
                if (connect(sockfd, p->ai_addr, p->ai_addrlen) == -1) {
                        close(sockfd);
                        perror("client: connect");
                        continue;
                }
                break;
        }

        if (p == NULL) {
                fprintf(stderr, "client: failed to connect\n");
                exit(1);
        }

        inet_ntop(p->ai_family, get_in_addr((struct sockaddr *)p->ai_addr),s, sizeof s);
        //printf("client: connecting to %s\n", s);

        sendall(sockfd,data,&len);

        freeaddrinfo(servinfo); // all done with this structure

        char* buf=malloc(MAXDATASIZE*sizeof(char));
        if ((numbytes = recv(sockfd, buf, MAXDATASIZE-1, 0)) == -1) {
                perror("recv");
                exit(1);
        }
        //printf("numbytes:%d\n",numbytes);
        buf[numbytes] = '\0';
        close(sockfd);
        return buf;
}

Now, when I call this function and print the result:

    ...
    int len = strlen(data);   //data is a char[] and contains the exact same GET request as mentioned above
    char* buf=NULL;
    buf=applyGetReq(stripped_url,data,len);
    printf("%s\n",buf);

I get the following response from the server:

HTTP/1.1 200 OK
Date: Tue, 06 Dec 2011 10:03:13 GMT
Server: Apache/2.2.14 (Ubuntu)
X-Powered-By: PHP/5.3.2-1ubuntu4.9
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 48
Content-Type: text/html

�

As you can see, for some reason I can't explain, the page content (the binary data) is cut short. I should be getting:

��(�ͱ���I�O����H�����ч��
                              �4�@�

I've been looking at this for two hours now and can't get to the bottom of it, so I thought I'd ask the community.

1 answer:

Answer 0: (score: 4)

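This is how printf works. With %s it treats buf as a NUL-terminated string and stops at the first NUL (0) byte, and gzip-compressed data usually contains such bytes. recv most likely did return the data; only the printing cuts it short. Use a function that takes an explicit length instead. Note that numbytes is local to applyGetReq, so the function has to pass the byte count back to the caller for this to work: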

fwrite(buf, 1, numbytes, stdout);