Streaming HTTP with GZIP being buffered by StreamReader?

Date: 2013-02-07 20:22:58

Tags: c# http httpwebrequest gzip http-streaming

I'm struggling to find anyone who has run into this or a similar problem.

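I'm currently consuming an HTTP (JSON) stream that requires GZip, and I'm seeing a delay between the moment the data is sent and the moment reader.ReadLine() returns it. It has been suggested to me that the decompression may be holding data back in a buffer?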

This is what I have at the moment. It works fine apart from the delay.

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(endPoint);
request.Method = "GET";

request.PreAuthenticate = true;
request.Credentials = new NetworkCredential(username, password);

request.AutomaticDecompression = DecompressionMethods.GZip;
request.ContentType = "application/json";
request.Accept = "application/json";
request.Timeout = 30;
request.BeginGetResponse(AsyncCallback, request);

Then in the AsyncCallback method I have:

HttpWebRequest request = result.AsyncState as HttpWebRequest;

using (HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(result))
using (Stream stream = response.GetResponseStream())
using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
{

    while (!reader.EndOfStream)
    {
        string line = reader.ReadLine();
        if (string.IsNullOrWhiteSpace(line)) continue;

        Console.WriteLine(line);
    }
}

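It just sits on reader.ReadLine() until more data is received, and even then it holds some of that data back. Keep-alive newlines are received as well, and when it does decide to read something they are often all read out at once.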

I have tested the stream by running a curl command side by side; curl receives and decompresses the data perfectly fine.

Any insights would be terrific. Thanks,

EDIT: No luck using the buffer size parameter on the StreamReader either.

new StreamReader(stream, Encoding.UTF8, true, 1)

EDIT: Also had no luck after updating to .NET 4.5 and using

request.AllowReadStreamBuffering = false;

3 answers:

Answer 0 (score: 5)

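Update: this appears to have issues over longer periods at higher volume rates, and should only be used on low-volume streams where the buffering affects the application's functionality. I have since switched back to a StreamReader.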

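So this is what I ended up coming up with. It works without the delay, and the data is not buffered the way it is with the automatic GZip decompression.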

using (HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(result))
using (Stream stream = response.GetResponseStream())
using (MemoryStream memory = new MemoryStream())
using (GZipStream gzip = new GZipStream(memory, CompressionMode.Decompress))
{
    byte[] compressedBuffer = new byte[8192];
    byte[] uncompressedBuffer = new byte[8192];
    List<byte> output = new List<byte>();

    while (stream.CanRead)
    {
        int readCount = stream.Read(compressedBuffer, 0, compressedBuffer.Length);
        if (readCount == 0) break; // Read returns 0 at end of stream; CanRead stays true

        memory.Write(compressedBuffer.Take(readCount).ToArray(), 0, readCount);
        memory.Position = 0;

        int uncompressedLength = gzip.Read(uncompressedBuffer, 0, uncompressedBuffer.Length);

        output.AddRange(uncompressedBuffer.Take(uncompressedLength));

        if (!output.Contains(0x0A)) continue;

        byte[] bytesToDecode = output.Take(output.LastIndexOf(0x0A) + 1).ToArray();
        string outputString = Encoding.UTF8.GetString(bytesToDecode);
        output.RemoveRange(0, bytesToDecode.Length);

        string[] lines = outputString.Split(new[] { Environment.NewLine }, new StringSplitOptions());
        for (int i = 0; i < (lines.Length - 1); i++)
        {
            Console.WriteLine(lines[i]);
        }

        memory.SetLength(0);
    }
}

Answer 1 (score: 1)

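There may be something to the Delayed ACK that C.Evenhuis discusses, but I have a weird gut feeling that it is the StreamReader that is causing you headaches... you might try something like this: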

public void AsyncCallback(IAsyncResult result)
{
    HttpWebRequest request = result.AsyncState as HttpWebRequest;   
    using (HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(result))
    using (Stream stream = response.GetResponseStream())
    {
        var buffer = new byte[2048];
        while(stream.CanRead)
        {
            var readCount = stream.Read(buffer, 0, buffer.Length);
            if (readCount == 0) break; // Read returns 0 at end of stream; CanRead stays true
            var line = Encoding.UTF8.GetString(buffer.Take(readCount).ToArray());
            Console.WriteLine(line);
        }
    }
}
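
The raw-read loop above decodes each chunk on its own, so a multi-byte UTF-8 character or a line that straddles two reads would come out broken. The following is only a minimal sketch of one way to handle that, not part of the original answer: LineAssembler and LineAssemblerDemo are hypothetical names, and it assumes lines end in "\n" or "\r\n".

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper: turns raw chunks from Stream.Read into complete lines.
public sealed class LineAssembler
{
    // A Decoder keeps the bytes of an incomplete character until the next call.
    private readonly Decoder _decoder = Encoding.UTF8.GetDecoder();
    private readonly StringBuilder _pending = new StringBuilder();

    // Feed one chunk; returns every line completed by this chunk.
    public List<string> Append(byte[] buffer, int count)
    {
        char[] chars = new char[_decoder.GetCharCount(buffer, 0, count)];
        int charCount = _decoder.GetChars(buffer, 0, count, chars, 0);
        _pending.Append(chars, 0, charCount);

        var lines = new List<string>();
        string text = _pending.ToString();
        int start = 0;
        int newline;
        while ((newline = text.IndexOf('\n', start)) >= 0)
        {
            lines.Add(text.Substring(start, newline - start).TrimEnd('\r'));
            start = newline + 1;
        }

        // Keep the unfinished tail for the next chunk.
        _pending.Clear();
        _pending.Append(text, start, text.Length - start);
        return lines;
    }
}

public static class LineAssemblerDemo
{
    // Reads the stream to its end and returns all complete lines.
    public static List<string> ReadAllLines(Stream stream, int bufferSize)
    {
        var assembler = new LineAssembler();
        var result = new List<string>();
        var buffer = new byte[bufferSize];
        int readCount;
        while ((readCount = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            result.AddRange(assembler.Append(buffer, readCount));
        }
        return result;
    }
}
```

In the callback, the same idea would mean calling assembler.Append(buffer, readCount) after each stream.Read and writing out whatever lines it returns. Empty keep-alive lines come back as empty strings and can be skipped by the caller.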
EDIT: Here is the entire harness I used to test this theory (maybe a difference from your situation will jump out at you)

(LINQPad ready)

void Main()
{
    Task.Factory.StartNew(() => Listener());
    _blocker.WaitOne();
    Request();
}

public bool _running;
public ManualResetEvent _blocker = new ManualResetEvent(false);

public void Listener()
{
    var listener = new HttpListener();
    listener.Prefixes.Add("http://localhost:8080/");
    listener.Start();
    "Listener is listening...".Dump();
    _running = true;
    _blocker.Set();
    var ctx = listener.GetContext();
    "Listener got context".Dump();
    ctx.Response.KeepAlive = true;
    ctx.Response.ContentType = "application/json";
    var outputStream = ctx.Response.OutputStream;
    using(var zipStream = new GZipStream(outputStream, CompressionMode.Compress))
    // Note: the writer wraps outputStream, not zipStream, so the lines below are sent uncompressed.
    using(var writer = new StreamWriter(outputStream))
    {
        var lineCount = 0;
        while(_running && lineCount++ < 10)
        {
            writer.WriteLine("{ \"foo\": \"bar\"}");
            "Listener wrote line, taking a nap...".Dump();
            writer.Flush();
            Thread.Sleep(1000);
        }
    }
    listener.Stop();
}

public void Request()
{
    var endPoint = "http://localhost:8080";
    var username = "";
    var password = "";
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(endPoint);
    request.Method = "GET";

    request.PreAuthenticate = true;
    request.Credentials = new NetworkCredential(username, password);

    request.AutomaticDecompression = DecompressionMethods.GZip;
    request.ContentType = "application/json";
    request.Accept = "application/json";
    request.Timeout = 30;
    request.BeginGetResponse(AsyncCallback, request);
}

public void AsyncCallback(IAsyncResult result)
{
    Console.WriteLine("In AsyncCallback");    
    HttpWebRequest request = result.AsyncState as HttpWebRequest;    
    using (HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(result))
    using (Stream stream = response.GetResponseStream())
    {
        while(stream.CanRead)
        {
            var buffer = new byte[2048];
            var readCount = stream.Read(buffer, 0, buffer.Length);
            if (readCount == 0) break; // Read returns 0 at end of stream; CanRead stays true
            var line = Encoding.UTF8.GetString(buffer.Take(readCount).ToArray());
            Console.WriteLine("Reader got:" + line);
        }
    }
}

Output:

Listener is listening...
Listener got context
Listener wrote line, taking a nap...
In AsyncCallback
Reader got:{ "foo": "bar"}

Listener wrote line, taking a nap...
Reader got:{ "foo": "bar"}

Listener wrote line, taking a nap...
Reader got:{ "foo": "bar"}

Listener wrote line, taking a nap...
Reader got:{ "foo": "bar"}

Listener wrote line, taking a nap...
Reader got:{ "foo": "bar"}

Listener wrote line, taking a nap...
Reader got:{ "foo": "bar"}

Answer 2 (score: 0)

This may have to do with Delayed ACK in combination with Nagle's algorithm. It occurs when the server sends multiple small responses in a row.

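On the server side, the first response is sent, but subsequent chunks of response data are only sent once the server has received an ACK from the client, or once there is enough data to fill a large packet (Nagle's algorithm).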

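On the client side, the first bit of the response is received, but the ACK is not sent immediately. Because traditional applications behave in a request-response-request-response pattern, the client assumes it can send the ACK along with the next request, which in your case never happens.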

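After a fixed amount of time (500 ms?) it decides to send the ACK anyway, which causes the server to send the next packets it has accumulated so far.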

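The problem (if this is indeed the problem you are experiencing) can be fixed on the server side by setting the NoDelay property, which disables Nagle's algorithm. I think you can also disable it at the operating system level.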
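
As a rough illustration of the NoDelay suggestion, here is a minimal sketch that assumes the server is a .NET application that owns its TCP sockets (the question does not say what the server actually is). NoDelayDemo is a hypothetical name, and the loopback listener exists only to give the example a connection to work with.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

public static class NoDelayDemo
{
    // Accepts one loopback connection and returns the server-side NoDelay
    // value before and after it is changed.
    public static bool[] Run()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0); // port 0: let the OS pick a free port
        listener.Start();
        try
        {
            int port = ((IPEndPoint)listener.LocalEndpoint).Port;
            using (var client = new TcpClient())
            {
                client.Connect(IPAddress.Loopback, port);
                using (TcpClient serverSide = listener.AcceptTcpClient())
                {
                    bool before = serverSide.NoDelay; // false by default: Nagle's algorithm is enabled
                    serverSide.NoDelay = true;        // small writes are now sent without waiting
                    return new[] { before, serverSide.NoDelay };
                }
            }
        }
        finally
        {
            listener.Stop();
        }
    }
}
```

The relevant line is serverSide.NoDelay = true, set on the accepted connection before the server starts writing small chunks. Whether this removes the delay depends on whether Nagle's algorithm is really the cause.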

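You could also temporarily disable Delayed ACK on the client side (I know Windows has a registry entry for it) to see whether this is indeed the problem, without having to change anything on the server. Delayed ACK protects against DDoS attacks, so make sure you restore the setting afterwards.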

Sending keep-alives less frequently may also help, but there is still a chance that the problem will occur.