How to break large data into JSON objects

Asked: 2019-02-22 01:30:34

Tags: c# asp.net .net json asp.net-web-api

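I want to send a large file to a Web API in chunks. The file contains data as JSON objects.

Condition: a single JSON object will never exceed 1 MB in size. My API reads 1 MB of JSON content from the file at a time and then splits that 1 MB into JSON objects. If an incomplete JSON object is left over at the end of the 1 MB, it needs to be stored; when the next 1 MB chunk arrives, the stored fragment is merged with the start of that chunk to form a complete JSON object, which is then processed.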

File size: 1 GB
API will receive: 1 MB at a time
API needs to parse all the JSON objects in the 1 MB (as many as it can)
An incomplete JSON object needs to be stored so that it can be merged with the next 1 MB.
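
One way to meet these requirements is to scan each chunk character by character, tracking brace depth and whether the scanner is inside a string literal, and to keep the unfinished object in a buffer between chunks. The following is a minimal sketch under those assumptions, not a definitive implementation: `JsonObjectSplitter` and `Feed` are hypothetical names introduced here, and the sketch assumes the payload is a sequence of top-level JSON objects separated by commas or whitespace.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not from the original post): accepts text chunks and
// returns every top-level JSON object that becomes complete in that chunk.
public sealed class JsonObjectSplitter
{
    private readonly StringBuilder _pending = new StringBuilder(); // object in progress
    private int _depth;      // current nesting depth of '{' ... '}'
    private bool _inString;  // true while inside a "..." literal
    private bool _escaped;   // true right after a backslash inside a string

    // True when a partial object is stashed and waiting for the next chunk.
    public bool HasIncompleteObject => _depth > 0;

    public List<string> Feed(char[] buffer, int count)
    {
        var complete = new List<string>();
        for (int i = 0; i < count; i++)
        {
            char c = buffer[i];
            if (_depth == 0)
            {
                // Between objects: skip commas, whitespace and brackets until '{'.
                if (c != '{') continue;
                _depth = 1;
                _pending.Append(c);
                continue;
            }

            _pending.Append(c);
            if (_inString)
            {
                if (_escaped) _escaped = false;          // char after '\' is literal
                else if (c == '\\') _escaped = true;
                else if (c == '"') _inString = false;
            }
            else if (c == '"') _inString = true;
            else if (c == '{') _depth++;
            else if (c == '}' && --_depth == 0)
            {
                // Closing brace of a top-level object: emit it and reset the stash.
                complete.Add(_pending.ToString());
                _pending.Clear();
            }
        }
        return complete;
    }
}

public static class Program
{
    public static void Main()
    {
        var splitter = new JsonObjectSplitter();
        // Two chunks that cut the first object in the middle of a string value.
        string[] chunks =
        {
            "{\"id\": 5, \"nm\": \"Ed",
            "wy\"},\n{\"id\": 6, \"nm\": \"a}b\"}"
        };
        foreach (string chunk in chunks)
        {
            foreach (string json in splitter.Feed(chunk.ToCharArray(), chunk.Length))
                Console.WriteLine(json);
        }
    }
}
```

In an upload loop, each block returned by `sr.ReadBlock(buffer, 0, bufferSize)` could be passed as `Feed(buffer, dataRead)`, and every returned string handed to a JSON deserializer. Because the state lives in the splitter, the incomplete tail of one block is merged with the head of the next without any index arithmetic.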

Here is what I have so far.

public async Task<bool> Upload()
{
    const int bufferSize = 1024*1024;
    var filesReadToProvider = await Request.Content.ReadAsMultipartAsync();
    foreach (var content in filesReadToProvider.Contents)
    {
        var stream = await content.ReadAsStreamAsync();
        using (StreamReader sr = new StreamReader(stream))
        {
            char[] buffer = new char[bufferSize];
            int dataRead = sr.ReadBlock(buffer, 0, bufferSize);
            while (dataRead > 0)
            {
                // Encode only the characters actually read in this block.
                var bteArr = Encoding.UTF8.GetBytes(buffer, 0, dataRead);
                using (MemoryStream memoryStream = new MemoryStream(bteArr))
                {
                    // Process 1 COMPLETE JSON OBJECT out of many JSON's present in 1 MB
                }
                dataRead = sr.ReadBlock(buffer, 0, bufferSize);
            }
        }
    }
    return true;
}

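Please guide me on how to achieve this.

Attempt made:

I did make a novice attempt, but it is not foolproof. It still breaks in some edge cases, and the code is too noisy. :-|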

public async Task<bool> Upload()
{
    const int bufferSize =  1024*1024;
    var filesReadToProvider = await Request.Content.ReadAsMultipartAsync();
    foreach (var content in filesReadToProvider.Contents)
    {
        var stream = await content.ReadAsStreamAsync();
        using (StreamReader sr = new StreamReader(stream))
        {
            int dataRead;
            char[] buffer = new char[bufferSize];
            char[] bufferToSend = new char[bufferSize];
            char[] stash = new char[bufferSize];
            // buffer getting all the content
            dataRead = sr.ReadBlock(buffer, 0, bufferSize);
            // finding index of where closing bracket in original buffer
            var index = Array.IndexOf(buffer, '}');
            // create the stash
            Array.Copy(buffer, index + 2, stash, 0, dataRead - index - 2);
            // create the actual buffer to send.
            Array.Copy(buffer, 0, bufferToSend, 0, index + 1);
            // convert to byte to send.
            var bteArr = Encoding.GetEncoding("UTF-8").GetBytes(bufferToSend);
            while ((index) > 0)
            {
                using (MemoryStream memoryStream = new MemoryStream(bteArr))
                {
                // PROCESS JSON ObJects "bufferToSend"
                }
                // to end the loop
                if (index >= dataRead - 1)
                {
                    index = -1;
                    break;
                }
                // keep track of old index so that new stash can be created
                var oldindex = index;
                // increase the index to new place till where we need to create new buffertosend
                index = index + Array.IndexOf(stash, '}') + 2;
                // this is needed because if current payload is small then copy will keep the old payload intact
                bufferToSend = new char[bufferSize];
                // now copy the new stash content to buffer to send
                Array.Copy(stash, 0, bufferToSend, 0, index - oldindex - 1);
                // convert to bytearray
                bteArr = Encoding.GetEncoding("UTF-8").GetBytes(bufferToSend);
                // update the stash
                stash = new char[bufferSize];
                Array.Copy(buffer, index + 2, stash, 0, dataRead - index - 2);
            }
            dataRead = sr.ReadBlock(buffer, 0, bufferSize);
        }
    }

    return true;
}
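
The attempt above treats the first `'}'` found by `Array.IndexOf` as the end of an object. The small sketch below illustrates three cases where that assumption likely fails: a nested object, a brace inside a string value, and a chunk that ends before any closing brace. The records and the `CutAtFirstBrace` helper are hypothetical and are not taken from the original input.

```csharp
using System;

public static class IndexOfPitfall
{
    // Mirrors the attempt's cut: everything up to and including the first '}'.
    // Returns null when the text contains no closing brace at all.
    public static string CutAtFirstBrace(string text)
    {
        char[] buffer = text.ToCharArray();
        int index = Array.IndexOf(buffer, '}');
        return index < 0 ? null : new string(buffer, 0, index + 1);
    }

    public static void Main()
    {
        // Nested object: the cut stops at the inner brace.
        Console.WriteLine(CutAtFirstBrace(
            "{\"id\": 1, \"addr\": {\"city\": \"York\"}, \"nm\": \"Edwy\"}"));
        // prints {"id": 1, "addr": {"city": "York"}

        // Brace inside a string value: the cut stops in the middle of the string.
        Console.WriteLine(CutAtFirstBrace("{\"id\": 2, \"nm\": \"a}b\"}"));
        // prints {"id": 2, "nm": "a}

        // Chunk ends mid-object: IndexOf returns -1, so there is nothing to cut.
        Console.WriteLine(CutAtFirstBrace("{\"id\": 3, \"nm\": \"Ed") == null);
        // prints True
    }
}
```

In the third case the attempt would go on to compute copy offsets from an index of -1, which is one plausible source of the breakage described. Tracking brace depth and string state, rather than searching for a single character, avoids all three cases.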

Input:

  {
    "id": 5,
    "nm": "Edwy",
    "cty": "United Kingdom",
    "hse": "House of Wessex",
    "yrs": "955-959"
  },
  {
    "id": 6,
    "nm": "Edgar",
    "cty": "United Kingdom",
    "hse": "House of Wessex",
    "yrs": "959-975"
  }

Expected output:

  {
    "id": 5,
    "nm": "Edwy",
    "cty": "United Kingdom",
    "hse": "House of Wessex",
    "yrs": "955-959"
  }

Then, in the next loop iteration:

  {
    "id": 6,
    "nm": "Edgar",
    "cty": "United Kingdom",
    "hse": "House of Wessex",
    "yrs": "959-975"
  }

0 answers:

No answers yet.