Convert JSON to newline-delimited JSON

Time: 2018-02-01 11:14:40

Tags: google-bigquery

I need to convert my JSON to newline-delimited JSON so that I can insert data into BigQuery from C# (a .NET application). Please suggest a solution.

Input

[  
   {  
      "DashboardCategoryId":1,
      "BookingWindows":[  
         {  
            "DaysRange":"31-60 Days",
            "BookingNumber":2
         },
         {  
            "DaysRange":"Greater Than 1 year",
            "BookingNumber":1
         }
      ]
   },
   {
      "DashboardCategoryId":1,
      "BookingWindows":[
         {  
            "DaysRange":"61-120 Days",
            "BookingNumber":1
         },
         {  
            "DaysRange":"8-14",
            "BookingNumber":1
         }
      ]
   }
]

Required output

{"DashboardCategoryId": 1,"BookingWindows": [{"DaysRange": "31-60 Days","BookingNumber":2},{"DaysRange": "Greater Than 1 year","BookingNumber": 1}]}
{"DashboardCategoryId": 1,"BookingWindows": [{"DaysRange": "61-120 Days","BookingNumber":1},{"DaysRange": "8-14","BookingNumber": 1}]}

2 answers:

Answer 0: (score: 1)

If you already have the JSON array loaded into memory, e.g. as a List<JToken>, you can write it out as newline-delimited JSON using the answers from Serialize as NDJSON using Json.NET.
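For illustration, here is a minimal sketch of the in-memory approach. It is not the code from the linked answer: the names NdjsonInMemory and ToNdjson are made up for this example, and it assumes the whole array fits comfortably in memory.

```csharp
using System;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class NdjsonInMemory
{
    // Hypothetical helper: converts a JSON array (as a string) to newline-delimited JSON.
    public static string ToNdjson(string jsonArray)
    {
        // DateParseHandling.None keeps date-like strings exactly as they were written.
        using (var reader = new JsonTextReader(new StringReader(jsonArray)) { DateParseHandling = DateParseHandling.None })
        {
            var array = JArray.Load(reader);
            var output = new StringWriter();
            foreach (JToken item in array)
            {
                // Formatting.None serializes each element on a single line.
                output.Write(item.ToString(Formatting.None));
                output.Write("\n");
            }
            return output.ToString();
        }
    }

    public static void Main()
    {
        var json = "[ { \"DashboardCategoryId\": 1 }, { \"DashboardCategoryId\": 2 } ]";
        Console.Write(ToNdjson(json));
    }
}
```

Each array element ends up on its own line, terminated by "\n", which is the shape BigQuery expects for newline-delimited JSON input.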

However, since BigQuery newline-delimited JSON files tend to be large, I suggest a fully streaming solution instead:

using System.IO;
using System.Text;
using Newtonsoft.Json;

public static class JsonExtensions
{
    public static void ToNewlineDelimitedJson(Stream readStream, Stream writeStream)
    {
        var encoding = new UTF8Encoding(false, true);

        // Let caller dispose the underlying streams.
        using (var textReader = new StreamReader(readStream, encoding, true, 1024, true))
        using (var textWriter = new StreamWriter(writeStream, encoding, 1024, true))
        {
            ToNewlineDelimitedJson(textReader, textWriter);
        }
    }

    public static void ToNewlineDelimitedJson(TextReader textReader, TextWriter textWriter)
    {
        using (var jsonReader = new JsonTextReader(textReader) { CloseInput = false, DateParseHandling = DateParseHandling.None })
        {
            ToNewlineDelimitedJson(jsonReader, textWriter);
        }
    }

    enum State { BeforeArray, InArray, AfterArray };

    public static void ToNewlineDelimitedJson(JsonReader jsonReader, TextWriter textWriter)
    {
        var state = State.BeforeArray;
        do
        {
            if (jsonReader.TokenType == JsonToken.Comment || jsonReader.TokenType == JsonToken.None || jsonReader.TokenType == JsonToken.Undefined || jsonReader.TokenType == JsonToken.PropertyName)
            {
                // Do nothing
            }
            else if (state == State.BeforeArray && jsonReader.TokenType == JsonToken.StartArray)
            {
                state = State.InArray;
            }
            else if (state == State.InArray && jsonReader.TokenType == JsonToken.EndArray)
            {
                state = State.AfterArray;
            }
            else
            {
                // Formatting.None is the default; I set it here for clarity.
                using (var jsonWriter = new JsonTextWriter(textWriter) { Formatting = Formatting.None, CloseOutput = false })
                {
                    jsonWriter.WriteToken(jsonReader);
                }
                // http://specs.okfnlabs.org/ndjson/
                // Each JSON text MUST conform to the [RFC7159] standard and MUST be written to the stream followed by the newline character \n (0x0A).
                // The newline character MAY be preceded by a carriage return \r (0x0D). The JSON texts MUST NOT contain newlines or carriage returns.
                textWriter.Write("\n");

                // Root value wasn't an array after all, so end writing with one item.
                if (state == State.BeforeArray)
                    state = State.AfterArray;
            }
        }
        while (jsonReader.Read() && state != State.AfterArray);
    }
}

Then use it as follows:

using (var readStream = File.OpenRead(fromFileName))
using (var writeStream = File.Open(toFileName, FileMode.Create))
{
    JsonExtensions.ToNewlineDelimitedJson(readStream, writeStream);
}

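This takes advantage of the method JsonWriter.WriteToken(JsonReader) to write and format directly from a JsonReader to a JsonWriter, without loading the entire JSON token hierarchy into memory.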
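To see what WriteToken does in isolation, here is a small sketch that copies only the first element of an array from a reader to a writer. The names WriteTokenDemo and CopyFirstElement are hypothetical, and the helper assumes the input is a non-empty JSON array.

```csharp
using System;
using System.IO;
using Newtonsoft.Json;

public static class WriteTokenDemo
{
    // Hypothetical helper: copies the first element of a non-empty JSON array to a compact string.
    public static string CopyFirstElement(string jsonArray)
    {
        var output = new StringWriter();
        using (var reader = new JsonTextReader(new StringReader(jsonArray)))
        using (var writer = new JsonTextWriter(output) { Formatting = Formatting.None, CloseOutput = false })
        {
            reader.Read(); // Positions the reader on StartArray.
            reader.Read(); // Positions the reader on the first token of the first element.
            // Copies the current token and all of its children to the writer,
            // one token at a time, without building a JToken tree.
            writer.WriteToken(reader);
        }
        return output.ToString();
    }

    public static void Main()
    {
        var json = "[ { \"DaysRange\": \"8-14\", \"BookingNumber\": 1 }, { \"DaysRange\": \"31-60 Days\" } ]";
        Console.WriteLine(CopyFirstElement(json));
    }
}
```

After the call the reader sits on the last token of the copied element, which is why the loop in the answer can simply call Read() again to reach the next element.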

Working sample .NET fiddle.

Answer 1: (score: 0)

Newtonsoft Json.NET can be used to format JSON. I found an example here

private static string FormatJson(string json)
{
    dynamic parsedJson = JsonConvert.DeserializeObject(json);
    return JsonConvert.SerializeObject(parsedJson, Formatting.Indented);
}