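Converting nested/complex JSON to CSV does not produce the expected output

Asked: 2019-12-29 12:23:09

Tags: c# json csv

The input JSON is below (it is a small part of the real data; the real JSON is very long and more deeply nested, with more than 30k lines):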

  {
  "data": {
    "getUsers": [
      {
        "userProfileDetail": {
          "userStatus": {
            "name": "Expired"
          },
          "userStatusDate": "2017-04-04T07:48:25+00:00",
          "lastAttestationDate": "2019-02-01T03:50:42.6049634-05:00"
        },
        "userInformation": {
          "Id": 13610875,
          "lastName": "************",
          "suffix": null,
          "gender": "FEMALE",
          "birthDate": "1970-01-01T00:01:00+00:00",
          "ssn": "000000000",
          "ethnicity": "INVALID_REFERENCE_VALUE",
          "languagesSpoken": null,
          "personalEmail": null,
          "otherNames": null,
          "userType": {
            "name": "APN"
          },
          "primaryuserState": "CO",
          "otheruserState": [
            "CO"
          ],
          "practiceSetting": "INPATIENT_ONLY",
          "primaryEmail": "*****@*****.com"
        }
      },
      {
        "userProfileDetail": {
          "userStatus": {
            "name": "Expired newwwwwwwwwwww"
          },
          "userStatusDate": "2017-04-04T07:48:25+00:00",
          "lastAttestationDate": "2019-02-01T03:50:42.6049634-05:00"
        },
        "userInformation": {
          "Id": 13610875,
          "lastName": "************",
          "suffix": null,
          "gender": "FEMALE",
          "birthDate": "1970-01-01T00:01:00+00:00",
          "ssn": "000000000",
          "ethnicity": "INVALID_REFERENCE_VALUE",
          "languagesSpoken": null,
          "personalEmail": null,
          "otherNames": null,
          "userType": {
            "name": "APN"
          },
          "primaryuserState": "CO",
          "otheruserState": [
            "CO"
          ],
          "practiceSetting": "INPATIENT_ONLY",
          "primaryEmail": "*****@*****.com"
        }
      }
    ]
  }
}

The code is:

var obj = JObject.Parse(json);
            // Collect column titles: all property names whose values are of type JValue, distinct, in order of encountering them.
            var jsonValues = obj.DescendantsAndSelf().OfType<JProperty>().Where(p => p.Value is JValue).GroupBy(p => p.Name).ToList();
            var jsonKey = jsonValues.Select(g => g.Key).ToArray();

            // Filter JObjects that have child objects that have values.
            var parentsWithChildren = jsonValues.SelectMany(g => g).SelectMany(v => v.AncestorsAndSelf().OfType<JObject>().Skip(1)).ToHashSet();

            // Collect all data rows: for every object, go through the column titles and get the value of that property in the closest ancestor or self that has a value of that name.
            var rows = obj
                .DescendantsAndSelf()
                .OfType<JObject>()
                .Where(o => o.PropertyValues().OfType<JValue>().Any() && (o == obj || !parentsWithChildren.Contains(o))) // Show a row for the root object + objects that have no children.
                .Select(o => jsonKey.Select(c => o.AncestorsAndSelf().OfType<JObject>().Select(parent => parent[c])
                    .Where(v => v is JValue).Select(v => (string)v).FirstOrDefault()).Reverse() // Trim trailing nulls
                    .SkipWhile(s => s == null).Reverse());

            // Convert to CSV
            var csvRows = new[] { jsonKey }.Concat(rows).Select(r => string.Join(",", r));
            var csv = string.Join("\n", csvRows);
            Console.WriteLine(csv);

This is the output I get:

  

getUsers_userProfileDetail_userStatus_name,getUsers_userProfileDetail_userStatusDate,getUsers_userProfileDetail_lastAttestationDate,getUsers_userInformation_Id,getUsers_userInformation_lastName,getUsers_userInformation_suffix,getUsers_userInformation_gender,getUsers_userInformation_birthDate,getUsers_userInformation_ssn,getUsers_userInformation_ethnicity,getUsers_userInformation_languagesSpoken,getUsers_userInformation_personalEmail,getUsers_userInformation_otherNames,getUsers_userInformation_userType_name,getUsers_userInformation_primaryuserState,getUsers_userInformation_otheruserState,getUsers_userInformation_practiceSetting,getUsers_userInformation_primaryEmail
Expired,04/04/2017 13:18:25,02/01/2019 14:20:42
APN,,,13610875,************,FEMALE,01/01/1970 05:31:00,000000000,INVALID_REFERENCE_VALUE,,,CO,INPATIENT_ONLY,*****@*****.com

Here the userType > name column is not in the correct position, and the otheruserState array does not appear in the output at all.

Can anyone help me?

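2 Answers:

Answer 0: (score: 1)

Below is the approach I suggest: it does not skip null values, and it does not throw when a null is present. It builds one CSV-formatted string for each user in the JSON and writes string.Empty for any null value.

Lists of strings are joined with | as the separator, because the row itself is comma-separated. You should update all the classes to use an uppercase first letter in property names; I just pasted what I got from the json2csharp website.

Getting the classes for the JSON

I used the json2csharp website to convert your JSON to classes. Once I had the classes, I overrode ToString on GetUser to convert the user data to a string, and then used that to print it.

Classes for the JSON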


    public class UserStatus
    {
        public string name { get; set; }
    }

    public class UserProfileDetail
    {
        public UserStatus userStatus { get; set; }
        public DateTime userStatusDate { get; set; }
        public DateTime lastAttestationDate { get; set; }
    }

    public class UserType
    {
        public string name { get; set; }
    }

    public class UserInformation
    {
        public int Id { get; set; }
        public string lastName { get; set; }
        public string suffix { get; set; }
        public string gender { get; set; }
        public DateTime birthDate { get; set; }
        public string ssn { get; set; }
        public string ethnicity { get; set; }
        public List<string> languagesSpoken { get; set; }
        public string personalEmail { get; set; }
        public List<string> otherNames { get; set; }
        public UserType userType { get; set; }
        public string primaryuserState { get; set; }
        public List<string> otheruserState { get; set; }
        public string practiceSetting { get; set; }
        public string primaryEmail { get; set; }
    }

    public class GetUser
    {
        public override string ToString()
        {
            List<string> userData = new List<string>
            {
                userProfileDetail.userStatus.name,
                userProfileDetail.userStatusDate.ToString(),
                userProfileDetail.lastAttestationDate.ToString(),
                userInformation.Id.ToString(),
                userInformation.lastName,
                userInformation.suffix?? string.Empty ,
                userInformation.gender?? string.Empty ,
                userInformation.birthDate.ToString(),
                userInformation.ssn?? string.Empty ,
                userInformation.ethnicity?? string.Empty ,
                string.Join("|", userInformation.languagesSpoken?? new List<string>()),
                userInformation.personalEmail?? string.Empty ,
                string.Join("|", userInformation.otherNames?? new List<string>() ),
                userInformation.userType?.name ?? string.Empty, // userType itself may be null
                userInformation.primaryuserState?? string.Empty ,
                string.Join("|", userInformation.otheruserState ?? new List<string>()), // string.Join throws on a null list
                userInformation.practiceSetting?? string.Empty ,
                userInformation.primaryEmail
            };

            return string.Join(",", userData);
        }
        public UserProfileDetail userProfileDetail { get; set; }
        public UserInformation userInformation { get; set; }
    }

    public class Data
    {
        public List<GetUser> getUsers { get; set; }
    }

    public class RootObject
    {
            public string GetHeader()
            {
                return "getUsers_userProfileDetail_userStatus_name,getUsers_userProfileDetail_userStatusDate,getUsers_userProfileDetail_lastAttestationDate,getUsers_userInformation_Id,getUsers_userInformation_lastName,getUsers_userInformation_suffix,getUsers_userInformation_gender,getUsers_userInformation_birthDate,getUsers_userInformation_ssn,getUsers_userInformation_ethnicity,getUsers_userInformation_languagesSpoken,getUsers_userInformation_personalEmail,getUsers_userInformation_otherNames,getUsers_userInformation_userType_name,getUsers_userInformation_primaryuserState,getUsers_userInformation_otheruserState,getUsers_userInformation_practiceSetting,getUsers_userInformation_primaryEmail";
            }
        public Data data { get; set; }
    }

How to use the classes above

    string json = File.ReadAllText("locationOfJson"); // ReadAllText returns one string; ReadAllLines returns string[]
    var rootObject = JsonConvert.DeserializeObject<RootObject>(json);
    Console.WriteLine(rootObject.GetHeader()); // Prints Header
    foreach (var user in rootObject.data.getUsers)
    {
        Console.WriteLine(user.ToString()); // Print Each User.
    }

Output

getUsers_userProfileDetail_userStatus_name,getUsers_userProfileDetail_userStatusDate,getUsers_userProfileDetail_lastAttestationDate,getUsers_userInformation_Id,getUsers_userInformation_lastName,getUsers_userInformation_suffix,getUsers_userInformation_gender,getUsers_userInformation_birthDate,getUsers_userInformation_ssn,getUsers_userInformation_ethnicity,getUsers_userInformation_languagesSpoken,getUsers_userInformation_personalEmail,getUsers_userInformation_otherNames,getUsers_userInformation_userType_name,getUsers_userInformation_primaryuserState,getUsers_userInformation_otheruserState,getUsers_userInformation_practiceSetting,getUsers_userInformation_primaryEmail
Expired,4/4/2017 3:48:25 AM,2/1/2019 3:50:42 AM,13610875,************,,FEMALE,12/31/1969 7:01:00 PM,000000000,INVALID_REFERENCE_VALUE,,,,APN,CO,CO,INPATIENT_ONLY,*****@*****.com

I suggest copy-pasting the data into Excel to see how well it fits. I tested it, and all the data appears to sit correctly under its headers.
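
One caveat with string.Join(",", userData): a field that itself contains a comma, a double quote, or a line break would shift every later column. Below is a minimal sketch of the usual quoting rule (wrap such a field in double quotes and double any embedded quotes). The names CsvField, Escape, and JoinRow are hypothetical helpers introduced only for illustration; they are not part of the classes above or of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, shown only to illustrate CSV field quoting.
public static class CsvField
{
    private static readonly char[] Special = { ',', '"', '\r', '\n' };

    // Returns the field unchanged unless it contains a comma, a quote,
    // or a line break; in that case it is wrapped in double quotes and
    // embedded quotes are doubled. Null becomes an empty field.
    public static string Escape(string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return string.Empty;
        }

        if (value.IndexOfAny(Special) < 0)
        {
            return value;
        }

        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    // Escapes every field and joins them into one CSV row.
    public static string JoinRow(IEnumerable<string> fields)
    {
        return string.Join(",", fields.Select(Escape));
    }
}
```

Under that assumption, ToString could end with return CsvField.JoinRow(userData); instead of the plain string.Join. For example, CsvField.JoinRow(new[] { "Smith, Jr.", null, "CO|NY" }) yields "Smith, Jr.",,CO|NY, with only the first field quoted.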

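Answer 1: (score: 0)

The solution for the case you provided is below. It uses JsonTextReader instead of LINQ to JSON to get full control over the output format. For example, you did not specify the behavior for string arrays (otheruserState), so in my solution I separate the string values with a dash. I use an empty string to represent null values.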

string propertyName = "";
var isArray = false;
var arrayHeaderprinted = false;

var headers = new List<string>();
var data = new List<string>();
var arrayData = new List<string>();

using (var reader = new JsonTextReader(new StringReader(json)))
{
    while (reader.Read())
    {
        switch (reader.TokenType)
        {
            case JsonToken.PropertyName:
                propertyName = (string)reader.Value;
                break;
            case JsonToken.StartArray:
                isArray = true;
                break;
            case JsonToken.EndArray:
            case JsonToken.StartObject:
                isArray = false;
                if (arrayHeaderprinted)
                {
                    arrayHeaderprinted = false;
                    data.Add(string.Join("-", arrayData));
                    arrayData.Clear(); // Reset so the next array does not inherit these values.
                }
                break;
            case JsonToken.Null:
            case JsonToken.String:
            case JsonToken.Boolean:
            case JsonToken.Date:
            case JsonToken.Float:
            case JsonToken.Integer:
                if (isArray)
                {
                    if (!arrayHeaderprinted)
                    {
                        arrayHeaderprinted = true;
                        headers.Add(propertyName);
                    }
                    arrayData.Add(reader.Value?.ToString() ?? ""); // reader.Value is null for JsonToken.Null
                }
                else
                {
                    headers.Add(propertyName);
                    data.Add(reader.Value?.ToString() ?? "");
                }
                break;
        }
    }
}

Console.WriteLine(string.Join(",", headers));
Console.WriteLine(string.Join(",", data));

The output it produces:

name,userStatusDate,lastAttestationDate,Id,lastName,suffix,gender,birthDate,ssn,ethnicity,languagesSpoken,personalEmail,otherNames,name,primaryuserState,otheruserState,practiceSetting,primaryEmail
Expired,04.04.2017 09:48:25,01.02.2019 09:50:42,13610875,************,,FEMALE,01.01.1970 01:01:00,000000000,INVALID_REFERENCE_VALUE,,,,APN,CO,CO-PP,INPATIENT_ONLY,*****@*****.com
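
Note that the reader loop above collects everything into a single header list and a single data list, so a document with several entries in getUsers ends up as one long row with repeated headers. Below is a rough sketch of one way to get one row per user instead. It swaps in LINQ to JSON rather than JsonTextReader, and it rests on several assumptions: the users live under data.getUsers, arrays hold only scalar values, and no value contains a comma. UserCsvSketch, ToCsv, and Walk are hypothetical names; the | separator for array items is likewise just a choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical sketch: one CSV row per element of data.getUsers.
public static class UserCsvSketch
{
    public static string ToCsv(string json)
    {
        // DateParseHandling.None keeps date strings exactly as written
        // instead of converting them to DateTime values.
        var settings = new JsonSerializerSettings { DateParseHandling = DateParseHandling.None };
        var root = JsonConvert.DeserializeObject<JObject>(json, settings);
        var users = (JArray)root["data"]["getUsers"];

        // Flatten each user into an ordered list of (column path, text) cells.
        var rows = new List<List<KeyValuePair<string, string>>>();
        foreach (var user in users)
        {
            var cells = new List<KeyValuePair<string, string>>();
            Walk(user, "getUsers", cells);
            rows.Add(cells);
        }

        // Columns are the union of all paths, in first-seen order.
        var columns = rows.SelectMany(r => r.Select(c => c.Key)).Distinct().ToList();

        var lines = new List<string> { string.Join(",", columns) };
        foreach (var row in rows)
        {
            var lookup = row.ToDictionary(c => c.Key, c => c.Value);
            lines.Add(string.Join(",", columns.Select(
                c => lookup.TryGetValue(c, out var text) ? text : "")));
        }
        return string.Join("\n", lines);
    }

    // Recursively walks a token; nested property names are joined with "_".
    private static void Walk(JToken token, string path, List<KeyValuePair<string, string>> cells)
    {
        switch (token)
        {
            case JObject obj:
                foreach (var prop in obj.Properties())
                {
                    Walk(prop.Value, path + "_" + prop.Name, cells);
                }
                break;
            case JArray array:
                // Assumption: array items are scalars; they share one cell.
                cells.Add(new KeyValuePair<string, string>(path, string.Join("|", array.Select(Text))));
                break;
            default:
                cells.Add(new KeyValuePair<string, string>(path, Text(token)));
                break;
        }
    }

    // JSON null becomes an empty cell; other scalars use their text form.
    private static string Text(JToken token)
    {
        return token.Type == JTokenType.Null ? "" : token.ToString();
    }
}
```

Because each column is keyed by its full path, getUsers_userProfileDetail_userStatus_name and getUsers_userInformation_userType_name stay distinct, and a property that is missing for one user simply produces an empty cell in that user's row. Calling Console.WriteLine(UserCsvSketch.ToCsv(json)) prints the header followed by one line per user.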