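How do I determine which value occurs most often in my collection?

Date: 2015-08-14 22:34:45

Tags: c# json

So, I have a JSON file containing a list of fruits. The "fruits" key can map to a single fruit or to a collection of fruits.

For example: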

[
    {
        "fruits": [
            "banana"
        ]
    },
    {
        "fruits": [
            "apple"
        ]
    },
    {
        "fruits": [
            "orange",
            "apple"
        ]
    }
]

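I'd like to know: how can I determine which fruit occurs most often in my JSON structure? That is, how do I find out how frequently each value occurs, and which value occurs more often than the others?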

4 answers:

Answer 0 (score: 4)

Not sure if you're interested in having a class to deserialize into, but here's how you would do it. Feel free to skip the class and use dynamic deserialization:

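class FruitCollection
{
    // Must be public: Json.NET only populates public members by default.
    public string[] Fruits { get; set; }
}

// The JSON root is an array of objects, so deserialize into a list.
var fruitColls = JsonConvert.DeserializeObject<List<FruitCollection>>(json);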
var mostCommon = fruitColls
    .SelectMany(fc => fc.Fruits)
    .GroupBy(f => f)
    .OrderByDescending(g => g.Count())
    .First()
    .Key;

EDIT:

This question's pretty old, but I'll mention that the OrderByDescending-then-First combination does redundant work: you don't need to sort to get the maximum. It's an age-old lazy hack that people keep using because LINQ did not provide a MaxBy extension method at the time (Enumerable.MaxBy was later added in .NET 6).

Usually your input size is small enough and the other stuff adds enough overhead that you don't really care, but the "correct" way (e.g. if you had billions of fruit types) would be to use a proper MaxBy extension method or hack something out of Aggregate. Finding the max is worst-case linear, whereas sorting is worst case O(n log(n)).
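As a rough sketch of the "hack something out of Aggregate" idea above: the groups are scanned once and the largest is kept, with no sort. The names FruitStats and MostCommon are hypothetical, chosen only for illustration, and the input is assumed to be an already flattened sequence of fruit names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FruitStats
{
    // Hypothetical helper (not part of any library): returns the most
    // frequent value. Throws InvalidOperationException on empty input,
    // because Aggregate without a seed requires at least one element.
    public static string MostCommon(IEnumerable<string> values)
    {
        return values
            .GroupBy(v => v)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            // Linear scan: keep whichever pair has the larger count.
            // On a tie the earlier group (first value seen) is kept.
            .Aggregate((best, next) => next.Value > best.Value ? next : best)
            .Key;
    }
}
```

With the flattened sample data from the question (banana, apple, orange, apple), MostCommon returns "apple".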

Answer 1 (score: 1)

If you use Json.NET, you can load your JSON using LINQ to JSON, use SelectTokens to recursively find all "fruits" properties, recursively collect all descendant string values (those of type JValue), group them by their string value, and put the groups in descending order by count:

        var token = JToken.Parse(jsonString);

        var fruits = token.SelectTokens("..fruits")  // Recursively find all "fruits" properties
            .SelectMany(f => f.DescendantsAndSelf()) // Recursively find all string literals underneath each
            .OfType<JValue>()
            .GroupBy(f => (string)f)                 // Group by string value
            .OrderByDescending(g => g.Count())       // Descending order by count.
            .ToList();

Or, if you prefer to put your results into an anonymous type for clarity:

        var fruits = token.SelectTokens("..fruits")  // Recursively find all "fruits" properties
            .SelectMany(f => f.DescendantsAndSelf()) // Recursively find all string literals underneath each
            .OfType<JValue>()
            .GroupBy(f => (string)f)                 // Group by string value
            .Select(g => new { Fruit = (string)g.Key, Count = g.Count() } )
            .OrderByDescending(f => f.Count)         // Descending order by count.
            .ToList();

Then afterwards:

        Console.WriteLine(JsonConvert.SerializeObject(fruits, Formatting.Indented));

Produces:

[
  {
    "Fruit": "apple",
    "Count": 2
  },
  {
    "Fruit": "banana",
    "Count": 1
  },
  {
    "Fruit": "orange",
    "Count": 1
  }
]

** Update **

Forgot to include the following extension method

public static class JsonExtensions
{
    public static IEnumerable<JToken> DescendantsAndSelf(this JToken node)
    {
        if (node == null)
            return Enumerable.Empty<JToken>();
        var container = node as JContainer;
        if (container != null)
            return container.DescendantsAndSelf();
        else
            return new [] { node };
    }
}

The original question was a little vague on the precise structure of the JSON which is why I suggested using Linq rather than deserialization.
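For readers who want to handle the "single fruit or collection" ambiguity without Json.NET, here is a minimal sketch using System.Text.Json instead. This is a swapped-in library, not what the answer above uses. FruitReader and ReadFruits are hypothetical names, and the sketch assumes the root of the document is a JSON array.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class FruitReader
{
    // Hypothetical helper: collects every fruit name, whether "fruits"
    // holds a single string or an array of strings.
    // Assumption: the JSON root is an array (EnumerateArray throws otherwise).
    public static List<string> ReadFruits(string json)
    {
        var result = new List<string>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            foreach (JsonElement item in doc.RootElement.EnumerateArray())
            {
                // Skip entries that are not objects or lack a "fruits" key.
                if (item.ValueKind != JsonValueKind.Object) continue;
                if (!item.TryGetProperty("fruits", out JsonElement fruits)) continue;

                if (fruits.ValueKind == JsonValueKind.String)
                {
                    result.Add(fruits.GetString());
                }
                else if (fruits.ValueKind == JsonValueKind.Array)
                {
                    // Keep only string elements; anything else is ignored.
                    result.AddRange(fruits.EnumerateArray()
                        .Where(e => e.ValueKind == JsonValueKind.String)
                        .Select(e => e.GetString()));
                }
            }
        }
        return result;
    }
}
```

The flat list it returns can then be grouped and counted with the same GroupBy pipeline shown earlier in this answer.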

Answer 2 (score: 0)

The serialization class for this structure is simple:

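public class RootObject
{
    // Each array element has a "fruits" property holding a list of strings.
    public List<string> fruits { get; set; }
}

So to deserialize (the JSON root is an array, so the target is a list of RootObject):

var fruitListContainer = JsonConvert.DeserializeObject<List<RootObject>>(jsonString);

Then you can put all fruits in one list:

List<string> fruits = fruitListContainer.SelectMany(f => f.fruits).ToList();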

Now you have all fruits in one list, and you can do whatever you want. For sorting, see the other answers.
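As one possible next step once the fruits are in a single list, a frequency table can be built with GroupBy and ToDictionary. This is only a sketch; FruitCounter and CountOccurrences are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FruitCounter
{
    // Hypothetical helper: maps each distinct fruit name to the number
    // of times it appears in the input sequence.
    public static Dictionary<string, int> CountOccurrences(IEnumerable<string> fruits)
    {
        return fruits
            .GroupBy(f => f)
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

For the sample data in the question this gives banana = 1, apple = 2, orange = 1.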

Answer 3 (score: 0)

Assuming the data is in a file named fruits.json, jq (http://stedolan.github.io/jq/) is on your PATH, and you are using a Mac or Linux-style shell:

$ jq 'reduce (.[].fruits[]) as $fruit ({}; .[$fruit] += 1)' fruits.json
{
  "banana": 1,
  "apple": 2,
  "orange": 1
}

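On Windows, the same thing will work if the quotes are adjusted appropriately. Alternatively, if you put the one-line jq program in a file, say fruits.jq, you can run the following command in any supported environment: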

jq -f fruits.jq fruits.json

If the data comes from some other process, you can pipe it into jq, e.g. like this:

jq -f fruits.jq

One way to find the maximum count is to add a couple of filters, e.g. as follows:

$ jq 'reduce (.[].fruits[]) as $fruit ({}; .[$fruit] += 1) |
      to_entries | max_by(.value)' fruits.json
{
  "key": "apple",
  "value": 2
}