IEnumerable<string> to Dictionary<string, int>

Time: 2012-11-21 05:50:30

Tags: c# linq c#-4.0

I am using the following code to split an array of strings into a list.

private List<string> GenerateTerms(string[] docs)
{
    return docs.SelectMany(doc => ProcessDocument(doc)).Distinct().ToList();
}

private IEnumerable<string> ProcessDocument(string doc)
{
    return doc.Split(' ')
              .GroupBy(word => word)
              .OrderByDescending(g => g.Count())
              .Select(g => g.Key)
              .Take(1000);
}

What I want to do is replace the returned list with a Dictionary<string, int>, i.e. instead of returning a list, I want to return a dictionary.

Can anyone help? Thanks in advance.

4 Answers:

Answer 0: (score: 2)

string doc = "This is a test sentence with some words with some words repeating like: is a test";
var result = doc.Split(' ')
                   .GroupBy(word => word)
                   .OrderByDescending(g=> g.Count())
                   .Take(1000)
                   .ToDictionary(r => r.Key ,r=> r.Count());

Edit

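I believe you want to build a single final dictionary from the array of strings, with each word as the key and its total count as the value. Since a dictionary cannot contain duplicate keys, you don't need Distinct. You would have to rewrite your methods as: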

private Dictionary<string,int> GenerateTerms(string[] docs)
{
    List<Dictionary<string, int>> combinedDictionaryList = new List<Dictionary<string, int>>();
    foreach (string str in docs)
    {
        //Add returned dictionaries to a list
        combinedDictionaryList.Add(ProcessDocument(str));
    }
    // Return a single dictionary from the list of dictionaries, summing the counts per word
    return combinedDictionaryList
            .SelectMany(dict=> dict)
            .ToLookup(pair => pair.Key, pair => pair.Value)
            .ToDictionary(group => group.Key, group => group.Sum(value => value));
}

private Dictionary<string,int> ProcessDocument(string doc)
{
    return doc.Split(' ')
            .GroupBy(word => word)
            .OrderByDescending(g => g.Count())
            .Take(1000)
            .ToDictionary(r => r.Key, r => r.Count());
}

Then you can call it like this:

string[] docs = new[] 
    {
        "This is a test sentence with some words with some words repeating like: is a test",
        "This is a test sentence with some words with some words repeating like: is a test",
        "This is a test sentence with some words",
        "This is a test sentence with some words",
    };

Dictionary<string, int> finalDictionary = GenerateTerms(docs);

Answer 1: (score: 1)

Try this:

string[] docs = {"aaa bbb", "aaa ccc", "sss, ccc"};        

var result = docs.SelectMany(doc => doc.Split())
                 .GroupBy(word => word)
                 .OrderByDescending(g => g.Count())
                 .Take(1000) // Take must come before ToDictionary, otherwise the result is not a Dictionary
                 .ToDictionary(g => g.Key, g => g.Count());

Edit

var result = docs.SelectMany(
        doc => doc.Split()
            .GroupBy(word => word)
            .OrderByDescending(g => g.Count())
            .Take(1000))
    .Select(g => new {Word = g.Key, Cnt = g.Count()})
    .GroupBy(t => t.Word)
    .ToDictionary(g => g.Key, g => g.Sum(t => t.Cnt));

Answer 2: (score: 0)

Without any of the extra frills, the following should work.

return doc.Split(' ')
          .GroupBy(word => word)
          .ToDictionary(g => g.Key, g => g.Count());

Tailor it with Take, OrderBy, etc. as your particular situation requires.
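As a rough sketch of that customization, the version below keeps only the most frequent words. The class name `WordCounts`, the method name `TopWords`, and the `limit` parameter are made up for illustration. `StringSplitOptions.RemoveEmptyEntries` is an addition that is not in the code above; it is there so that repeated spaces do not produce an empty-string key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordCounts
{
    // Hypothetical helper: counts the words in one document and keeps
    // only the `limit` most frequent ones.
    public static Dictionary<string, int> TopWords(string doc, int limit)
    {
        return doc.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                  .GroupBy(word => word)
                  .OrderByDescending(g => g.Count())
                  .Take(limit) // trim before building the dictionary
                  .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

Note that a `Dictionary<string, int>` does not guarantee any enumeration order, so the `OrderByDescending` call only decides which words survive the `Take`; it does not make the resulting dictionary sorted.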

Answer 3: (score: 0)

Try something like this:

    var keys = new List<string>();
    var values = new List<string>();
    var dictionary = keys.ToDictionary(x => x, x => values[keys.IndexOf(x)]);
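
If the keys and values really do live in two parallel lists, a variant using `Enumerable.Zip` (available since .NET 4.0) pairs them by position without calling `IndexOf` for every key. This is only a minimal sketch: the class name `ZipExample` and the method name `FromParallelLists` are hypothetical, and the values are assumed to be `int` so that the result matches the `Dictionary<string, int>` asked for in the question.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ZipExample
{
    // Hypothetical helper: pairs keys[i] with values[i] by position.
    public static Dictionary<string, int> FromParallelLists(List<string> keys, List<int> values)
    {
        return keys.Zip(values, (k, v) => new KeyValuePair<string, int>(k, v))
                   .ToDictionary(pair => pair.Key, pair => pair.Value);
    }
}
```

`Zip` stops at the end of the shorter list, and `ToDictionary` throws an `ArgumentException` if `keys` contains duplicates, so this only fits input where the keys are already unique.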