I use the following code to split an array of strings into a list.
private List<string> GenerateTerms(string[] docs)
{
return docs.SelectMany(doc => ProcessDocument(doc)).Distinct().ToList();
}
private IEnumerable<string> ProcessDocument(string doc)
{
return doc.Split(' ')
.GroupBy(word => word)
.OrderByDescending(g => g.Count())
.Select(g => g.Key)
.Take(1000);
}
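What I want to do is replace the returned list with a Dictionary<string, int>, i.e. instead of returning a list, I want to return a dictionary.
Can anyone help? Thanks in advance.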
Answer 0 (score: 2)
string doc = "This is a test sentence with some words with some words repeating like: is a test";
var result = doc.Split(' ')
.GroupBy(word => word)
.OrderByDescending(g=> g.Count())
.Take(1000)
.ToDictionary(r => r.Key ,r=> r.Count());
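Edit
I believe you want a single final dictionary built from the string array, with each word as the key and its total count as the value. Since a dictionary cannot contain duplicate keys, you don't need Distinct.
You would have to rewrite your methods as: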
private Dictionary<string,int> GenerateTerms(string[] docs)
{
List<Dictionary<string, int>> combinedDictionaryList = new List<Dictionary<string, int>>();
foreach (string str in docs)
{
//Add returned dictionaries to a list
combinedDictionaryList.Add(ProcessDocument(str));
}
//return a single dictionary from the list of dictionaries, summing the counts per word
return combinedDictionaryList
.SelectMany(dict=> dict)
.ToLookup(pair => pair.Key, pair => pair.Value)
.ToDictionary(group => group.Key, group => group.Sum(value => value));
}
private Dictionary<string,int> ProcessDocument(string doc)
{
return doc.Split(' ')
.GroupBy(word => word)
.OrderByDescending(g => g.Count())
.Take(1000)
.ToDictionary(r => r.Key, r => r.Count());
}
Then you can call it like this:
string[] docs = new[]
{
"This is a test sentence with some words with some words repeating like: is a test",
"This is a test sentence with some words with some words repeating like: is a test",
"This is a test sentence with some words",
"This is a test sentence with some words",
};
Dictionary<string, int> finalDictionary = GenerateTerms(docs);
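As an alternative to merging a list of per-document dictionaries, the totals can also be accumulated directly into one dictionary. The following is only a minimal sketch: the class name TermMerger and the method name CountAcrossDocuments are made up for illustration, and unlike the code above it does not apply the per-document Take(1000) cap before summing.

```csharp
using System.Collections.Generic;

public static class TermMerger
{
    // Hypothetical helper (not from the original answer): counts every word
    // across all documents in a single pass, without intermediate dictionaries.
    public static Dictionary<string, int> CountAcrossDocuments(IEnumerable<string> docs)
    {
        var totals = new Dictionary<string, int>();
        foreach (string doc in docs)
        {
            foreach (string word in doc.Split(' '))
            {
                int current;
                // TryGetValue leaves current at 0 when the word has not been seen yet
                totals.TryGetValue(word, out current);
                totals[word] = current + 1;
            }
        }
        return totals;
    }
}
```

It would be called the same way, e.g. TermMerger.CountAcrossDocuments(docs).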
Answer 1 (score: 1)
Try this:
string[] docs = {"aaa bbb", "aaa ccc", "sss, ccc"};
var result = docs.SelectMany(doc => doc.Split())
.GroupBy(word => word)
.OrderByDescending(g => g.Count())
.Take(1000) // Take must come before ToDictionary, otherwise result is an IEnumerable of key/value pairs
.ToDictionary(g => g.Key, g => g.Count());
Edit
var result = docs.SelectMany(
doc => doc.Split()
.GroupBy(word => word)
.OrderByDescending(g => g.Count())
.Take(1000))
.Select(g => new {Word = g.Key, Cnt = g.Count()})
.GroupBy(t => t.Word)
.ToDictionary(g => g.Key, g => g.Sum(t => t.Cnt));
Answer 2 (score: 0)
Without any extra frills, the following should work.
return doc.Split(' ')
.GroupBy(word => word)
.ToDictionary(g => g.Key, g => g.Count());
Customize it with Take, OrderBy, etc. depending on your specific situation.
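For example, one possible customization is sketched below, assuming you want only the most frequent words and want to ignore the empty entries produced by repeated spaces. The names TermCounter, CountTopWords and the top parameter are hypothetical, not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TermCounter
{
    // Hypothetical helper: counts the words of one document and keeps
    // only the `top` most frequent ones.
    public static Dictionary<string, int> CountTopWords(string doc, int top)
    {
        return doc.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                  .GroupBy(word => word)
                  .OrderByDescending(g => g.Count())
                  .Take(top)                 // cut off before building the dictionary
                  .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

Note that a Dictionary does not promise any enumeration order, so the ordering here only decides which words survive the Take, not the order in which they come back out.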
Answer 3 (score: 0)
Try something like this:
var keys = new List<string>();
var values = new List<string>();
// Pairs keys[i] with values[i]; keys must be unique and values must be at least as long as keys
var dictionary = keys.ToDictionary(x => x, x => values[keys.IndexOf(x)]);