I have a list of strings, and I want to extract the word that is repeated most often in the list.
For example:
List<string> mylist=new List<string>();
mylist.Add("book is good ");
mylist.Add("i like flowers ");
mylist.Add("i reading book");
I want to extract "book" rather than "i".
@user3185569 replied with the following code:
List<string> mylist = new List<string>();
mylist.Add("book is good ");
mylist.Add("i like flowers ");
mylist.Add("i reading book");
var mostRepeatedWord = mylist
    .SelectMany(x => x.Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries))
    .GroupBy(x => x)
    .OrderByDescending(x => x.Count())
    .Select(x => x.Key)
    .FirstOrDefault();
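However, this code can return common words such as "in".
I want to extract five meaningful words from my list. To work around the problem, I added an XML dictionary to my project that contains words of that kind, and I fill a list from this dictionary as follows: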
static List<string> notWord = new List<string>();

public static void fillList()
{
    XmlDocument doc = new XmlDocument();
    doc.Load(@"XMLDic.xml");
    foreach (XmlNode node in doc.DocumentElement.ChildNodes)
    {
        notWord.Add(node.InnerText); // or loop through its children as well
    }
}
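First, I remove these words from the list. Then, in a loop, I extract mostRepeatedWord and save it in a new list. I then remove mostRepeatedWord from the list, and this process is repeated five times.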
public static List<string> finde(List<string> list)
{
    List<string> newlist = new List<string>();
    fillList();
    delStr(list, "", true);
    for (int i = 0; i < 6; i++)
    {
        var mostRepeatedWord = list
            .SelectMany(x => x.Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries))
            .GroupBy(x => x)
            .OrderByDescending(x => x.Count())
            .Select(x => x.Key)
            .FirstOrDefault();
        if (mostRepeatedWord != "")
            newlist.Add(mostRepeatedWord);
        delStr(list, mostRepeatedWord, false);
    }
    return newlist;
}
The method that removes words from the list is:
public static List<string> delStr(List<string> list, string str, bool t)
{
    if (t)
    {
        string s;
        for (int i = 0; i < list.Count; i++)
        {
            s = list[i];
            foreach (var i1 in notWord)
            {
                s = s.Replace(i1, "");
            }
            list[i] = s;
        }
    }
    else
    {
        string s;
        for (int i = 0; i < list.Count; i++)
        {
            s = list[i];
            s = s.Replace(str, "");
            list[i] = s;
        }
    }
    return list;
}
I would like to know whether this is correct, or whether there is a better way.
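One point that may be worth checking in delStr: string.Replace matches substrings rather than whole words, so removing "i" also strips the "i" inside "like". The sketch below contrasts that with a word-level removal. ReplaceDemo and RemoveWord are hypothetical names used only for illustration, and the case-insensitive comparison is an assumption about the desired behaviour.

```csharp
using System;
using System.Linq;

public static class ReplaceDemo
{
    // Hypothetical helper (not from the question): removes whole words only.
    public static string RemoveWord(string sentence, string word)
    {
        var kept = sentence
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(w => !string.Equals(w, word, StringComparison.OrdinalIgnoreCase));
        return string.Join(" ", kept);
    }

    public static void Main()
    {
        // Substring replacement also removes the "i" inside "like".
        Console.WriteLine("i like flowers".Replace("i", ""));  // prints " lke flowers"

        // Word-level removal leaves "like" intact.
        Console.WriteLine(RemoveWord("i like flowers", "i"));  // prints "like flowers"
    }
}
```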
Answer 0 (score: 3)
Using LINQ:
List<string> mylist = new List<string>();
mylist.Add("book is good ");
mylist.Add("i like flowers ");
mylist.Add("i reading book");
var mostRepeatedWord = mylist
    .SelectMany(x => x.Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries))
    .GroupBy(x => x)
    .OrderByDescending(x => x.Count())
    .Select(x => x.Key)
    .FirstOrDefault();
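Split on spaces: use String.Split.
Flatten the result into a single list of words: use SelectMany.
Group identical words: use GroupBy.
Order the groups by descending count: use OrderByDescending and Count.
Take the first word: use FirstOrDefault.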
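For the follow-up in the question (five meaningful words rather than one), the same pipeline can probably be extended with a Where filter and Take(5) instead of looping and deleting words from the strings. This is only a sketch: WordCounter, TopWords and the inline stop-word array are hypothetical stand-ins, and the array takes the place of the words the question loads from its XML dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordCounter
{
    // Hypothetical helper: returns up to `count` of the most frequent words
    // that do not appear in `stopWords`.
    public static List<string> TopWords(
        IEnumerable<string> lines, IEnumerable<string> stopWords, int count)
    {
        // A case-insensitive set allows fast whole-word lookups.
        var excluded = new HashSet<string>(stopWords, StringComparer.OrdinalIgnoreCase);

        return lines
            .SelectMany(x => x.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            .Where(w => !excluded.Contains(w))                  // drop stop words
            .GroupBy(w => w, StringComparer.OrdinalIgnoreCase)  // count each word
            .OrderByDescending(g => g.Count())
            .Select(g => g.Key)
            .Take(count)
            .ToList();
    }

    public static void Main()
    {
        var mylist = new List<string> { "book is good ", "i like flowers ", "i reading book" };

        // Assumed stop words for illustration, not the contents of the asker's dictionary.
        var stopWords = new[] { "i", "is" };

        var top = TopWords(mylist, stopWords, 5);

        // "book" comes first; the remaining words each occur once.
        Console.WriteLine(string.Join(", ", top));
    }
}
```

Because the words are filtered as whole tokens, nothing is removed from inside other words, and the source list is left unmodified.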