Find the most frequent words using LINQ

Time: 2016-05-16 12:52:18

Tags: c# linq

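I have been trying to find the most frequently occurring words in a list of strings. I tried something like the approach in "Find the most occurring number in a List<int>", but the problem is that it returns only one word, whereas I need every word that is tied for the highest frequency.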

For example, if we run that LINQ query on the following list:

Dubai
Karachi
Lahore
Madrid
Dubai
Sydney
Sharjah
Lahore
Cairo

it should give us:

ans: Dubai, Lahore

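5 answers:

Answer 0: (score: 3)

Group the words, then order the groups by count. Note that this returns every word that occurs more than once, most frequent first: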

var result = list
  .GroupBy(s => s)
  .Where(g=>g.Count()>1)
  .OrderByDescending(g => g.Count())
  .Select(g => g.Key);
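
To make the behavior of the query above concrete, here is a minimal sketch that wraps it in a helper and runs it on the question's sample list. The class name `RepeatedWordsDemo`, the method name `RepeatedWords`, and the extra "Dubai" in the second input are assumptions made purely for illustration; they are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RepeatedWordsDemo
{
    // Hypothetical helper: the same query as in the answer, materialized into a list.
    public static List<string> RepeatedWords(IEnumerable<string> words) =>
        words.GroupBy(s => s)                      // one group per distinct word
             .Where(g => g.Count() > 1)            // keep words seen more than once
             .OrderByDescending(g => g.Count())    // most frequent first
             .Select(g => g.Key)
             .ToList();

    public static void Main()
    {
        var sample = new[]
        {
            "Dubai", "Karachi", "Lahore", "Madrid", "Dubai",
            "Sydney", "Sharjah", "Lahore", "Cairo"
        };

        // Dubai and Lahore both occur twice, so both are printed.
        Console.WriteLine(string.Join(", ", RepeatedWords(sample)));

        // With a third "Dubai" (an assumed extra entry), Lahore is still returned
        // even though it is no longer tied for the top count.
        var skewed = sample.Concat(new[] { "Dubai" });
        Console.WriteLine(string.Join(", ", RepeatedWords(skewed)));
    }
}
```

On the sample list this happens to match the expected answer, but only because no word occurs more than twice. If strictly the top-ranked ties are needed, the result has to be filtered by the maximum count as well.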

Answer 1: (score: 2)

If you need all the words that occur more than once:

  List<string> list = new List<string>();
  list.Add("A");
  list.Add("A");
  list.Add("B");
  var most = (from i in list
              group i by i into grp
              orderby grp.Count() descending
              select new { grp.Key, Cnt = grp.Count() }).Where(r => r.Cnt > 1);

Answer 2: (score: 1)

If you want to get all of the words that share the highest frequency, you can use the following method:

public List<string> GetMostFrequentWords(List<string> list)
{
    var groups = list.GroupBy(x => x).Select(x => new { word = x.Key, Count = x.Count() }).OrderByDescending(x => x.Count);
    if (!groups.Any()) return new List<string>();

    var maxCount = groups.First().Count;

    return groups.Where(x => x.Count == maxCount).Select(x => x.word).OrderBy(x => x).ToList();
}

[TestMethod]
public void Test()
{
    var list = @"Dubai,Karachi,Lahore,Madrid,Dubai,Sydney,Sharjah,Lahore,Cairo".Split(',').ToList();
    var result = GetMostFrequentWords(list);

    Assert.AreEqual(2, result.Count);
    Assert.AreEqual("Dubai", result[0]);
    Assert.AreEqual("Lahore", result[1]);
}

Answer 3: (score: 1)

If you only want Dubai, Lahore (i.e. only the words with the highest number of occurrences, which is 2 in the sample):

  List<String> list = new List<String>() {
   "Dubai", "Karachi", "Lahore", "Madrid", "Dubai", "Sydney", "Sharjah", "Lahore", "Cairo"
   };

  int count = -1;

  var result = list
    .GroupBy(s => s, s => 1)
    .Select(chunk => new {
      name = chunk.Key,
      count = chunk.Count()
     })
    .OrderByDescending(item => item.count)
    .ThenBy(item => item.name)
    .Where(item => {
      if (count < 0) {
        count = item.count; // side effect, alas (we don't know the max count a priori)

        return true;
      }
      else
        return item.count == count;
    })
    .Select(item => item.name);

Test:

  // ans: Dubai, Lahore
  Console.Write("ans: " + String.Join(", ", result));

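Answer 4: (score: 1)

I'm sure there must be a better way, but one thing I managed to do (which may help you arrive at something more optimized) is the following: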

public sealed partial class Page2 : Page
{

    private string x="";

    public Page2()
    {
        this.InitializeComponent();
    }



    protected override void OnNavigatedTo(NavigationEventArgs e)
    {
        x = e.Parameter as string;
        textBlock1.Text = x;
    }

    private void button_Click(object sender, RoutedEventArgs e)
    {
        this.Frame.Navigate(typeof(MainPage));
    }
}

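This will list the entries that occur most frequently; if two entries occur with the same frequency, both will be included.

Note that we are not selecting entries based on whether they occur more than once.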
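
One way to get the behavior described here (all entries tied for the highest frequency) is sketched below. This is an illustration under assumptions, not the answerer's own code: the names `TopWordsSketch` and `MostFrequent` are hypothetical, and the second-level grouping by count is just one possible technique.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopWordsSketch
{
    // Hypothetical helper: returns every word tied for the highest count.
    public static List<string> MostFrequent(IEnumerable<string> words) =>
        words.GroupBy(w => w)                           // word -> its occurrences
             .GroupBy(g => g.Count(), g => g.Key)       // count -> words with that count
             .OrderByDescending(byCount => byCount.Key) // highest count first
             .Take(1)                                   // keep only the top count (if any)
             .SelectMany(byCount => byCount)            // flatten back to words
             .ToList();

    public static void Main()
    {
        var list = new List<string>
        {
            "Dubai", "Karachi", "Lahore", "Madrid", "Dubai",
            "Sydney", "Sharjah", "Lahore", "Cairo"
        };

        Console.WriteLine("ans: " + string.Join(", ", MostFrequent(list)));
    }
}
```

Because the filter is on the maximum count rather than on "more than once", a list in which every word is unique returns all of its words, since they all tie at one occurrence. An empty list yields an empty result.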