Remove all similar elements from an array if their count is less than 'n'

Time: 2016-04-21 02:45:59

Tags: c# arrays

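I have an array containing several thousand elements, many of which are duplicates of other elements. What I need is a way to count the occurrences of the element 'foo' in the array and, if that count is less than 'n', remove every 'foo' element from the array.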

An example of what I need:

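// Pseudocode: the array contents and the counting step are placeholders
string[] words = new string[] { /* thousands of elements */ };
int n = 8;
int k = occurrences of "foo" in words;
if (k < n) {
    // Remove all occurrences of "foo" from the array
}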

If the array 'words' initially contains

{"foo","foo","foo","foo","foo","foo","foo","bar","bar","bar","bar","bar","bar","bar","bar","bar"}

the result would be the array below, because "foo" was found only 7 times while "bar" was found 9 times:

{"bar","bar","bar","bar","bar","bar","bar","bar","bar"}

Any help is appreciated.

2 answers:

Answer 0: (score: 3)

You can use LINQ's GroupBy and Count to achieve this:

string[] words = new string[] { "foo", "foo", "foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar", "bar", "bar", "bar", "bar", "bar" };
int n = 8;
var groups = words.GroupBy(x => x).Where(g => g.Count() >= n);

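What you are doing here is grouping the elements by value (a "foo" group and a "bar" group), then counting each group and keeping only the groups whose element count is at least a given threshold (n = 8 in your case).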

To get an array back, you can use SelectMany:

string[] filteredWords = words.GroupBy(x => x).Where(g => g.Count() >= n)
    .SelectMany(g => g).ToArray();
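
As a rough illustration of the grouping step described above, the following sketch prints each group's count and then applies the same GroupBy / Where / SelectMany pipeline. The class name `GroupCountDemo`, the helper `FilterByCount`, and the small sample array with n = 3 are assumptions made for this example and are not part of the original answer.

```csharp
using System;
using System.Linq;

public static class GroupCountDemo
{
    // Hypothetical helper: keeps only the values that occur at least n times.
    public static string[] FilterByCount(string[] words, int n)
    {
        return words
            .GroupBy(x => x)             // one group per distinct value
            .Where(g => g.Count() >= n)  // drop groups below the threshold
            .SelectMany(g => g)          // flatten the remaining groups
            .ToArray();
    }

    public static void Main()
    {
        // Small sample chosen for illustration only.
        string[] words = { "foo", "bar", "foo", "bar", "bar" };
        int n = 3;

        // Show what the grouping step produces: each distinct value and its count.
        foreach (var g in words.GroupBy(x => x))
            Console.WriteLine($"{g.Key}: {g.Count()}");   // foo: 2, then bar: 3

        Console.WriteLine(string.Join(", ", FilterByCount(words, n)));   // bar, bar, bar
    }
}
```

Note that the flattened result comes out group by group, so when the surviving values are interleaved in the input, the output order can differ from the input order.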

Answer 1: (score: 1)

This preserves the original order of the elements.

var words = new[]
{
    "foo", "foo", "foo", "foo", "foo",
    "foo", "foo", "bar", "bar", "bar",
    "bar", "bar", "bar", "bar", "bar",
    "bar"
};

var keepers = new HashSet<string>(
    words.ToLookup(x => x).Where(x => x.Skip(7).Any()).Select(x => x.Key));

words = words.Where(w => keepers.Contains(w)).ToArray();
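
The `Skip(7).Any()` test above is true exactly when a group has at least 8 elements, and it can stop enumerating as soon as an eighth element is found. A minimal sketch of the same idea with the threshold as a parameter might look like this; the names `OrderPreservingDemo` and `KeepFrequent`, and the sample data, are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderPreservingDemo
{
    // Hypothetical helper: keeps values occurring at least n times,
    // leaving the surviving elements in their original positions.
    public static string[] KeepFrequent(string[] words, int n)
    {
        // Skip(n - 1).Any() is true when the group has at least n elements.
        var keepers = new HashSet<string>(
            words.ToLookup(w => w)
                 .Where(g => g.Skip(n - 1).Any())
                 .Select(g => g.Key));

        // Filtering the original sequence preserves its order.
        return words.Where(w => keepers.Contains(w)).ToArray();
    }

    public static void Main()
    {
        // Interleaved sample: "a" and "b" each occur twice, "x" once.
        var words = new[] { "a", "x", "b", "a", "b" };
        Console.WriteLine(string.Join(",", KeepFrequent(words, 2)));   // a,b,a,b
    }
}
```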

If order is not important, then do this:

words =
    words
        .ToLookup(x => x)
        .Where(x => x.Skip(7).Any())
        .SelectMany(x => x)
        .ToArray();

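Based on your comment, "Would it be possible to extend this further and check for occurrences of part of a string?", I take it you mean that you want to count the individual frequency of each part of a "word" and, if a part meets the frequency requirement, keep the whole "word". That may not be very clear, so here is my code: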

var words = new[]
{
    "foo", "foo", "foo extrabits", "foo", "foo",
    "foo", "foo", "bar", "bar", "bar",
    "bar", "bar", "bar extrabits", "bar", "bar",
    "bar"
};

var keepers =
    new HashSet<string>(
        words
            .SelectMany(x => x.Split(' '))
            .ToLookup(x => x)
            .Where(x => x.Skip(7).Any())
            .Select(x => x.Key));

words =
    words
        .Where(x => x.Split(' ').Any(y => keepers.Contains(y)))
        .ToArray();

This produces:

bar 
bar 
bar 
bar 
bar 
bar extrabits 
bar 
bar 
bar
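
To see why "bar extrabits" survives while "foo extrabits" does not, it may help to look at the per-token counts that the code above works from. The sketch below only counts the space-separated parts; the names `TokenCountDemo` and `CountTokens` are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TokenCountDemo
{
    // Hypothetical helper: counts how often each space-separated part occurs.
    public static Dictionary<string, int> CountTokens(string[] words)
    {
        return words
            .SelectMany(x => x.Split(' '))
            .GroupBy(t => t)
            .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        var words = new[]
        {
            "foo", "foo", "foo extrabits", "foo", "foo",
            "foo", "foo", "bar", "bar", "bar",
            "bar", "bar", "bar extrabits", "bar", "bar",
            "bar"
        };

        var counts = CountTokens(words);

        // foo: 7, bar: 9, extrabits: 2 -> only "bar" reaches the threshold of 8
        foreach (var key in new[] { "foo", "bar", "extrabits" })
            Console.WriteLine($"{key}: {counts[key]}");
    }
}
```

Only "bar" reaches eight occurrences, so every element containing "bar" is kept, including "bar extrabits", while every element containing only "foo" or "extrabits" is dropped.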