Find whether a list of strings contains permutations of words from another string (a counter for each combination)

Time: 2019-02-05 15:58:37

Tags: c# .net string

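I don't know how to phrase this question better, so I will do my best to explain.

Let's say I have a list of 20 strings, myList1<string>, and I have another string, string ToCompare. Each string in the list, as well as string ToCompare, has 8 words separated by spaces. I want to know how many times a combination of any three words from the strings in myList1<string>, in any possible order, can be found in string ToCompare. For example:

This is the list (short version, as an example):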

string1 = "AA BB CC DD EE FF GG HH";
string2 = "BB DD EE AA HH II JJ MM";
.......
string20 = "NN OO AA RR EE BB FF KK";

string ToCompare = "BB GG AA FF CC MM RR II";

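Now I want to know how many times any combination of 3 words from the strings in myList1<string> is found in ToCompare. To explain further: the three words "BB AA CC" from ToCompare are found in string1 of the list, so the counter for these 3 words would be 1. The words "BB AA II" from ToCompare are also found in string2 of myList1<string>, but the counter here would also be 1, because it is not the same combination of words (I have "AA" and "BB", but also "II"; they are not equal). The order of the 3 words doesn't matter, which means that "AA BB CC" = "BB AA CC" = "CC BB AA". I want to know how many combinations of all (any) 3 words from ToCompare are found in myList1<string>. I hope it is clear what I mean.

Any help would be greatly appreciated; I don't have a clue how to solve this. Thanks.

Example with Vanest's code:

[source list not preserved]

string ToCompare = "2 4 6 15 20 22 28 44";

The rest of the code is exactly the same. Result:

[output not preserved]

As you can see, some of the reported combinations do not exist in the strings, and the first combination has a value of 2 even though it appears only once, in the first string.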

2 Answers:

Answer 0 (score: 1)

I think this will do what you are asking for:

void Main()
{
    var list = 
        new List<String> 
        {
            "AA BB CC DD EE FF GG HH",
            "BB DD EE AA HH II JJ MM",
            "NN OO AA RR EE BB FF KK"
        };

    var toCompare = "BB GG AA FF CC MM RR II";

    var permutations = CountPermutations(list, toCompare);
}

public Int32 CountPermutations(List<String> list, String compare)
{
    var words = compare.Split(' ');

    return list
        .Select(l => l.Split(' '))
        .Select(l => new { String = String.Join(" ", l), Count = l.Join(words, li => li, wi => wi, (li, wi) => li).Count()})
        .Sum(x => x.Count - 3);
}
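One possible reading of the question is that each list item should contribute the number of unordered 3-word combinations it shares with the comparison string. Under that assumption, if an item shares k distinct words with the comparison string, it contributes C(k, 3) combinations. The sketch below illustrates that reading only; `ComboCount`, `Choose3`, and `PerItemCombos` are hypothetical names that do not appear in the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ComboCount
{
    // Number of ways to choose 3 items out of k (0 when k < 3).
    public static long Choose3(int k)
    {
        return k < 3 ? 0 : (long)k * (k - 1) * (k - 2) / 6;
    }

    // For each list item, count the distinct words it shares with `compare`
    // and convert that count into a number of unordered 3-word combinations.
    public static long[] PerItemCombos(IEnumerable<string> list, string compare)
    {
        var separators = new[] { ' ' };
        var compareWords = new HashSet<string>(
            compare.Split(separators, StringSplitOptions.RemoveEmptyEntries));

        return list
            .Select(item => item
                .Split(separators, StringSplitOptions.RemoveEmptyEntries)
                .Distinct()
                .Count(w => compareWords.Contains(w)))
            .Select(k => Choose3(k))
            .ToArray();
    }

    public static void Main()
    {
        var list = new List<string>
        {
            "AA BB CC DD EE FF GG HH",
            "BB DD EE AA HH II JJ MM",
            "NN OO AA RR EE BB FF KK"
        };

        long[] perItem = PerItemCombos(list, "BB GG AA FF CC MM RR II");
        Console.WriteLine(string.Join(", ", perItem));
    }
}
```

For the three sample strings, the shared word counts are 5, 4, and 4, so this sketch reports 10, 4, and 4 combinations respectively. It counts per list item and does not merge identical combinations found in different items.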

[Edit: 2/20/2019]

You can use the following to get all the matches for each list item, along with the total number of unique combinations:

void Main()
{
    var list =
        new List<String>
        {
            "AA BB CC DD EE FF GG HH",
            "BB DD EE AA HH II JJ MM",
            "NN OO AA RR EE BB FF KK",
            "AA AA CC DD EE FF GG HH"
        };

    list.Select((l, i) => new { Index = i, Item = l }).ToList().ForEach(x => Console.WriteLine($"List Item{x.Index + 1}: {x.Item}"));

    var toCompare = "BB GG AA FF CC MM RR II";

    Console.WriteLine($"To Compare: {toCompare}");

    Func<Int32, Int32> Factorial = x => x < 0 ? -1 : x == 0 || x == 1 ? 1 : Enumerable.Range(1, x).Aggregate((c, v) => c * v);

    var words = toCompare.Split(' ');

    var matches = list
        // Get a list of the list items with all their parts
        .Select(l => new { Parts = l.Split(' '), Original = l })
        // Join each part from the to-compare item to each part of the list item
        .Select(l => new { String = String.Join(" ", l), Matches = l.Parts.Join(words, li => li, wi => wi, (li, wi) => li), l.Original })
        // Only consider items with at least 3 matches
        .Where(l => l.Matches.Count() >= 3)
        // Get each item, including how many parts matched and how often each distinct part occurs
        .Select(l => new { l.Original, Matches = String.Join(" ", l.Matches), Count = l.Matches.Count(), Groups = l.Matches.GroupBy(m => m).Select(m => m.Count()) })
        // To calculate the unique combinations for each match, use: match_count! / (frequency_part_1! * frequency_part_2! * ... * frequency_part_n!)
        // The seed of 1 ensures Factorial is applied to every frequency, including the first one
        .Select(l => new { l.Original, l.Matches, Combinations = Factorial(l.Count) / l.Groups.Aggregate(1, (c, v) => c * Factorial(v)) })
        .ToList();

    matches.ForEach(m => Console.WriteLine($"Original: {m.Original}, Matches: {m.Matches}, Combinations: {m.Combinations}"));

    var totalUniqueCombinations = matches.Sum(x => x.Combinations);

    Console.WriteLine($"Total Unique Combinations: {totalUniqueCombinations}");

}

Answer 1 (score: 1)

I think this should be enough for what you are asking:

List<string> source = new List<string>();
source.Add("AA BB CC DD EE FF GG HH");
source.Add("BB DD EE AA HH II JJ MM");
source.Add("NN OO AA RR EE BB FF KK");

string ToCompare = "BB GG AA FF CC MM RR II";

string word1, word2, word3, existingKey;
string[] compareList = ToCompare.Split(new string[] { " " }, StringSplitOptions.None);
Dictionary<string, int> ResultDictionary = new Dictionary<string, int>();
for (int i = 0; i < compareList.Length - 2; i++)
{
    word1 = compareList[i];
    for (int j = i + 1; j < compareList.Length - 1; j++)
    {
        word2 = compareList[j];
        for (int z = j + 1; z < compareList.Length; z++)
        {
            word3 = compareList[z];
            source.ForEach(x =>
            {
                if (x.Contains(word1) && x.Contains(word2) && x.Contains(word3))
                {
                    existingKey = ResultDictionary.Keys.FirstOrDefault(y => y.Contains(word1) && y.Contains(word2) && y.Contains(word3));
                    if (string.IsNullOrEmpty(existingKey))
                    {
                        ResultDictionary.Add(word1 + " " + word2 + " " + word3, 1);
                    }
                    else
                    {
                        ResultDictionary[existingKey]++;
                    }
                }
            });
        }
    }
}

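ResultDictionary will contain the 3-word combinations that occur in myList1<string>, along with their number of occurrences. To get the total number of occurrences, retrieve all the value fields from ResultDictionary and add them up.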
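Adding up the value fields could look like the minimal sketch below. `ResultTotals` and `Total` are hypothetical names, and the dictionary entries are an illustrative excerpt, not the full output of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ResultTotals
{
    // Adds up the per-combination counters to get the overall total.
    public static int Total(IDictionary<string, int> resultDictionary)
    {
        return resultDictionary.Values.Sum();
    }

    public static void Main()
    {
        // Illustrative excerpt only; a real run would contain more entries.
        var resultDictionary = new Dictionary<string, int>
        {
            { "BB GG AA", 1 },
            { "BB AA FF", 2 },
            { "BB AA II", 1 }
        };

        Console.WriteLine("Total: " + Total(resultDictionary));
    }
}
```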

Edit:

The snippet below produces the correct result for the given input:

List<string> source = new List<string>();
source.Add("2 4 6 8 10 12 14 99");
source.Add("16 18 20 22 24 26 28 102");
source.Add("33 6 97 38 50 34 87 88");

string ToCompare = "2 4 6 15 20 22 28 44";

string word1, word2, word3, existingKey;
string[] compareList = ToCompare.Split(new string[] { " " }, StringSplitOptions.None);
string[] sourceList, keywordList;
Dictionary<string, int> ResultDictionary = new Dictionary<string, int>();
source.ForEach(x =>
{
    sourceList = x.Split(new string[] { " " }, StringSplitOptions.RemoveEmptyEntries);
    for (int i = 0; i < compareList.Length - 2; i++)
    {
        word1 = compareList[i];
        for (int j = i + 1; j < compareList.Length - 1; j++)
        {
            word2 = compareList[j];
            for (int z = j + 1; z < compareList.Length; z++)
            {
                word3 = compareList[z];
                if (sourceList.Contains(word1) && sourceList.Contains(word2) && sourceList.Contains(word3))
                {
                    existingKey = ResultDictionary.Keys.FirstOrDefault(y =>
                                  {
                                      keywordList = y.Split(new string[] { " " }, StringSplitOptions.None);
                                      return keywordList.Contains(word1) && keywordList.Contains(word2) && keywordList.Contains(word3);
                                  });
                    if (string.IsNullOrEmpty(existingKey))
                    {
                        ResultDictionary.Add(word1 + " " + word2 + " " + word3, 1);
                    }
                    else
                    {
                        ResultDictionary[existingKey]++;
                    }
                }
            }
        }
    }
});
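The key lookup above scans every existing dictionary key for each match. One alternative, offered only as a sketch, is to sort the three words before joining them, so that any ordering of the same words maps to the same key and can be looked up directly. `TripleCounter` and `CountTriples` are hypothetical names. The sketch compares whole words through a `HashSet<string>` and assumes words are separated by single spaces, as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TripleCounter
{
    // Returns, for every unordered 3-word combination of `toCompare`,
    // the number of source strings that contain all three words.
    public static Dictionary<string, int> CountTriples(IEnumerable<string> source, string toCompare)
    {
        var separators = new[] { ' ' };
        string[] compareWords = toCompare
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .Distinct()
            .ToArray();
        var result = new Dictionary<string, int>();

        foreach (string line in source)
        {
            var lineWords = new HashSet<string>(
                line.Split(separators, StringSplitOptions.RemoveEmptyEntries));

            // Only words present in both strings can form a matching triple.
            // Sorting gives each triple one canonical key regardless of order.
            string[] shared = compareWords
                .Where(w => lineWords.Contains(w))
                .OrderBy(w => w, StringComparer.Ordinal)
                .ToArray();

            for (int i = 0; i < shared.Length - 2; i++)
            {
                for (int j = i + 1; j < shared.Length - 1; j++)
                {
                    for (int k = j + 1; k < shared.Length; k++)
                    {
                        string key = shared[i] + " " + shared[j] + " " + shared[k];
                        int count;
                        result.TryGetValue(key, out count); // count stays 0 if the key is new
                        result[key] = count + 1;
                    }
                }
            }
        }

        return result;
    }

    public static void Main()
    {
        var source = new List<string>
        {
            "2 4 6 8 10 12 14 99",
            "16 18 20 22 24 26 28 102",
            "33 6 97 38 50 34 87 88"
        };

        foreach (var pair in CountTriples(source, "2 4 6 15 20 22 28 44"))
        {
            Console.WriteLine(pair.Key + " => " + pair.Value);
        }
    }
}
```

With the numeric input from the snippet above, this sketch yields two entries, "2 4 6" and "20 22 28", each with a count of 1. Filtering to the shared words first also means the triple loops only run over words that can actually match.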

Hope this helps...