How do I get all possible string combinations from a List<string[]>?

Asked: 2013-04-26 14:14:46

Tags: c# list iteration combinations

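I'm working on a tag-like string matching feature: a function checks whether a string contains, for every tag, at least one of the possible word combinations, with the words kept in order. I figured the best approach is to build the list of possibilities up front and, when checking, see whether the string contains a combination for each required tag.

Maybe code will make it clearer.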

List<List<string[]>> tags;

List<string[]> innerList;

List<List<string>> combinationsList;

public void Generate(string pattern)
{
    // I will add whitespace removal later so the separator can be ", " instead of only ","

    foreach (string tag in pattern.Split(','))
    {
        innerList = new List<string[]>();

        foreach (var varyword in tag.Split(' '))
        {
            innerList.Add(varyword.Split('|'));
        }
    }

    // At the moment I lack the code to generate the combinations as a List<List<string>>
    // and drop them into 'combinationsList'
}

// the check function will look something like this:
public bool IsMatch(string textToTest)
{
    return combinationsList.All(tc => tc.Any(c => textToTest.Contains(c)));
}

For example, the pattern:

"old|young john|bob, have|posses dog|cat"

  • tags:
    • List_1:
      • {old,young}
      • {john,bob}
    • List_2:
      • {have,posses}
      • {dog,cat}
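As a rough illustration of the structure above, here is a minimal sketch of how the pattern could be split into tags, words and variants. The class name `PatternParser` and the method `Parse` are hypothetical names for this example, not part of the original code, and it assumes a single space between words and optional whitespace after each comma.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class PatternParser
{
    // Splits "a|b c|d, e|f g" into tags -> words -> variants.
    public static List<List<string[]>> Parse(string pattern)
    {
        var tags = new List<List<string[]>>();
        foreach (string tag in pattern.Split(','))
        {
            // RemoveEmptyEntries drops the empty word produced by the
            // space that follows a comma.
            List<string[]> words = tag
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(word => word.Split('|'))
                .ToList();

            // Skip empty tags such as the one after a trailing comma.
            if (words.Count > 0) tags.Add(words);
        }
        return tags;
    }
}
```

Calling `PatternParser.Parse("old|young john|bob, have|posses dog|cat")` should give two tags, each holding two variant arrays, which matches the List_1 / List_2 layout listed above.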

So combinationsList will contain:

  • combinationsList:
    • List_1
      • "old john"
      • "old bob"
      • "young john"
      • "young bob"
    • List_2
      • "have dog"
      • "have cat"
      • "posses dog"
      • "posses cat"

The results would be:

  • "old bob have cat" = true: it contains List_1: "old bob" and List_2: "have cat"
  • "young john have car" = false: it contains List_1: "young john" but none of the List_2 combinations
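To make the two expected results concrete, here is a small sketch that hard-codes the combinations listed above and runs the same `All`/`Any` check as the question's `IsMatch`. The class name `TagMatcher` and the helper `ExampleCombinations` are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class TagMatcher
{
    // True when, for every tag, the text contains at least one of
    // that tag's combinations as a substring.
    public static bool IsMatch(List<List<string>> combinationsList, string text)
    {
        return combinationsList.All(tag => tag.Any(c => text.Contains(c)));
    }

    // The combinations from the example pattern, written out by hand.
    public static List<List<string>> ExampleCombinations()
    {
        return new List<List<string>>
        {
            new List<string> { "old john", "old bob", "young john", "young bob" },
            new List<string> { "have dog", "have cat", "posses dog", "posses cat" },
        };
    }
}
```

With these lists, `IsMatch(ExampleCombinations(), "old bob have cat")` returns true and `IsMatch(ExampleCombinations(), "young john have car")` returns false. Note that `Contains` is a plain substring test, so "bold john" would also count as containing "old john".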

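I can't figure out how to iterate over the collection to get those combinations, or how to build each combination during the iteration. I also can't mess up the order, so "old john" must not also be generated as "john old".

Note that any "variant word" in the pattern may have more than two variants, for example "dog|cat|mouse".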

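3 Answers:

Answer 0: (score: 3)

A few years ago I wrote a series of nine blog articles about how to solve this kind of problem. Basically, what you have is a simple context-free grammar, and you want to generate all the sentences in that CFG.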

http://blogs.msdn.com/b/ericlippert/archive/tags/grammars/
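For readers who don't want to follow the link, here is a minimal sketch of the "generate every sentence" idea applied to a single tag. It is not the code from the blog series; it just treats each word position as a rule with several alternatives and expands them recursively, left to right. The names `SentenceGenerator` and `Sentences` are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class SentenceGenerator
{
    // Yields every sentence formed by picking one variant per position,
    // keeping the positions in their original order.
    public static IEnumerable<string> Sentences(IList<string[]> words, int position = 0)
    {
        if (position == words.Count)
        {
            // Nothing left to expand: the empty sentence ends the recursion.
            yield return "";
            yield break;
        }

        foreach (string variant in words[position])
        {
            foreach (string rest in Sentences(words, position + 1))
            {
                yield return rest.Length == 0 ? variant : variant + " " + rest;
            }
        }
    }
}
```

For the words {old,young} and {john,bob} this yields "old john", "old bob", "young john", "young bob", in that order. Because positions are only ever expanded left to right, "john old" can never appear.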

Answer 1: (score: 2)

This code may help

string pattern = "old|young john|bob have|posses dog|cat";
var lists = pattern.Split(' ').Select(p => p.Split('|'));

foreach (var line in CartesianProduct(lists))
{
    Console.WriteLine(String.Join(" ",line));
}


//http://blogs.msdn.com/b/ericlippert/archive/2010/06/28/computing-a-cartesian-product-with-linq.aspx
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(IEnumerable<IEnumerable<T>> sequences)
{
    // base case:
    IEnumerable<IEnumerable<T>> result = new[] { Enumerable.Empty<T>() };
    foreach (var sequence in sequences)
    {
        var s = sequence; // don't close over the loop variable
        // recursive case: use SelectMany to build the new product out of the old one
        result =
            from seq in result
            from item in s
            select seq.Concat(new[] { item });
    }
    return result;
}
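The snippet above treats the whole pattern as one tag. To get the `List<List<string>>` shape the question asks for, the same `CartesianProduct` could be applied once per comma-separated tag. This is only a sketch of that wiring: the class name `TagCombinations` and the method `Build` are hypothetical, and empty tags are skipped as an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class TagCombinations
{
    // Returns one list of space-joined combinations per comma-separated tag.
    public static List<List<string>> Build(string pattern)
    {
        var combinationsList = new List<List<string>>();
        foreach (string tag in pattern.Split(','))
        {
            List<string[]> words = tag
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(word => word.Split('|'))
                .ToList();
            if (words.Count == 0) continue; // assumption: ignore empty tags

            combinationsList.Add(
                CartesianProduct<string>(words)
                    .Select(line => string.Join(" ", line))
                    .ToList());
        }
        return combinationsList;
    }

    // Same technique as in the answer: fold the sequences into a growing product.
    static IEnumerable<IEnumerable<T>> CartesianProduct<T>(IEnumerable<IEnumerable<T>> sequences)
    {
        IEnumerable<IEnumerable<T>> result = new[] { Enumerable.Empty<T>() };
        foreach (var sequence in sequences)
        {
            var s = sequence;
            result =
                from seq in result
                from item in s
                select seq.Concat(new[] { item });
        }
        return result;
    }
}
```

`TagCombinations.Build("old|young john|bob, have|posses dog|cat")` should return two lists of four strings each, in the order shown in the question.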

Answer 2: (score: 0)

I found the answer in another thread.

https://stackoverflow.com/a/11110641/1156272

The code Adam posted works flawlessly and does exactly what I need

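    // Method header assumed from the question's Generate(string pattern)
    public void Generate(string pattern)
    {
        foreach (var tag in pattern.Split(','))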
        {
            string tg = tag;
            while (tg.StartsWith(" ")) tg = tg.Remove(0,1);
            innerList = new List<List<string>>();

            foreach (var varyword in tg.Split(' '))
            {
                innerList.Add(varyword.Split('|').ToList<string>());
            }

            //Adam's code

            List<String> combinations = new List<String>();
            int n = innerList.Count;
            int[] counter = new int[n];
            int[] max = new int[n];
            int combinationsCount = 1;
            for (int i = 0; i < n; i++)
            {
                max[i] = innerList[i].Count;
                combinationsCount *= max[i];
            }
            int nMinus1 = n - 1;
            for (int j = combinationsCount; j > 0; j--)
            {
                StringBuilder builder = new StringBuilder();
                for (int i = 0; i < n; i++)
                {
                    builder.Append(innerList[i][counter[i]]);
                    if (i < n - 1) builder.Append(" "); //my addition to insert whitespace between words
                }
                combinations.Add(builder.ToString());

                counter[nMinus1]++;
                for (int i = nMinus1; i >= 0; i--)
                {
                    // overflow check
                    if (counter[i] == max[i])
                    {
                        if (i > 0)
                        {
                            // carry to the left
                            counter[i] = 0;
                            counter[i - 1]++;
                        }
                    }
                }
            }

            //end

            if(combinations.Count > 0)
                combinationsList.Add(combinations);
        }
    }

    public bool IsMatch(string textToCheck)
    {
        if (combinationsList.Count == 0) return true;

        string t = _caseSensitive ? textToCheck : textToCheck.ToLower();

        return combinationsList.All(tg => tg.Any(c => t.Contains(c)));
    }

It looks like magic, but it really works. Thanks, everyone.
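For anyone wondering why it works: the `counter` array behaves like an odometer. Each word position is a digit whose base is the number of variants at that position; the rightmost digit is incremented after every combination and carries to the left when it overflows. Below is a compact sketch of the same idea, not Adam's original code. The names `OdometerCombinations` and `Build` are made up for this example, and it assumes .NET 4.5 or later for `IReadOnlyList`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class OdometerCombinations
{
    // words[i] holds the variants allowed at word position i.
    public static List<string> Build(IReadOnlyList<IReadOnlyList<string>> words)
    {
        var combinations = new List<string>();
        int n = words.Count;
        if (n == 0 || words.Any(w => w.Count == 0)) return combinations;

        int[] counter = new int[n]; // one "digit" per position, all starting at 0
        while (true)
        {
            // Read off the current digits as one space-separated combination.
            combinations.Add(string.Join(" ", counter.Select((digit, i) => words[i][digit])));

            // Increment the rightmost digit and carry left on overflow.
            int pos = n - 1;
            while (pos >= 0 && ++counter[pos] == words[pos].Count)
            {
                counter[pos] = 0;
                pos--;
            }
            if (pos < 0) break; // the leftmost digit overflowed: all combinations emitted
        }
        return combinations;
    }
}
```

For the positions {have,posses} and {dog,cat,mouse} this produces six combinations, starting with "have dog" and ending with "posses mouse". Since the digits are always read left to right, the word order of the pattern is preserved.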