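I'm working on a tag-like string matching feature: a function checks whether a string contains any of the possible word combinations, with the word order preserved, at least one per tag. I figured it would be best to build the list of possibilities up front and, when checking, see whether the string contains one of the required combinations for each tag.
Maybe the code will make it clearer.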
List<List<string[]>> tags;
List<string[]> innerList;
List<List<string>> combinationsList;

public void Generate(string pattern)
{
    // I will add whitespace removal later so the separator can be ", " instead of only ","
    foreach (string tag in pattern.Split(','))
    {
        innerList = new List<string[]>();
        foreach (var varyword in tag.Split(' '))
        {
            innerList.Add(varyword.Split('|'));
        }
    }
    // At the moment I lack the code to generate the combinations as a List<List<string>>
    // and put them into 'combinationsList'
}

// The check function will look something like this:
public bool IsMatch(string textToTest)
{
    return combinationsList.All(tc => tc.Any(c => textToTest.Contains(c)));
}
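For example, the pattern:
"old|young john|bob, have|posses dog|cat"
So combinationsList will contain:
And the result will be:
I can't figure out how to iterate over the collections to get these combinations, or how to build the combination on each iteration. Also, I must not mess up the order, so "old john" must not also be generated as "john old".
Note that any "variant word" in the pattern may have more than two variants, for example "dog|cat|mouse".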
Answer 0 (score: 3)
Answer 1 (score: 2)
This code may help:
using System;
using System.Collections.Generic;
using System.Linq;

string pattern = "old|young john|bob have|posses dog|cat";
// One array of variants per word position
var lists = pattern.Split(' ').Select(p => p.Split('|'));
foreach (var line in CartesianProduct(lists))
{
    Console.WriteLine(String.Join(" ", line));
}

// http://blogs.msdn.com/b/ericlippert/archive/2010/06/28/computing-a-cartesian-product-with-linq.aspx
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(IEnumerable<IEnumerable<T>> sequences)
{
    // base case: the product of zero sequences is a single empty sequence
    IEnumerable<IEnumerable<T>> result = new[] { Enumerable.Empty<T>() };
    foreach (var sequence in sequences)
    {
        var s = sequence; // don't close over the loop variable
        // recursive case: use SelectMany to build the new product out of the old one
        result =
            from seq in result
            from item in s
            select seq.Concat(new[] { item });
    }
    return result;
}
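The snippet above expands a single tag. As a rough sketch of how it might be wired into the comma-separated pattern and the `All`/`Any` check from the question, the following applies the same product per tag. The class name `TagMatcher` and the methods `Expand` and `IsMatch` are hypothetical names chosen for illustration, and the product is rewritten with `Aggregate` and `SelectMany`, which is assumed to be equivalent to the query-syntax version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagMatcher
{
    // Same idea as CartesianProduct above, expressed with Aggregate + SelectMany.
    static IEnumerable<IEnumerable<T>> CartesianProduct<T>(IEnumerable<IEnumerable<T>> sequences)
    {
        IEnumerable<IEnumerable<T>> seed = new[] { Enumerable.Empty<T>() };
        return sequences.Aggregate(seed, (acc, seq) =>
            acc.SelectMany(prefix => seq.Select(item => prefix.Concat(new[] { item }))));
    }

    // Hypothetical helper: one list of space-joined phrases per comma-separated tag.
    public static List<List<string>> Expand(string pattern)
    {
        return pattern.Split(',')
            // RemoveEmptyEntries also swallows the space after ", "
            .Select(tag => tag.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                              .Select(slot => slot.Split('|')))
            .Select(slots => CartesianProduct(slots)
                              .Select(words => string.Join(" ", words))
                              .ToList())
            .ToList();
    }

    // Every tag must contribute at least one phrase that occurs in the text.
    public static bool IsMatch(List<List<string>> combinations, string text)
    {
        return combinations.All(tag => tag.Any(phrase => text.Contains(phrase)));
    }
}
```

With the pattern `"old|young john|bob, have|posses dog|cat"`, `Expand` yields two lists of four phrases each, with the word order of the pattern preserved, so "john old" is never generated.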
Answer 2 (score: 0)
I found the answer in another thread.
https://stackoverflow.com/a/11110641/1156272
The code Adam posted works flawlessly and does exactly what I need:
public void Generate(string pattern)
{
    foreach (var tag in pattern.Split(','))
    {
        // Strip leading spaces so ", " works as a separator as well as ","
        string tg = tag.TrimStart(' ');
        innerList = new List<List<string>>();
        foreach (var varyword in tg.Split(' '))
        {
            innerList.Add(varyword.Split('|').ToList());
        }

        // Adam's code
        List<string> combinations = new List<string>();
        int n = innerList.Count;
        int[] counter = new int[n]; // current variant index for each word position
        int[] max = new int[n];     // number of variants for each word position
        int combinationsCount = 1;
        for (int i = 0; i < n; i++)
        {
            max[i] = innerList[i].Count;
            combinationsCount *= max[i];
        }
        int nMinus1 = n - 1;
        for (int j = combinationsCount; j > 0; j--)
        {
            StringBuilder builder = new StringBuilder();
            for (int i = 0; i < n; i++)
            {
                builder.Append(innerList[i][counter[i]]);
                if (i < n - 1) builder.Append(" "); // my addition to insert whitespace between words
            }
            combinations.Add(builder.ToString());

            // advance the rightmost position, then propagate any overflow
            counter[nMinus1]++;
            for (int i = nMinus1; i >= 0; i--)
            {
                // overflow check
                if (counter[i] == max[i])
                {
                    if (i > 0)
                    {
                        // carry to the left
                        counter[i] = 0;
                        counter[i - 1]++;
                    }
                }
            }
        }
        // end

        if (combinations.Count > 0)
            combinationsList.Add(combinations);
    }
}

public bool IsMatch(string textToCheck)
{
    if (combinationsList.Count == 0) return true;
    // Only the text is lower-cased here, so for a case-insensitive match the
    // combinations are assumed to have been stored in lower case already.
    string t = _caseSensitive ? textToCheck : textToCheck.ToLower();
    return combinationsList.All(tg => tg.Any(c => t.Contains(c)));
}
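The counter loop in Adam's code behaves like an odometer: the rightmost position advances on every step, and an overflow resets that position and carries one to the left. A minimal sketch of just that part, pulled out into an iterator, might look like the following. `Odometer` and `Indices` are hypothetical names introduced here, not part of the linked answer.

```csharp
using System;
using System.Collections.Generic;

public static class Odometer
{
    // Hypothetical helper: yields every index tuple for positions with the given
    // sizes, with the rightmost position changing fastest.
    public static IEnumerable<int[]> Indices(int[] sizes)
    {
        // If any position has no variants there is nothing to enumerate.
        foreach (int size in sizes)
            if (size == 0) yield break;

        var counter = new int[sizes.Length];
        while (true)
        {
            // Hand out a copy so callers cannot disturb the running counter.
            yield return (int[])counter.Clone();

            // Increment the rightmost digit and carry leftwards on overflow.
            int i = sizes.Length - 1;
            while (i >= 0 && ++counter[i] == sizes[i])
            {
                counter[i] = 0;
                i--;
            }
            if (i < 0) yield break; // carried past the leftmost digit: finished
        }
    }
}
```

For sizes `{ 2, 3 }` this walks through 0,0 then 0,1 then 0,2 then 1,0 and so on, six tuples in total. Each tuple `idx` would select `innerList[i][idx[i]]` for position `i`, which is why the word order from the pattern is never disturbed.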
It looks like magic, but it really works. Thanks, everyone.