Custom pattern matching in C#

Time: 2018-07-25 12:35:16

Tags: c# formatting matching

I'm currently building a text-based AI. I have a database in which every answer has a pattern that looks like [Who is the winner of the 2018 World Cup]

[] = Optional words

<> = Required words
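
To make the notation concrete, here is a minimal sketch of how such a pattern could be split into optional and required segments. The class name `PatternSyntaxSketch`, the method `Parse`, and the example pattern in the usage note are all hypothetical (none of them come from the question), and the sketch assumes that text outside of brackets can be ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PatternSyntaxSketch
{
    // Hypothetical helper: returns the bracketed segments of a pattern in order.
    // "[...]" yields an optional segment, "<...>" a required one.
    // Assumption: text outside of brackets is ignored by this sketch.
    public static List<(string Words, bool IsOptional)> Parse(string pattern)
    {
        var segments = new List<(string Words, bool IsOptional)>();
        // Each match is either an optional [group] or a required <group>.
        foreach (Match m in Regex.Matches(pattern, @"\[(?<opt>[^\]]*)\]|<(?<req>[^>]*)>"))
        {
            bool isOptional = m.Groups["opt"].Success;
            string words = (isOptional ? m.Groups["opt"].Value : m.Groups["req"].Value).Trim();
            if (words.Length > 0)
                segments.Add((words, isOptional));
        }
        return segments;
    }
}
```

For a made-up pattern such as `[Who is the] <Winner> [of the] <World Cup 2018>`, this would give four segments: two optional ones and two required ones, in their original order.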

When I enter the sentence Who is the Winner of the World Cup 2018, my method should return the identifier of the answer.

My database has two columns, called "AnswerIndentifier" and "Pattern"
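
The lookup described here could be sketched roughly as follows. This is only an illustration under assumptions: the rows are taken to be already loaded into memory as (AnswerIndentifier, Pattern) pairs, the identifier is assumed to be a string, and `AnswerLookupSketch`, `FindAnswerIdentifier`, and the `matches` delegate are hypothetical names standing in for whatever matching method ends up being used.

```csharp
using System;
using System.Collections.Generic;

public static class AnswerLookupSketch
{
    // rows:    (AnswerIndentifier, Pattern) pairs, assumed to be loaded from the database.
    // matches: any predicate that takes (text, pattern) and reports a match.
    // Returns the identifier of the first matching row, or null if nothing matches.
    public static string FindAnswerIdentifier(
        string text,
        IEnumerable<KeyValuePair<string, string>> rows,
        Func<string, string, bool> matches)
    {
        foreach (KeyValuePair<string, string> row in rows)
        {
            if (matches(text, row.Value))
                return row.Key;
        }
        return null;
    }
}
```

Passing the matcher in as a delegate keeps the lookup independent of how a single pattern is compared against the input.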

1 answer:

Answer 0: (score: 0)

I solved it myself and wrote this algorithm:

    // Requires: using System.Collections.Generic; using System.Linq;
    private static bool MatchesPattern(string text, string pattern)
    {
        // Each token is stored as "NEC" (required) or "OPT" (optional),
        // then char.MaxValue as a separator, then the words themselves.
        List<string> patternTokens = new List<string>();
        string tok = "";
        // The pattern is lower-cased, so lower-case the input as well.
        text = text.ToLower();
        // The trailing '[' is a sentinel that flushes the last required token.
        pattern = pattern.ToLower() + "[";
        bool insideOptional = false;
        for (int i = 0; i < pattern.Length; i++)
        {
            char c = pattern[i];
            if (c == '[')
            {
                if (tok != "")
                {
                    patternTokens.Add("NEC" + char.MaxValue + tok);
                    tok = "";
                }
                insideOptional = true;
                continue;
            }
            if (c == ']' && insideOptional)
            {
                // Skip the character after ']' only when it is a space
                // (an unconditional i++ would swallow the '[' in "[a][b]").
                if (i + 1 < pattern.Length && pattern[i + 1] == ' ')
                    i++;
                insideOptional = false;
                patternTokens.Add("OPT" + char.MaxValue + tok);
                tok = "";
                continue;
            }
            // Drop the space directly before '['. The bounds check and the lookup
            // must use pattern, not text, because i indexes pattern.
            if (c == ' ' && i + 1 < pattern.Length && pattern[i + 1] == '[')
                continue;
            tok += c;
        }

        int optionalCount = patternTokens.Count(t => t.StartsWith("OPT"));
        // Try every subset of the optional tokens (2^n candidate sentences).
        int max = 1 << optionalCount;
        for (int mask = 0; mask < max; mask++)
        {
            List<string> words = new List<string>();
            int optionalIndex = 0;
            foreach (string patternToken in patternTokens)
            {
                // Bit optionalIndex of mask decides whether this optional token is kept.
                if (patternToken.StartsWith("OPT") && (mask & (1 << optionalIndex++)) == 0)
                    continue;
                words.Add(patternToken.Split(char.MaxValue)[1]);
            }
            string patternAsSentence = string.Join(" ", words)
                .Replace("\r", "").Replace("  ", " ").Trim();
            if (text.Length < 6)
            {
                // Very short inputs must match exactly.
                if (text == patternAsSentence)
                    return true;
            }
            else
            {
                // StringSimilarity.GetStringSimilarity is an external helper that is
                // not defined here (see the similarity reference after the code).
                int similarity = StringSimilarity.GetStringSimilarity(patternAsSentence, text);
                if (similarity <= 6)
                    return true;
            }
        }

        return false;
    }

The only changes are that required text must not be marked with <>, and the similarity check (see C# - Compare String Similarity).
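
The referenced similarity helper is not reproduced in this answer. Since the code treats a result of 6 or less as a match, it presumably returns something like an edit distance, where lower means more similar. Under that assumption, a plain Levenshtein distance could serve as a stand-in; the class and method names below are hypothetical and this is not necessarily what the linked `StringSimilarity.GetStringSimilarity` does.

```csharp
using System;

public static class EditDistanceSketch
{
    // Levenshtein distance: the minimum number of single-character insertions,
    // deletions, or substitutions needed to turn a into b.
    public static int Levenshtein(string a, string b)
    {
        // Only two rows of the dynamic-programming table are kept at a time.
        int[] previous = new int[b.Length + 1];
        int[] current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++)
            previous[j] = j; // turning "" into b[0..j) takes j insertions

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + cost);
            }
            // The row just computed becomes the previous row for the next character.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.Length];
    }
}
```

With a fixed threshold of 6, short sentences match far more loosely than long ones, so scaling the threshold by the input length may be worth considering.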