Defining characters to ignore when matching strings in C#

Date: 2012-03-24 20:21:11

Tags: c# .net algorithm

I want to match two strings against each other while taking three conditions into account:

1- Case sensitivity (the comparison should be case-insensitive): wHo <=> who

2- Underscores: father_of <=> father of

3- Missing spaces: barackobama <=> barack obama

So the following two strings should match each other:

Who is fatherof barack_obama <=> who is father of barack obama

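I don't know where to start. I tried generating variants of the two strings that account for the underscore and missing-space cases, so I get something like this: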

Who, is, fatherof, barack_obama

who is, is fatherof, fatherof barack_obama,
whois, isfatherof, fatherofbarack_obama,
who_is, is_fatherof, fatherof_barack_obama,

who is fatherof, is fatherof barack_obama
whoisfatherof, isfatherofbarack_obama
who_is_fatherof, is_fatherof_barack_obama

who is fatherof barack_obama
whoisfatherofbarack_obama
who_is_fatherof_barack_obama

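This works for matching "barack obama" against "barack_obama", but not the other way around. Even if I split the string on underscores, I cannot do the same for missing spaces, because there is nothing to split on.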
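For reference, the enumeration described above could be sketched roughly as follows. This is only an illustration of the attempt, not a recommended solution; the names `VariantDemo` and `Variants` are made up for the example. It joins every run of consecutive words with a space, with nothing, and with an underscore.

```csharp
using System;
using System.Collections.Generic;

public static class VariantDemo
{
    // Hypothetical helper: every run of consecutive words,
    // joined with " ", "" and "_" (single words are added once).
    public static List<string> Variants(string[] words)
    {
        var result = new List<string>();
        string[] separators = { " ", "", "_" };
        for (int size = 1; size <= words.Length; size++)
        {
            for (int start = 0; start + size <= words.Length; start++)
            {
                if (size == 1)
                {
                    result.Add(words[start]);
                    continue;
                }
                foreach (string sep in separators)
                {
                    result.Add(string.Join(sep, words, start, size));
                }
            }
        }
        return result;
    }

    public static void Main()
    {
        string[] words = { "who", "is", "fatherof", "barack_obama" };
        List<string> variants = Variants(words);
        Console.WriteLine(variants.Count);                                 // 22
        Console.WriteLine(variants.Contains("whoisfatherofbarack_obama")); // True
        Console.WriteLine(variants.Contains("father of"));                 // False
    }
}
```

The last line shows the problem: "fatherof" is a single token, so no variant ever contains "father of".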

2 answers:

Answer 0: (score: 6)

Maybe do it like this:

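using System;

public static class StringExtensions
{
  // Strips the characters to ignore: underscores, spaces, and commas.
  // Must be static because it lives in a static class.
  private static string NormalizeText(string text)
  {
    return text.Replace("_", "")
               .Replace(" ", "")
               .Replace(",", "");
  }

  // Compares the normalized forms of both strings, ignoring case.
  public static bool CustomEquals(this string instance, string otherString)
  {
    return NormalizeText(instance).Equals(NormalizeText(otherString),
                                          StringComparison.CurrentCultureIgnoreCase);
  }
}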

So

"Who is the fatherof barack_obama"
"who IS the father of barack obama"

are compared (ignoring case) as

"Whoisthefatherofbarackobama"
"whoISthefatherofbarackobama"

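If the set of ignored characters needs to be configurable, one possible variation is a single pass over each string with a `StringBuilder`. This is a minimal sketch, not part of the answer above: `IgnoreCharsDemo`, `Strip`, and `EqualsIgnoring` are hypothetical names, and it assumes `StringComparison.OrdinalIgnoreCase` is acceptable in place of the culture-sensitive comparison, so the result does not depend on the current culture.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class IgnoreCharsDemo
{
    // Hypothetical helper: copies every character of text that is not in ignored.
    private static string Strip(string text, ISet<char> ignored)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (!ignored.Contains(c))
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    // Compares two strings after dropping the ignored characters, ignoring case.
    public static bool EqualsIgnoring(string a, string b, ISet<char> ignored)
    {
        return string.Equals(Strip(a, ignored), Strip(b, ignored),
                             StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        var ignored = new HashSet<char> { '_', ' ', ',' };
        Console.WriteLine(EqualsIgnoring("Who is the fatherof barack_obama",
                                         "who IS the father of barack obama",
                                         ignored)); // True
        Console.WriteLine(EqualsIgnoring("barack_obama", "barack obamas", ignored)); // False
    }
}
```

Passing the set as a parameter means callers can add or remove ignored characters without touching the comparison itself.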
Answer 1: (score: 2)

A shorter version that uses a regular expression to remove the characters:

using System;
using System.Text.RegularExpressions;

public static class StringExtensions
{
    public static bool CustomEquals(this string current, string other)
    {
        // Matches underscores, any whitespace character, and commas.
        string pattern = @"[_\s,]";
        return String.Equals(
            Regex.Replace(current, pattern, String.Empty),
            Regex.Replace(other, pattern, String.Empty),
            StringComparison.CurrentCultureIgnoreCase);
    }
}
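To check this against the three conditions from the question, here is a small self-contained sketch. The names `RegexMatchDemo` and `LooseEquals` are invented for the example. It also makes two changes to the answer's code: it caches the `Regex` in a static field and compares with `StringComparison.OrdinalIgnoreCase`, so the printed results do not depend on the current culture.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexMatchDemo
{
    // Built once and reused; matches underscores, any whitespace, and commas.
    private static readonly Regex Ignored = new Regex(@"[_\s,]", RegexOptions.Compiled);

    // Hypothetical helper: removes the ignored characters, then compares ignoring case.
    public static bool LooseEquals(string a, string b)
    {
        return string.Equals(
            Ignored.Replace(a, string.Empty),
            Ignored.Replace(b, string.Empty),
            StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(LooseEquals("wHo", "who"));                  // True (case)
        Console.WriteLine(LooseEquals("father_of", "father of"));      // True (underscore)
        Console.WriteLine(LooseEquals("barackobama", "barack obama")); // True (missing space)
        Console.WriteLine(LooseEquals("a bc", "ab c"));                // True (word boundaries are lost)
    }
}
```

The last line shows the trade-off of removing separators entirely: strings that differ only in where the words break are treated as equal, which is exactly what the question asks for but may be too loose for other uses.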