Possible duplicate:
how to recognize similar words with difference in spelling
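I'm trying to get true when comparing these three strings: 'voest', 'vost' and 'vöst' (German culture), because they are the same word. (Strictly speaking, only oe and ö are equivalent, but this matches what a case-insensitive (CI) database collation would do, since 'vost' is just a misspelling of 'voest'.)
No matter which arguments I pass to the method, string.Compare(..) / string.Equals(..) report the strings as not equal.
How can I make string.Compare() / Equals(..) treat them as equal?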
Answer 0 (score: 5)
You can create a custom comparer that ignores umlauts:
using System.Collections.Generic;
using System.Linq;

class IgnoreUmlautComparer : IEqualityComparer<string>
{
    // Maps each umlaut to its base vowel.
    private static readonly Dictionary<char, char> umlautReplacer = new Dictionary<char, char>()
    {
        {'ä','a'}, {'Ä','A'},
        {'ö','o'}, {'Ö','O'},
        {'ü','u'}, {'Ü','U'},
    };

    // Maps the two-letter transliterations to the same base vowel.
    // Caveat: this also rewrites genuine "ae"/"oe"/"ue" sequences, e.g. in "neue".
    private static readonly Dictionary<string, string> pseudoUmlautReplacer = new Dictionary<string, string>()
    {
        {"ae","a"}, {"Ae","A"},
        {"oe","o"}, {"Oe","O"},
        {"ue","u"}, {"Ue","U"},
    };

    // Returns s with umlauts and their transliterations reduced to the base vowel.
    private static string ignoreUmlaut(string s)
    {
        char value;
        string replaced = new string(s.Select(c => umlautReplacer.TryGetValue(c, out value) ? value : c).ToArray());
        foreach (var kv in pseudoUmlautReplacer)
            replaced = replaced.Replace(kv.Key, kv.Value);
        return replaced;
    }

    public bool Equals(string x, string y)
    {
        return ignoreUmlaut(x) == ignoreUmlaut(y);
    }

    public int GetHashCode(string obj)
    {
        // Hash the reduced form so that strings considered equal share a hash code.
        return ignoreUmlaut(obj).GetHashCode();
    }
}
Now you can use this comparer with Enumerable methods such as Distinct:
string[] allStrings = new[]{"voest","vost","vöst"};
bool allEqual = allStrings.Distinct(new IgnoreUmlautComparer()).Count() == 1;
// --> true
Answer 1 (score: 0)
You can try the IgnoreNonSpace option in the comparison. It won't solve voest vs. vost, but it does help with vost vs. vöst.
int a = new CultureInfo("de-DE").CompareInfo.Compare("vost", "vöst", CompareOptions.IgnoreNonSpace);
// a = 0; strings are equal.
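IgnoreNonSpace works because an umlaut can be seen as a base letter plus a non-spacing combining mark. As a rough sketch of the same idea done by hand (this is a swapped-in technique, not what either answer uses), the strings can be decomposed with Unicode normalization, the combining marks dropped, and the ae/oe/ue transliterations collapsed afterwards. The class name UmlautFolding and the helpers StripDiacritics and Fold are hypothetical names chosen for illustration.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

static class UmlautFolding
{
    // Hypothetical helper: decompose each character (ö becomes o + U+0308)
    // and drop every non-spacing combining mark.
    public static string StripDiacritics(string s)
    {
        string decomposed = s.Normalize(NormalizationForm.FormD);
        var kept = decomposed.Where(c =>
            CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark);
        return new string(kept.ToArray()).Normalize(NormalizationForm.FormC);
    }

    // Hypothetical helper: additionally collapse the ASCII transliterations.
    // Assumption: treating every "ae"/"oe"/"ue" as an umlaut is acceptable
    // for the data at hand (it also changes words such as "neue").
    public static string Fold(string s)
    {
        return StripDiacritics(s)
            .Replace("ae", "a").Replace("oe", "o").Replace("ue", "u")
            .Replace("Ae", "A").Replace("Oe", "O").Replace("Ue", "U");
    }

    static void Main()
    {
        Console.WriteLine(Fold("vost") == Fold("vöst"));   // True
        Console.WriteLine(Fold("voest") == Fold("vöst"));  // True
    }
}
```

Unlike the dictionary in the first answer, this also strips marks from other accented letters (é, à, ñ and so on), which may or may not be wanted.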