Distinct on a list of strings, ignoring whitespace, diacritics and case

Time: 2017-03-30 00:33:29

Tags: c# .net unicode

Given the following list of strings:

string[] Itens = new string[] { "hi", " hi   ", "HI", "hí", " Hî", "hi hi", " hí hí ", "olá", "OLÁ", " olá   ", "", "ola", "hola", " holà    ", "aaaa", "áâàa", " aâàa     ", "áaàa", "áâaa ", "aaaa ", "áâaa", "áâaa", };

The result of a Distinct operation should be:

hi, hi hi, olá, , hola, aaaa

The Distinct operation available for IEnumerable in C# accepts an IEqualityComparer as a parameter, so we can customize the comparison.

The following implementation does the job:

class LengthHash : IEqualityComparer<string>
{
    // Assumption: the question does not show how Culture is defined;
    // the invariant culture is used here so the snippet compiles.
    private static readonly CultureInfo Culture = CultureInfo.InvariantCulture;

    public bool Equals(string x, string y)
    {
        // Two nulls are equal; a null never equals a non-null string.
        if (x == null || y == null) return x == y;

        var xt = x.Trim();
        var yt = y.Trim();

        // Same trimmed length plus a culture-aware match that ignores diacritics and case.
        return xt.Length == yt.Length && Culture.CompareInfo.IndexOf(xt, yt, CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase) >= 0;
    }

    // Hash by trimmed length only, so strings that may be equal share a hash code.
    public int GetHashCode(string obj) => obj?.Trim().Length ?? 1;
}

If the GetHashCode values differ, Equals is not even executed, so a good GetHashCode implementation is very important.

I tried changing GetHashCode to two other approaches.

IgnoreHash

public int GetHashCode(string obj) => 1;

NormalizedHash

public int GetHashCode(string obj) => obj?.Trim().Normalize().ToUpperInvariant().GetHashCode() ?? 1;
// obs: This approach doesn't produce the same output.

Besides using a custom IEqualityComparer, I also tried trimming the list before running Distinct with StringComparer.InvariantCultureIgnoreCase, but it produced the same output as the Normalize and Upper version.

Benchmarking plain Distinct, StringComparer.InvariantCultureIgnoreCase and the three custom approaches produced the following results:

                              Method |       Mean |    StdErr |    StdDev |     Median |
------------------------------------ |----------- |---------- |---------- |----------- |
                          RunDefault |  2.2224 us | 0.0242 us | 0.2391 us |  2.1414 us |
                     RunHashAsLength |  6.0765 us | 0.0515 us | 0.1857 us |  6.1235 us |
                       RunIgnoreHash |  6.4078 us | 0.0640 us | 0.6140 us |  6.1982 us |
                   RunNormalizedHash | 14.5941 us | 0.0742 us | 0.3556 us | 14.4983 us |
 RunTrimAndCompareWithStringComparer | 14.4935 us | 0.0213 us | 0.0768 us | 14.5352 us |

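You can find the full test results at https://gist.github.com/Flash3001/d50a6b43bba7bc61e3d85734e40dbed9

The question is: is there a better way to arrive at the desired final list? It could be a different GetHashCode, a different Equals, or some other predefined IEqualityComparer.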
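As a side note on the point that Equals is skipped when hash codes differ, here is a minimal sketch that counts Equals calls during Distinct. CountingComparer and EqualsCallDemo are hypothetical names introduced only for this illustration; they are not part of the question or the gist. The sketch assumes the usual hash-set behaviour of Distinct, where Equals is consulted only for entries whose stored hash code matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: takes the hash function as a delegate and
// counts how many times Distinct falls back to Equals.
class CountingComparer : IEqualityComparer<string>
{
    private readonly Func<string, int> _hash;

    public int EqualsCalls { get; private set; }

    public CountingComparer(Func<string, int> hash)
    {
        _hash = hash;
    }

    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        return string.Equals(x, y, StringComparison.Ordinal);
    }

    public int GetHashCode(string obj) => _hash(obj);
}

static class EqualsCallDemo
{
    // Runs Distinct over the items with the given hash function and
    // returns the number of Equals calls that were needed.
    public static int CountEqualsCalls(Func<string, int> hash, params string[] items)
    {
        var comparer = new CountingComparer(hash);
        items.Distinct(comparer).ToList(); // ToList forces the lazy query to run
        return comparer.EqualsCalls;
    }
}
```

Calling EqualsCallDemo.CountEqualsCalls(s => s.Length, "a", "bb", "ccc") should need no Equals calls, because every item has its own hash code, whereas a constant hash such as s => 1 forces each new item to be compared against the items already kept. The constant hash is therefore always correct but degrades toward pairwise comparison as the number of distinct items grows.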

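1 answer:

Answer 0 (score: 0)

You can use the Compare and GetHashCode methods provided by the CompareInfo class, passing the same options to both. That way you can be sure the implementation is consistent. Correctness comes first; performance is secondary.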

class StringEqualityComparer : IEqualityComparer<string>
{
    private CultureInfo _cultureInfo;
    private CompareOptions _options;
    private bool _trim;

    public StringEqualityComparer(CultureInfo cultureInfo,
        CompareOptions options, bool trim)
    {
        _cultureInfo = cultureInfo;
        _options = options;
        _trim = trim;
    }

    public bool Equals(string x, string y)
    {
        if (_trim) { x = x?.Trim(); y = y?.Trim(); }
        return _cultureInfo.CompareInfo.Compare(x, y, _options) == 0;
    }

    public int GetHashCode(string obj)
    {
        if (_trim) obj = obj?.Trim();
        return _cultureInfo.CompareInfo.GetHashCode(obj, _options);
    }
}

Usage example:

var comparer = new StringEqualityComparer(CultureInfo.InvariantCulture,
    CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase, true);
var items = new string[] { "hi", " hi   ", "HI", "hí", " Hî", "hi hi", " hí hí ",
    "olá", "OLÁ", " olá   ", "", "ola", "hola", " holà    ", "aaaa", "áâàa",
    " aâàa     ", "áaàa", "áâaa ", "aaaa ", "áâaa", "áâaa", };
Console.WriteLine($"Distinct: {String.Join(", ", items.Distinct(comparer))}");

Output:

Distinct: hi, hi hi, olá, , hola, aaaa
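The question notes that its NormalizedHash variant does not produce the same output. A likely explanation (an interpretation, not something stated in the question or the answer) is that Normalize() and ToUpperInvariant() keep diacritics, so strings such as "hí" and "hi" end up with different hash keys and Equals is never consulted for them. The sketch below contrasts that key with CompareInfo.GetHashCode(string, CompareOptions), which the answer relies on. HashConsistencyDemo and its members are hypothetical names used only for illustration.

```csharp
using System;
using System.Globalization;

static class HashConsistencyDemo
{
    private const CompareOptions Options =
        CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase;

    private static readonly CompareInfo Invariant =
        CultureInfo.InvariantCulture.CompareInfo;

    // The key hashed by the question's NormalizedHash variant:
    // trimmed, normalized (NFC by default) and upper-cased. Accents survive.
    public static string NormalizedKey(string s) =>
        s.Trim().Normalize().ToUpperInvariant();

    // The hash used by the answer's comparer (with trimming enabled).
    public static int CultureHash(string s) =>
        Invariant.GetHashCode(s.Trim(), Options);

    // True when the comparison and the hash agree for this pair:
    // strings that compare as equal must also share a hash code.
    public static bool IsConsistent(string a, string b) =>
        Invariant.Compare(a.Trim(), b.Trim(), Options) != 0
        || CultureHash(a) == CultureHash(b);
}
```

For example, NormalizedKey("hí") and NormalizedKey("hi") differ because the accent is preserved, while IsConsistent("hí", " HI ") is expected to hold, since Compare and GetHashCode receive the same options. How diacritics are actually compared depends on the runtime's globalization settings, so the sketch checks consistency rather than asserting a particular comparison result.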