Why are some characters ignored when calling LastIndexOf?

Time: 2019-03-17 20:50:40

Tags: c# string replace

Please take a look at the following code:

string str_to_find = "➖➖➖➖➖➖➖➖➖➖\r\n";
string str = "Nancy" + str_to_find;
if (str.EndsWith(str_to_find)) {
    str = Remove_Last_String(str, str_to_find);
}

Here is the method:

public static string Remove_Last_String(string Source, string Find) {
    int i = Source.LastIndexOf(Find);
    if (i >= 0) {
        string new_str = Source.Substring(0, i);
        return new_str;
    }
    else return Source;
}

I need Nancy as the output.
But the method returns:
Nancy➖➖➖➖➖➖➖➖➖➖
What is wrong with these strange characters, and how can I fix this?

3 Answers:

Answer 0: (score: 4)

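You are dealing with unusual Unicode characters, and they are tripping up the default comparison. Always specify the string comparison style explicitly. Use this in your code: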

int i = Source.LastIndexOf(Find, StringComparison.Ordinal);

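StringComparison.Ordinal forces the string comparison to ignore the current culture settings and compare the raw UTF-16 code units instead. Evidently, the culture settings make the algorithm behave differently from what you (and we) want or expect.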
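As a minimal sketch of the suggestion above, the question's method could look like this with an explicit ordinal comparison. The class name OrdinalDemo and the method name RemoveLastString are hypothetical, chosen for illustration, and the example assumes the character in the question is U+2796 (HEAVY MINUS SIGN).

```csharp
using System;

public static class OrdinalDemo
{
    // Same logic as the question's Remove_Last_String, but the search is ordinal,
    // so every UTF-16 code unit in 'find' takes part in the match.
    public static string RemoveLastString(string source, string find)
    {
        int i = source.LastIndexOf(find, StringComparison.Ordinal);
        return i >= 0 ? source.Substring(0, i) : source;
    }

    public static void Main()
    {
        // Assumption: the question's character is U+2796, repeated ten times.
        string strToFind = new string('\u2796', 10) + "\r\n";
        string str = "Nancy" + strToFind;

        Console.WriteLine(RemoveLastString(str, strToFind)); // prints "Nancy"
    }
}
```

With the ordinal overload the match starts at index 5, right after "Nancy", so Substring(0, 5) yields the expected result regardless of the current culture.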

Answer 1: (score: 3)

The docs state:

Character sets include ignorable characters, which are characters that are not considered when performing a linguistic or culture-sensitive comparison. In a culture-sensitive search, if value contains an ignorable character, the result is equivalent to searching with that character removed.

➖ is an ignorable character, which explains why searching for "\r\n" or "y➖➖➖➖➖➖➖➖➖➖\r\n" behaves 'as expected', while "➖➖➖➖➖➖➖➖➖➖\r\n" does not.

Using StringComparison.Ordinal, as shown by @AlKepp, will solve the issue since then the comparison is not culture sensitive.
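To see the difference described above on your own machine, a small comparison like the following may help. It is only a sketch: the class name IgnorableDemo and its helper are hypothetical, the character is assumed to be U+2796, and the culture-sensitive results are deliberately left without expected values because they can depend on the runtime and its culture data (for example .NET Framework versus newer .NET versions).

```csharp
using System;

public static class IgnorableDemo
{
    // Ordinal search: compares UTF-16 code units, nothing is treated as ignorable.
    public static int OrdinalLastIndex(string source, string find)
    {
        return source.LastIndexOf(find, StringComparison.Ordinal);
    }

    public static void Main()
    {
        // Assumption: the question's character is U+2796, repeated ten times.
        string find = new string('\u2796', 10) + "\r\n";
        string source = "Nancy" + find;

        // Culture-sensitive searches: the results may vary by runtime and culture,
        // so no expected output is given here.
        Console.WriteLine(source.LastIndexOf(find, StringComparison.CurrentCulture));
        Console.WriteLine(source.LastIndexOf("\r\n", StringComparison.CurrentCulture));

        // Ordinal searches: deterministic.
        Console.WriteLine(OrdinalLastIndex(source, find));   // prints 5
        Console.WriteLine(OrdinalLastIndex(source, "\r\n")); // prints 15
    }
}
```

If the first culture-sensitive call prints the same index as the search for "\r\n" alone, the ➖ characters are being ignored in that environment, which matches the behavior reported in the question.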

See also List of ignorable characters for string comparison.

Answer 2: (score: 2)

using System;

public class Program
{
    public static void Main()
    {
        string str_to_find = "➖➖➖➖➖➖➖➖➖➖\r\n";
        string str = "Nancy" + str_to_find;
        if (str.EndsWith(str_to_find)) {
            str = Remove_Last_String(str, str_to_find);
            Console.WriteLine(str);
        }
    }

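    // Removes Find from the end of Source by length, avoiding LastIndexOf entirely.
    public static string Remove_Last_String(string Source, string Find) {
        // Ordinal check, so no characters are treated as ignorable and
        // Substring never receives a negative length.
        if (Source.EndsWith(Find, StringComparison.Ordinal)) {
            return Source.Substring(0, Source.Length - Find.Length);
        }
        return Source;
    }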
}