How to compare two strings while ignoring line breaks

Time: 2010-08-06 10:44:16

Tags: c# string comparison

I am comparing two strings to detect an update. I did a:

 string1 != string2

They turned out to be different. I put them in "Add Watch" and saw that the only difference is that one has a line break and the other doesn't:

 string1 = "This is a test. \nThis is a test";
 string2 = "This is a test. This is a test";

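I basically want to do a comparison that ignores line breaks. So if a line break is the only difference, consider the strings equal.

11 answers:

Answer 0: (score: 9)

A quick and dirty way, when performance isn't an issue: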

string1.Replace("\n", "") != string2.Replace("\n", "")

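Answer 1: (score: 3)

Assuming:

  1. The direct char-value-for-char-value comparison done by != and == is what is wanted here, except for the matter of newlines.
  2. The strings are, or may be, large enough or compared often enough that simply replacing "\n" with an empty string is too inefficient.

Then: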

    // Requires: using System.Collections.Generic; using System.Linq;
    public bool LinelessEquals(string x, string y)
    {
        //deal with quickly handlable cases quickly.
        if(ReferenceEquals(x, y))//same instance
            return true;         // - generally happens often in real code,
                                 //and is a fast check, so always worth doing first.
        //We already know they aren't both null as
        //ReferenceEquals(null, null) returns true.
        if(x == null || y == null)
            return false;
        IEnumerator<char> eX = x.Where(c => c != '\n').GetEnumerator();
        IEnumerator<char> eY = y.Where(c => c != '\n').GetEnumerator();
        while(eX.MoveNext())
        {
            if(!eY.MoveNext()) //y is shorter
                return false;
            if(eX.Current != eY.Current)
                return false;
        }
        return !eY.MoveNext(); //check if y was longer.
    }
    

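    This is defined as equality rather than inequality, so you can easily adapt it to be an implementation of IEqualityComparer<string>.Equals. Your question of a linebreak-less string1 != string2 then becomes: !LinelessEquals(string1, string2)

Answer 2: (score: 1)

I would suggest a regular expression that reduces every run of space, tab, \r and \n to a single space: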

Regex.Replace(string1, @"\s+", " ") != Regex.Replace(string2, @"\s+", " ")
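
To make the behaviour of the regex approach concrete, here is a minimal sketch. The class and method names (WhitespaceNormalizer, EqualsCollapsingWhitespace) are hypothetical and not part of the answer; the sketch assumes an ordinal comparison is wanted after normalization. Note that a run of whitespace collapses to one space rather than disappearing, so "a\nb" matches "a b" but not "ab".

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the regex approach above.
public static class WhitespaceNormalizer
{
    // \s+ matches one or more whitespace characters (space, tab, \r, \n, ...).
    private static readonly Regex Whitespace = new Regex(@"\s+");

    public static bool EqualsCollapsingWhitespace(string a, string b)
    {
        if (ReferenceEquals(a, b)) return true;     // same instance or both null
        if (a == null || b == null) return false;   // exactly one is null

        // Collapse every whitespace run to a single space, then compare.
        string na = Whitespace.Replace(a, " ");
        string nb = Whitespace.Replace(b, " ");
        return string.Equals(na, nb, StringComparison.Ordinal);
    }
}
```

With the strings from the question, the " \n" run in string1 collapses to a single space, so the two strings compare as equal under this sketch.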

Answer 3: (score: 1)

A cleaner approach would be to use:

string1.Replace(Environment.NewLine, String.Empty) != string2.Replace(Environment.NewLine, String.Empty);
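
Environment.NewLine is "\r\n" on Windows and "\n" on Unix-like systems, so replacing it only removes the current platform's line ending. If the strings may come from different platforms, one option is to strip both characters explicitly. The following is a minimal sketch under that assumption; NewlineInsensitive, StripNewlines and EqualsIgnoringNewlines are hypothetical names, not an existing API.

```csharp
using System;

// Hypothetical helper: compares strings after removing every '\r' and '\n',
// independent of the platform the code runs on.
public static class NewlineInsensitive
{
    public static string StripNewlines(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        // Remove carriage returns and line feeds individually so that
        // "\r\n", "\n" and a lone "\r" are all handled.
        return s.Replace("\r", "").Replace("\n", "");
    }

    public static bool EqualsIgnoringNewlines(string a, string b)
    {
        if (ReferenceEquals(a, b)) return true;     // same instance or both null
        if (a == null || b == null) return false;   // exactly one is null
        return string.Equals(StripNewlines(a), StripNewlines(b), StringComparison.Ordinal);
    }
}
```

The check from the question would then read !NewlineInsensitive.EqualsIgnoringNewlines(string1, string2).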

Answer 4: (score: 1)

Here is a generalized and tested version of Jon Hanna's answer.

/// <summary>
/// Compares two character enumerables one character at a time, ignoring those specified.
/// </summary>
/// <param name="x"></param>
/// <param name="y"></param>
/// <param name="ignoreThese"> If not specified, the default is to ignore carriage return and line feed: {'\r', '\n'} </param>
/// <returns></returns>
public static bool EqualsIgnoreSome(this IEnumerable<char> x, IEnumerable<char> y, params char[] ignoreThese)
{
    // First deal with quickly handlable cases quickly:
    // Same instance - generally happens often in real code, and is a fast check, so always worth doing first.
    if (ReferenceEquals(x, y))
        return true;
    // We already know they aren't both null as ReferenceEquals(null, null) returns true.
    if (x == null || y == null)
        return false;
    // Default ignore is newlines:
    if (ignoreThese == null || ignoreThese.Length == 0)
        ignoreThese = new char[] { '\r', '\n' };
    // Filters by specifying enumerator.
    IEnumerator<char> eX = x.Where(c => !ignoreThese.Contains(c)).GetEnumerator();
    IEnumerator<char> eY = y.Where(c => !ignoreThese.Contains(c)).GetEnumerator();
    // Compares.
    while (eX.MoveNext())
    {
        if (!eY.MoveNext()) //y is shorter
            return false;
        if (eX.Current != eY.Current)
            return false;
    }
    return !eY.MoveNext(); //check if y was longer.
}

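Answer 5: (score: 1)

Here is an equality comparer for strings that ignores certain characters, such as \r and \n.

This implementation doesn't allocate any heap memory during execution, which helps its performance. It also avoids virtual calls through IEnumerable and IEnumerator.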

public sealed class SelectiveStringComparer : IEqualityComparer<string>
{
    private readonly string _ignoreChars;

    public SelectiveStringComparer(string ignoreChars = "\r\n")
    {
        _ignoreChars = ignoreChars;
    }

    public bool Equals(string x, string y)
    {
        if (ReferenceEquals(x, y))
            return true;
        if (x == null || y == null)
            return false;
        var ix = 0;
        var iy = 0;
        while (true)
        {
            while (ix < x.Length && _ignoreChars.IndexOf(x[ix]) != -1)
                ix++;
            while (iy < y.Length && _ignoreChars.IndexOf(y[iy]) != -1)
                iy++;
            if (ix >= x.Length)
                return iy >= y.Length;
            if (iy >= y.Length)
                return false;
            if (x[ix] != y[iy])
                return false;
            ix++;
            iy++;
        }
    }

    public int GetHashCode(string obj)
    {
        throw new NotSupportedException();
    }
}
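
Because GetHashCode throws, the comparer above cannot be used as-is with Dictionary or HashSet. One way to fill that gap is a hash that skips the same characters Equals skips, so that strings considered equal also hash equally. The following is only a sketch: SelectiveHash and Compute are hypothetical names, and the FNV-1a style mixing is an assumption chosen for simplicity, not something taken from the answer.

```csharp
using System;

// Hypothetical helper: hashes a string while skipping ignored characters,
// so that strings differing only in those characters get the same hash.
public static class SelectiveHash
{
    public static int Compute(string s, string ignoreChars = "\r\n")
    {
        if (s == null) return 0;
        unchecked
        {
            uint hash = 2166136261u;             // FNV-1a 32-bit offset basis
            for (int i = 0; i < s.Length; i++)
            {
                char c = s[i];
                if (ignoreChars.IndexOf(c) != -1)
                    continue;                    // skip characters the comparer ignores
                hash = (hash ^ c) * 16777619u;   // FNV-1a 32-bit prime
            }
            return (int)hash;
        }
    }
}
```

Inside SelectiveStringComparer, GetHashCode could then return SelectiveHash.Compute(obj, _ignoreChars) instead of throwing.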

Answer 6: (score: 0)

string1.Replace("\n", "") != string2.Replace("\n", "")

Answer 7: (score: 0)

Can't you just strip out the line breaks before comparing the strings?

E.g. (pseudocode)...

string1.replace('\n','') != string2.replace('\n','')

Answer 8: (score: 0)

Here is a version in VB.net based on Drew Noakes' answer

Dim g_sIgnore As String = " " & vbNewLine & vbTab ' characters to ignore: space, CR, LF, tab

Public Function StringCompareIgnoringWhitespace(s1 As String, s2 As String) As Boolean
    Dim i1 As Integer = 0
    Dim i2 As Integer = 0
    Dim s1l As Integer = s1.Length
    Dim s2l As Integer = s2.Length

    Do
        While i1 < s1l AndAlso g_sIgnore.IndexOf(s1(i1)) <> -1
            i1 += 1
        End While
        While i2 < s2l AndAlso g_sIgnore.IndexOf(s2(i2)) <> -1
            i2 += 1
        End While
        If i1 = s1l And i2 = s2l Then
            Return True
        Else
            If i1 < s1l AndAlso i2 < s2l AndAlso s1(i1) = s2(i2) Then
                i1 += 1
                i2 += 1
            Else
                Return False
            End If
        End If
    Loop
    Return False
End Function

I also tested it with:

Try
    Debug.Assert(Not StringCompareIgnoringWhitespace("a", "z"))
    Debug.Assert(Not StringCompareIgnoringWhitespace("aa", "zz"))
    Debug.Assert(StringCompareIgnoringWhitespace("", ""))
    Debug.Assert(StringCompareIgnoringWhitespace(" ", ""))
    Debug.Assert(StringCompareIgnoringWhitespace("", " "))
    Debug.Assert(StringCompareIgnoringWhitespace(" a", "a "))
    Debug.Assert(StringCompareIgnoringWhitespace(" aa", "aa "))
    Debug.Assert(StringCompareIgnoringWhitespace(" aa ", " aa "))
    Debug.Assert(StringCompareIgnoringWhitespace(" aa a", " aa a"))
    Debug.Assert(Not StringCompareIgnoringWhitespace("a", ""))
    Debug.Assert(Not StringCompareIgnoringWhitespace("", "a"))
    Debug.Assert(Not StringCompareIgnoringWhitespace("ccc", ""))
    Debug.Assert(Not StringCompareIgnoringWhitespace("", "ccc"))
Catch ex As Exception
    Console.WriteLine(ex.ToString)
End Try

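Answer 9: (score: 0)

I have run into this problem many times when writing unit tests that need to compare a multi-line expected string with the actual output string.

For example, if I am writing a method that outputs a multi-line string, I care about what each line looks like, but I don't care about the particular newline character used on a Windows or Mac machine.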

In my case I just want to assert that each line is equal in my unit tests, and bail out if one of them is not.

public static void AssertAreLinesEqual(string expected, string actual)
{
    using (var expectedReader = new StringReader(expected))
    using (var actualReader = new StringReader(actual))
    {
        while (true)
        {
            var expectedLine = expectedReader.ReadLine();
            var actualLine = actualReader.ReadLine();

            Assert.AreEqual(expectedLine, actualLine);

            if(expectedLine == null || actualLine == null)
                break;
        }
    }
}

Of course, you could also make the method more general and write it to return a bool instead.

public static bool AreLinesEqual(string expected, string actual)
{
    using (var expectedReader = new StringReader(expected))
    using (var actualReader = new StringReader(actual))
    {
        while (true)
        {
            var expectedLine = expectedReader.ReadLine();
            var actualLine = actualReader.ReadLine();

            if (expectedLine != actualLine)
                return false;

            if(expectedLine == null || actualLine == null)
                break;
        }
    }

    return true;
}

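What surprises me most is that no method like this is included in any unit testing framework I have used.

Answer 10: (score: 0)

I ran into this issue with line endings in unit tests.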

//compare files ignoring line ends
    org.junit.Assert.assertEquals(
            read.readPayload("myFile.xml")
                    .replace("\n", "")
                    .replace("\r", ""),
            values.getFile()
                    .replace("\n", "")
                    .replace("\r", ""));

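I usually don't like making this kind of comparison (comparing the whole file), because a better approach would be to validate the fields. But it answers the question here, since it removes the line endings used on most systems (the replace calls are the trick).

PS: read.readPayload reads a text file from the resources folder and puts it into a String, and values is a structure that has, among its attributes, a String holding the raw contents of a file.

PS2: No performance was considered, since it was just an ugly fix for a unit test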