Question

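I have an SSRS (2008 R2) report that is exported to PDF. The report takes a string (originally HTML) and uses a custom VB function with regular expressions to strip HTML tags, whitespace, and XML characters. The problem is that one white box character is still left in the resulting string. It looks like this symbol: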

□

My VB function is as follows:

Public Shared Function GetNotes(ByVal strNotes As String) As SqlString
    ' Gets notes within HTML tags
    Dim s As String

    Try
        s = System.Text.RegularExpressions.Regex.Replace(strNotes, "<.*?\n?.*?>", " ")
        s = System.Text.RegularExpressions.Regex.Replace(s, " +", " ")
        s = System.Text.RegularExpressions.Regex.Replace(s, "<[^<>]*?>", " ")
        s = System.Text.RegularExpressions.Regex.Replace(s, "[\t\r\n] ", "")
        s = s.Replace("&amp;", "&")
        s = s.Replace("&nbsp;", "")
        s = s.Trim()
    Catch ex As Exception
        Return New SqlString("Description:  ")
    End Try
    Return New SqlString(s)
End Function

What should I add to remove this white box?

Answer 1

According to your comment, the character only appears at the end of the string.

You can easily use TrimEnd for this:

Dim s As String = "Some text with □"
' The c suffix makes this a Char literal, which also compiles under Option Strict On
s = s.TrimEnd("□"c)

Alternatively, this may also work (since the box is the \u25A1 character):

s = Regex.Replace(s, "[\u25A1]", String.Empty)
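If neither approach removes the box, one possibility (an assumption, not something confirmed in the question) is that the string does not contain U+25A1 at all, and the PDF renderer is drawing a box for some other character that the font has no glyph for, such as a control character. The sketch below shows one way to check this: list the code points of any unusual characters, then strip control and format characters together with U+25A1. The names DescribeOddChars and StripUnprintable are hypothetical helpers introduced for illustration.

```vbnet
Imports System
Imports System.Text
Imports System.Text.RegularExpressions

Module BoxCharDemo
    ' Hypothetical helper: lists the code point of every character
    ' outside the printable ASCII range (32 to 126).
    Function DescribeOddChars(ByVal s As String) As String
        Dim sb As New StringBuilder()
        For Each ch As Char In s
            Dim code As Integer = AscW(ch)
            If code < 32 OrElse code > 126 Then
                sb.AppendFormat("U+{0:X4} ", code)
            End If
        Next
        Return sb.ToString().Trim()
    End Function

    ' Hypothetical helper: removes control characters (Cc), format
    ' characters (Cf), and the U+25A1 box itself.
    ' Note: Cc includes tab, CR, and LF, so those are removed as well.
    Function StripUnprintable(ByVal s As String) As String
        Return Regex.Replace(s, "[\p{Cc}\p{Cf}\u25A1]", String.Empty)
    End Function

    Sub Main()
        ' Sample input: a control character (U+001F) followed by the box (U+25A1)
        Dim sample As String = "Some text" & ChrW(&H1F) & ChrW(&H25A1)
        Console.WriteLine(DescribeOddChars(sample))   ' prints "U+001F U+25A1"
        Console.WriteLine(StripUnprintable(sample))   ' prints "Some text"
    End Sub
End Module
```

Running DescribeOddChars on the real report string should show which character is actually being rendered as the box, so the replacement in GetNotes can target it precisely instead of stripping a broad class of characters.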


VB.Net Regex xml whitebox ssrs pdf
