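I'm trying to use a regular expression in VB.NET (3.5) to remove all non-ASCII characters from an input string. I have a function that is supposed to run any input string through the regex: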
Public Shared Function RemoveIllegalCharacters(ByVal inpTxt As String) As String
'use a regular expression to replace any characters that are non-ascii
inpTxt = Regex.Replace(inpTxt, "[^\u0000-\u007F]", String.Empty)
Return inpTxt
End Function
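This appears to work correctly inside the function. Throughout the function, inpTxt = "123foobar" stays "123foobar". However, when I call it from somewhere else:
Public Function someOtherFunction(ByVal inpTxt As String) As String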
inpTxt = RemoveIllegalCharacters(inpTxt)
Return inpTxt
End Function
the first character disappears:
inpTxt = "23foobar"
Other sources suggest that I write
inpTxt = Regex.Replace(inpTxt, @"[^\u0000-\u007F]", String.Empty)
but the project refuses to compile unless the second argument to Regex.Replace is a string.
Answer 0 (score: 2)
This function is pointless:
Public Shared Sub RemoveIllegalCharacters(ByVal inpTxt As String)
'use a regular expression to replace any characters that are non-ascii
inpTxt = Regex.Replace(inpTxt, "[^\u0000-\u007F]", String.Empty)
End Sub
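If inpTxt is passed ByVal, this Sub does nothing useful. It does not change the caller's string; the assignment is only visible inside the Sub. You can change the Sub to a Function and return the result: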
Public Shared Function RemoveIllegalCharacters(ByVal inpTxt As String) As String
'use a regular expression to replace any characters that are non-ascii
Return Regex.Replace(inpTxt, "[^\u0000-\u007F]", String.Empty)
End Function
And use it like this:
Dim cleaned = RemoveIllegalCharacters(inpTxt)
This seems to work:
Dim inpTxt = "1234FOOBARR" + Chr(&H80)
Console.WriteLine(inpTxt) 'Prints "1234FOOBARR?"
Dim cleaned = RemoveIllegalCharacters(inpTxt)
Console.WriteLine(cleaned) 'Prints "1234FOOBARR"
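To make the ByVal point concrete, here is a minimal sketch contrasting a ByVal Sub with a ByRef one. The names ByValDemo, StripByVal and StripByRef are hypothetical and only serve the illustration; returning the value from a Function, as shown above, is usually the cleaner choice.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ByValDemo
    ' Hypothetical helper: the assignment only changes the local copy of the
    ' reference, so the caller's variable is left untouched.
    Sub StripByVal(ByVal value As String)
        value = Regex.Replace(value, "[^\u0000-\u007F]", String.Empty)
    End Sub

    ' Hypothetical helper: ByRef lets the assignment reach the caller's variable.
    Sub StripByRef(ByRef value As String)
        value = Regex.Replace(value, "[^\u0000-\u007F]", String.Empty)
    End Sub

    Sub Main()
        ' ChrW(&HE9) is U+00E9, which lies outside the ASCII range.
        Dim sample As String = "123foobar" & ChrW(&HE9)

        StripByVal(sample)
        Console.WriteLine(sample.Length) ' still 10: the caller's string is unchanged

        StripByRef(sample)
        Console.WriteLine(sample.Length) ' 9: the non-ASCII character is gone
    End Sub
End Module
```

As for the `@"..."` form mentioned in the question: that is C# verbatim-string syntax. VB.NET string literals do not treat the backslash as an escape character, so the plain literal `"[^\u0000-\u007F]"` already reaches the regex engine unchanged.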