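How can I remove all special characters such as ¼, ¢, ®, and » from a string? By special or junk characters, I mean characters that cannot easily be typed on a normal keyboard. Can this be done with a regular expression?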
Answer 0 (score: 4)
You can use Regex.Replace to do this:
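Imports System.Text.RegularExpressions

Dim input As String = "Hello World ® and StackOverflow ¼"
' Whitelist: remove every character that is not a letter, a digit, a space, or one of ! . [ ] ( )
Dim result As String = Regex.Replace(input, "[^a-zA-Z0-9 !.\[\]()]", "")
' result: "Hello World  and StackOverflow " (the spaces around each removed character remain)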
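In this example, every character except a-z, A-Z, 0-9, the space, and a few listed punctuation characters (! . [ ] ( )) is removed, so the pattern acts as a whitelist.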
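Because the whitelist removes characters but not the spaces around them, the result can contain doubled spaces. The following is a minimal sketch, not part of the original answer, that wraps the same whitelist in a helper and then collapses the leftover runs of spaces. The module and function names (WhitelistSketch, RemoveNonWhitelisted) are hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module WhitelistSketch
    ' Hypothetical helper: keeps only the whitelisted characters,
    ' then collapses the runs of spaces left behind by the removals.
    Function RemoveNonWhitelisted(input As String) As String
        ' Step 1: drop everything that is not a letter, digit, space, or ! . [ ] ( )
        Dim stripped As String = Regex.Replace(input, "[^a-zA-Z0-9 !.\[\]()]", "")
        ' Step 2: replace two or more consecutive spaces with one and trim the ends
        Return Regex.Replace(stripped, " {2,}", " ").Trim()
    End Function

    Sub Main()
        Console.WriteLine(RemoveNonWhitelisted("Hello World ® and StackOverflow ¼"))
        ' prints: Hello World and StackOverflow
    End Sub
End Module
```

Whether to collapse and trim the whitespace is a design choice; skip step 2 if the original spacing has to be preserved.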
You can also filter on character codes, as in the following:
Dim input As String = "Hello World ® and StackOverflow ¼"
Dim strClean As String = ""
For Each charItem As Char In input
    ' AscW returns the Unicode code point; Asc depends on the current ANSI code page
    If AscW(charItem) <= 127 Then
        strClean &= charItem
    End If
Next
' strClean: "Hello World  and StackOverflow "
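In this example, every character with a code above 127, that is, everything outside the 7-bit ASCII range, is removed (see the ASCII table).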
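For longer inputs, appending to a String inside a loop allocates a new string on every iteration. The following is a minimal sketch of the same code-point filter built on StringBuilder instead. The names AsciiFilterSketch and KeepAscii are hypothetical and not part of the original answer.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module AsciiFilterSketch
    ' Hypothetical helper: keeps only characters in the 7-bit ASCII range (0 to 127).
    Function KeepAscii(input As String) As String
        Dim sb As New StringBuilder(input.Length)
        For Each ch As Char In input
            ' AscW gives the UTF-16 code unit, independent of the system code page
            If AscW(ch) <= 127 Then sb.Append(ch)
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        Console.WriteLine("""" & KeepAscii("Hello World ® and StackOverflow ¼") & """")
        ' prints: "Hello World  and StackOverflow "
    End Sub
End Module
```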
As @StevenDoggart already mentioned in the comments, you can also use Unicode categories and named blocks to solve this:
Dim input As String = "Hello World, ® and StackOverflow ¼ ¢ »!.? ({[]})"
' Keep letters (L), other punctuation (Po), opening and closing brackets (Ps, Pe), and separators (Z)
Dim result As String = Regex.Replace(input, "[^\p{L}\p{Po}\p{Ps}\p{Pe}\p{Z}]", "")
' result: "Hello World,  and StackOverflow   !.? ({[]})"
Or the following solution, which keeps only characters in the Basic Latin block (U+0000 to U+007F):
Dim input As String = "Hello World, ® and StackOverflow ¼ ¢ »!.? ({[]})"
Dim result As String = Regex.Replace(input, "[^\p{IsBasicLatin}]", "")
' result: "Hello World,  and StackOverflow   !.? ({[]})"
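To see why the category-based pattern drops ®, ¼, ¢, and » but keeps !, (, and ], it can help to print the Unicode category that .NET assigns to each character. The following is a small diagnostic sketch, not part of the original answer. The names CategorySketch and DescribeCategory are hypothetical.

```vbnet
Imports System
Imports System.Globalization

Module CategorySketch
    ' Hypothetical helper: returns the .NET name of a character's Unicode category.
    Function DescribeCategory(ch As Char) As String
        Return Char.GetUnicodeCategory(ch).ToString()
    End Function

    Sub Main()
        ' Print the category of each sample character, one per line
        For Each ch As Char In "®¼¢»!(]"
            Console.WriteLine(ch & " -> " & DescribeCategory(ch))
        Next
    End Sub
End Module
```

The four removed characters fall into OtherSymbol (So), OtherNumber (No), CurrencySymbol (Sc), and FinalQuotePunctuation (Pf), none of which appear in the pattern. Note that the category pattern lists no number category at all, so ASCII digits would be removed as well. Adding \p{Nd} to the character class should keep decimal digits.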