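I have this code, but I still can't seem to replace the non-English characters in my data, such as Vietnamese or Thai, with a simple "placeholder".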
Sub NonLatin()
    Dim cell As Range
    For Each cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
        s = cell.Value
        For i = 1 To Len(s)
            If Mid(s, i, 1) Like "[!A-Za-z0-9@#$%^&* * ]" Then cell.Value = "placeholder"
        Next
    Next
End Sub
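Thanks for your help.
Answer 0: (score: 1)
You can replace any character outside of, e.g., the ASCII range (the first 128 characters) with a placeholder using the code below: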
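Option Explicit
Sub Test()
    Dim oCell As Range
    ' Late-bound regex object, so no library reference is needed
    With CreateObject("VBScript.RegExp")
        .Global = True
        ' Match any character outside the ASCII range U+0000 to U+007F
        .Pattern = "[^\u0000-\u007F]"
        For Each oCell In [A1:C4]
            ' Replace each non-ASCII character with the placeholder "*"
            oCell.Value = .Replace(oCell.Value, "*")
        Next
    End With
End Sub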
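Answer 1: (score: 0)
For details on using regular expressions in VBA code, see this question.
Then use a regular expression in a function like this to process the string. Here I assume you want to replace each invalid character with a placeholder, rather than the whole string. If it is the whole string, you don't need to check the characters individually: just use the + or * quantifier in the regex pattern to match multiple characters, and test the whole string at once.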
Function LatinString(ByVal str As String) As String
    ' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
    ' str is passed ByVal so the caller's variable is not modified
    ' Set up the regular expressions object
    Dim regEx As New RegExp
    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        ' This is the pattern of ALLOWED characters.
        ' Note that special characters should be escaped using a backslash e.g. \$ not $
        .Pattern = "[A-Za-z0-9]"
    End With
    ' Loop through characters in string. Replace disallowed characters with "?"
    Dim i As Long
    For i = 1 To Len(str)
        If Not regEx.Test(Mid(str, i, 1)) Then
            str = Left(str, i - 1) & "?" & Mid(str, i + 1)
        End If
    Next i
    ' Return output
    LatinString = str
End Function
You can use it in your code like this:
Dim cell As Range
For Each cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
    cell.Value = LatinString(cell.Value)
Next
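For the whole-string case mentioned in this answer, a minimal sketch might look like the following. It anchors the pattern with ^ and $ and uses the * quantifier, so one test covers the entire cell. The names IsAllowedString and ReplaceInvalidCells are hypothetical, late binding is used instead of a library reference, and the sketch assumes column A contains no error values (CStr would fail on those). The allowed set is the same as above, so spaces and punctuation also count as invalid unless you widen it.

```vba
Option Explicit

' Hypothetical helper: True when every character of s is in the allowed set.
Function IsAllowedString(ByVal s As String) As Boolean
    ' Late binding, so no reference to the regex library is required
    With CreateObject("VBScript.RegExp")
        .Global = False
        .IgnoreCase = False
        ' ^ and $ anchor the match to the whole string;
        ' * means an empty string is also accepted
        .Pattern = "^[A-Za-z0-9]*$"
        IsAllowedString = .Test(s)
    End With
End Function

' Hypothetical usage: replace the whole cell when any character is disallowed.
Sub ReplaceInvalidCells()
    Dim cell As Range
    For Each cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
        ' Assumes the cell holds no error value such as #N/A
        If Not IsAllowedString(CStr(cell.Value)) Then cell.Value = "placeholder"
    Next cell
End Sub
```

This trades per-character replacement for a single yes/no check per cell, which is closer to what the code in the question attempts.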
For a byte-level approach that converts a Unicode string to a UTF-8 string without using regular expressions, see this article.
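As a simpler regex-free alternative (this is not the linked article's method, only a sketch), each character's code can be checked with AscW and anything above 127 replaced. The function name ReplaceNonAscii and its placeholder parameter are hypothetical. Characters outside the Basic Multilingual Plane are stored as two UTF-16 code units in VBA strings, so they would produce two placeholders here.

```vba
Option Explicit

' Hypothetical helper: replaces every non-ASCII UTF-16 code unit in s.
Function ReplaceNonAscii(ByVal s As String, _
                         Optional ByVal placeholder As String = "?") As String
    Dim i As Long
    Dim code As Long
    Dim ch As String
    Dim result As String

    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        ' AscW returns a signed 16-bit Integer; mask it to the range 0..65535
        code = AscW(ch) And &HFFFF&
        If code > 127 Then
            result = result & placeholder
        Else
            result = result & ch
        End If
    Next i

    ReplaceNonAscii = result
End Function
```

It can be called in the same loop as above, for example cell.Value = ReplaceNonAscii(CStr(cell.Value), "*").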