Question

I have this code, but I still can't seem to replace non-English characters in my data, such as Vietnamese or Thai, with a simple "placeholder".

Sub NonLatin()
Dim cell As Range
    For Each cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
        s = cell.Value
            For i = 1 To Len(s)
                If Mid(s, i, 1) Like "[!A-Za-z0-9@#$%^&* * ]" Then cell.Value = "placeholder"
            Next
    Next
End Sub

Thanks for your help.

Answer 1

You can replace any characters outside, e.g., the ASCII range (the first 128 characters) with a placeholder using the code below:

Option Explicit

Sub Test()

    Dim oCell As Range

    With CreateObject("VBScript.RegExp")
        .Global = True
        ' Match any character outside the ASCII range (U+0000 to U+007F)
        .Pattern = "[^\u0000-\u007F]"
        For Each oCell In [A1:C4]
            oCell.Value = .Replace(oCell.Value, "*")
        Next
    End With

End Sub
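
The same per-character replacement can also be done without a regular expression. The following is only a minimal sketch, assuming the cutoff is the ASCII range (code points 0 to 127); the function name ReplaceNonAscii and its parameters are hypothetical and not part of the answer above.

```vba
Option Explicit

' Hypothetical helper: returns a copy of source in which every character
' outside the ASCII range (0-127) is replaced by placeholder.
Function ReplaceNonAscii(ByVal source As String, _
                         Optional ByVal placeholder As String = "*") As String
    Dim i As Long
    Dim code As Long
    Dim result As String

    For i = 1 To Len(source)
        ' AscW returns a signed Integer, so code units above &H7FFF
        ' come back negative; treat those as non-ASCII too.
        code = AscW(Mid$(source, i, 1))
        If code < 0 Or code > 127 Then
            result = result & placeholder
        Else
            result = result & Mid$(source, i, 1)
        End If
    Next i

    ReplaceNonAscii = result
End Function
```

It could be applied in the loop above as oCell.Value = ReplaceNonAscii(oCell.Value). Characters outside the Basic Multilingual Plane are stored as two code units, so each of them would yield two placeholders.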

Answer 2

For details on using regular expressions in VBA code, see this question.

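Then use a regular expression in a function like this to process the string. Here I assume you want to replace each invalid character with a placeholder, rather than the whole string. If you want to replace the whole string, you don't need to check the characters individually; just use the + or * quantifier in the regular expression pattern to match multiple characters, and test the entire string at once.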

Function LatinString(ByVal str As String) As String
    ' After including a reference to "Microsoft VBScript Regular Expressions 5.5"
    ' Set up the regular expressions object
    Dim regEx As New RegExp
    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        ' This is the pattern of ALLOWED characters. 
        ' Note that special characters should be escaped using a backslash, e.g. \$ not $
        .Pattern = "[A-Za-z0-9]"
    End With

    ' Loop through characters in string. Replace disallowed characters with "?"
    Dim i As Long
    For i = 1 To Len(str)
        If Not regEx.Test(Mid(str, i, 1)) Then
            str = Left(str, i - 1) & "?" & Mid(str, i + 1)
        End If
    Next i
    ' Return output
    LatinString = str
End Function

You can use it in your code like this:

Dim cell As Range
For Each cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
    cell.Value = LatinString(cell.Value)
Next
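
For the whole-string case mentioned earlier, where a single disallowed character should cause the entire cell value to be replaced, a minimal sketch could look like the following. It assumes the same allowed set as LatinString plus the space character, and it uses late binding so no library reference is needed. The procedure name ReplaceNonLatinCells is hypothetical, and "placeholder" is taken from the question.

```vba
Sub ReplaceNonLatinCells()
    Dim cell As Range
    Dim regEx As Object

    ' Late binding, so no reference to the VBScript RegExp library is required
    Set regEx = CreateObject("VBScript.RegExp")
    ' Anchored pattern: the whole string must consist of allowed characters only.
    ' The allowed set (letters, digits, space) is an assumption; adjust as needed.
    regEx.Pattern = "^[A-Za-z0-9 ]*$"

    For Each cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
        ' One test per cell instead of one test per character
        If Not regEx.Test(CStr(cell.Value)) Then cell.Value = "placeholder"
    Next cell
End Sub
```

Because the pattern is anchored with ^ and $ and uses the * quantifier, an empty cell counts as valid and is left untouched.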

For a byte-level method of converting a Unicode string into a UTF-8 string without using regular expressions, check out this article.
