Question

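I have the following string in a cell:

I want to split the string into an array that contains only the text words (for example 'CRMNegocios'), without any bullets, new lines, etc.

To do that, I wrote the following code: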

Sub Button1_Click()

    Dim stringsToCheck As Variant
    Dim element As Variant
    Dim stripped As String

    'Split cell value per vbLf
    stringsToCheck = Split(Cells(42, 10).Value, vbLf)
    MsgBox ("Total length of stringsToCheck is " & CStr(UBound(stringsToCheck)))

    'Remove special characters - for testing only, it will set the cell with the last value of the array
    For Each element In stringsToCheck
        stripped = GetStrippedText(CStr(element))
        Cells(42, 15) = stripped
    Next element


End Sub

Private Function GetStrippedText(txt As String) As String
    Dim regEx As Object

    Set regEx = CreateObject("vbscript.regexp")

    '\u0000-\u007F is for other special characters
    regEx.Pattern = "[\u25A0\u00A0\u0000-\u007F]"
    GetStrippedText = regEx.Replace(txt, "")

End Function

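The bullet (\u25A0) is removed, but the \u00A0 characters are still left in the text:

I have checked that the regex is matching, so why are they not removed in VBA?

As requested in the comments, the original text in the cell: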

■         CRMNegocios
■         GestiondeProyectos
■         Emblue
■         Videoconferencia

Text in the test cell after the code runs:

Videoconferencia

Answer 1

I suggest using "^[\u25A0\u00A0\s]+" to remove all leading regular whitespace, non-breaking spaces and bullets. It matches:

^ - the start of the string
[\u25A0\u00A0\s]+ - 1 or more occurrences of:
- \u25A0 - the bullet
- \u00A0 - a non-breaking space
- \s - whitespace, i.e. [ \f\n\r\t\v]
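The pattern above can be applied to each line of the cell to build the array the question asks for. This is a minimal sketch, not the answerer's code: the names StripLeadingMarkers and CleanLines are hypothetical, and it assumes the lines in the cell are separated by vbLf, as in the question.

```vba
' Hypothetical helper: removes leading bullets, non-breaking spaces and whitespace.
Private Function StripLeadingMarkers(ByVal txt As String) As String
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")

    ' ^ anchors the match at the start of the string, so one replacement is enough
    regEx.Pattern = "^[\u25A0\u00A0\s]+"
    StripLeadingMarkers = regEx.Replace(txt, "")
End Function

' Hypothetical helper: splits the cell text on vbLf and cleans every line.
Private Function CleanLines(ByVal cellText As String) As Variant
    Dim parts() As String
    Dim i As Long

    parts = Split(cellText, vbLf)
    For i = LBound(parts) To UBound(parts)
        parts(i) = StripLeadingMarkers(parts(i))
    Next i
    CleanLines = parts
End Function
```

With the question's layout this could be called as CleanLines(Cells(42, 10).Value). Creating the RegExp object once and reusing it would be cheaper for large inputs; it is created per call here only to keep the sketch short.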

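Your regex is not global, so it stops after matching and removing the first bullet. Your regex also contains the \u0000-\u007F range, which covers all ASCII characters. If you use Replace with it as is, it removes every ASCII letter, digit and symbol in the string. That is why your text was deleted when you added .Global = True to match all occurrences.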
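The effect of the Global property and of the ASCII range can be checked with a short experiment. This is an illustrative sketch only: RemoveMatches and DemoGlobalFlag are hypothetical names, and the sample string is built by hand to mimic one line of the cell (a bullet, two non-breaking spaces, then the word).

```vba
' Hypothetical helper: removes matches of pat from txt, globally or first match only.
Private Function RemoveMatches(ByVal txt As String, ByVal pat As String, _
                               ByVal isGlobal As Boolean) As String
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Pattern = pat
    regEx.Global = isGlobal
    RemoveMatches = regEx.Replace(txt, "")
End Function

Sub DemoGlobalFlag()
    Dim sample As String
    ' bullet + 2 non-breaking spaces + "Emblue" = 9 characters
    sample = ChrW(&H25A0) & ChrW(&HA0) & ChrW(&HA0) & "Emblue"

    ' Not global: only the first match (the bullet) is removed, 8 characters remain
    Debug.Print Len(RemoveMatches(sample, "[\u25A0\u00A0]", False))

    ' Global: the bullet and both non-breaking spaces are removed
    Debug.Print RemoveMatches(sample, "[\u25A0\u00A0]", True)

    ' Global with the ASCII range: the word itself is removed too, nothing remains
    Debug.Print Len(RemoveMatches(sample, "[\u25A0\u00A0\u0000-\u007F]", True))
End Sub
```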

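Note that if you only work with ASCII text and need to remove any non-word characters from the start of the string, you can use regEx.Pattern = "^\W+" (there is no need to set .Global to True).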
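A sketch of that shorter variant follows. StripLeadingNonWord is a hypothetical name. In VBScript regular expressions \W is the complement of [A-Za-z0-9_], so the bullet and the non-breaking space both count as non-word characters, but so would a leading accented letter, which is why this variant only suits ASCII words.

```vba
' Hypothetical helper: removes any run of non-word characters at the start of txt.
Private Function StripLeadingNonWord(ByVal txt As String) As String
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")

    ' Anchored at the start, so the default Global = False is sufficient
    regEx.Pattern = "^\W+"
    StripLeadingNonWord = regEx.Replace(txt, "")
End Function
```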

Remove \u00A0 characters from a string
