Function to trim leading and trailing whitespace in VBA

Time: 2014-06-04 22:04:55

Tags: regex excel vba trim

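I have looked through several suggestions for trimming leading and trailing whitespace in VBA (Excel, by the way).

I found this solution, but it also strips å ä ö (uppercase as well), and I am too weak at regular expressions to understand why: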

Function MultilineTrim(ByVal TextData)
    Dim textRegExp
    Set textRegExp = New RegExp
    textRegExp.Pattern = "\s{0,}(\S{1}[\s,\S]*\S{1})\s{0,}"
    textRegExp.Global = False
    textRegExp.IgnoreCase = True
    textRegExp.Multiline = True

    If textRegExp.Test(TextData) Then
      MultilineTrim = textRegExp.Replace(TextData, "$1")
    Else
      MultilineTrim = ""
    End If
End Function

This comes from an SO answer whose user account no longer appears to be valid:

https://stackoverflow.com/a/1606433/3701019

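So I would be glad if someone could help with either (a) an alternative solution to the problem, or (b) a version of the regex/code that does not strip (single) å ä ö characters.

Thanks for your help!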

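Details: the problem

  • The Trim function in VBA does not handle all whitespace characters (tabs, for example), so some custom trimming is needed.
  • The best solution I have found is the one above, but it also removes single å ä ö characters.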

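My context is an XML parser in VBA that receives chunks of XML to parse. Sometimes it gets just a single character from the stream, which may be å, ä or ö, and this function then strips it completely.

Of course, I am happy to clarify or edit the question.

FYI: I have shared what I ended up doing, based on the answers; see below.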

6 answers:

Answer 0: (score: 4)

For the regex, I would use:

^[\s\xA0]+|[\s\xA0]+$

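This matches the "usual" whitespace characters as well as the NBSP that is common in HTML documents.

The VBA code would look like this, where S is the line to trim: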

Dim RE as Object, ResultString as String
Set RE = CreateObject("vbscript.regexp")
RE.MultiLine = True
RE.Global = True
RE.Pattern = "^[\s\xA0]+|[\s\xA0]+$"
ResultString = RE.Replace(S, "")

Explanation of the regex:

Trim whitespace at the start and the end of each line
-----------------------------------------------------

^[\s\xA0]+|[\s\xA0]+$

Options:  ^$ match at line breaks

Match this alternative (attempting the next alternative only if this one fails) «^[\s\xA0]+»
   Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
   Match a single character present in the list below «[\s\xA0]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      A “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) «\s»
      The character with position 0xA0 (160 decimal) in the character set «\xA0»
Or match this alternative (the entire match attempt fails if this one fails to match) «[\s\xA0]+$»
   Match a single character present in the list below «[\s\xA0]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      A “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) «\s»
      The character with position 0xA0 (160 decimal) in the character set «\xA0»
   Assert position at the end of a line (at the end of the string or before a line break character) «$»

Created with RegexBuddy
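
For convenience, the snippet in this answer can be wrapped in a function. The following is only a minimal sketch and not part of the original answer: the name TrimAllWhitespace is hypothetical, and it assumes late binding via CreateObject so that no library reference has to be set.

```vba
' Hypothetical wrapper around the pattern from this answer.
Function TrimAllWhitespace(ByVal S As String) As String
    Dim RE As Object
    ' Late binding: no reference to the regex library is required
    Set RE = CreateObject("VBScript.RegExp")
    RE.MultiLine = True   ' ^ and $ also match at line breaks
    RE.Global = True      ' replace every match, not only the first
    RE.Pattern = "^[\s\xA0]+|[\s\xA0]+$"
    TrimAllWhitespace = RE.Replace(S, "")
End Function
```

With this sketch, a call such as TrimAllWhitespace(vbTab & ChrW(228) & vbCrLf) should return the single character ä, because the pattern only removes whitespace and never requires a minimum number of other characters.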

Answer 1: (score: 1)

You can create a custom function that removes the specific characters you don't want.

Private Function CleanMyString(sInput As String) As String
   Dim sResult As String

   ' Remove leading and trailing spaces
   sResult = Trim(sInput)
   ' Remove other characters that you don't want (anywhere in the string)
   sResult = Replace(sResult, Chr(10), "")
   sResult = Replace(sResult, Chr(13), "")
   sResult = Replace(sResult, Chr(9), "")

   ' Return the cleaned string
   CleanMyString = sResult
End Function

This does not use regular expressions. Not sure whether it meets your requirements?

Answer 2: (score: 1)

Try this:

Function MultilineTrim(ByVal TextData)
    Dim textRegExp
    Set textRegExp = New RegExp
    textRegExp.Pattern = "(^[ \t]+|[ \t]+$)"
    textRegExp.Global = True
    textRegExp.IgnoreCase = True
    textRegExp.Multiline = True

    MultilineTrim = textRegExp.Replace(TextData, "")
End Function

Answer 3: (score: 1)

After consulting with stackexchange people on how to do this, I am adding the edit to my question as my own answer. Here it is:

Answer / code used

Thanks to the answers, this is what I will use:

' Assumed module-level declaration (not shown in the original post).
' New RegExp needs a reference to Microsoft VBScript Regular Expressions 5.5.
Private textRegExp As RegExp

Function MultilineTrim(ByVal TextData)
    ' Assumed lazy initialisation so the function also works on first use
    If textRegExp Is Nothing Then InitRegExp
    MultilineTrim = textRegExp.Replace(TextData, "")

'    If textRegExp.Test(TextData) Then
'        MultilineTrim = textRegExp.Replace(TextData, "$1")
'    Else
'        MultilineTrim = "" ' ??
'    End If
End Function

Private Sub InitRegExp()
    Set textRegExp = New RegExp
    'textRegExp.Pattern = "\s{0,}(\S{1}[\s,\S]*\S{1})\s{0,}" 'this removes å ä ö - bug!
    'textRegExp.Global = False

    'textRegExp.Pattern = "(^[ \t]+|[ \t]+$)" ' leaves a line break at start
    textRegExp.Pattern = "^[\s\xA0]+|[\s\xA0]+$" ' works! Ron Rosenfeld's submission

    textRegExp.Global = True

    textRegExp.IgnoreCase = True
    textRegExp.MultiLine = True
End Sub

Thanks again, everyone! (Credit to Ron Rosenfeld)
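
A note on why the first, commented-out pattern dropped single characters: \S{1}[\s,\S]*\S{1} needs at least two non-whitespace characters to match, so any one-character input fails Test and the original function returns an empty string. This is not specific to å ä ö. The sketch below only illustrates that; OldPatternMatches is a hypothetical helper name, and late binding via CreateObject is assumed.

```vba
' Illustration only: checks whether the old pattern matches at all.
Function OldPatternMatches(ByVal TextData As String) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    ' The pattern from the question: \S ... \S requires two non-whitespace characters
    re.Pattern = "\s{0,}(\S{1}[\s,\S]*\S{1})\s{0,}"
    re.Global = False
    re.MultiLine = True
    OldPatternMatches = re.Test(TextData)
End Function
```

Under these assumptions, OldPatternMatches("a") is False while OldPatternMatches("ab") is True, which is why a lone character coming from the stream was lost.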

Answer 4: (score: 0)

A refactored and improved version of Richard Vivian's answer

Function cleanMyString(ByVal sInput)
    ' ByVal so that the caller's variable is not modified
    ' Remove leading and trailing spaces
    sInput = Trim(sInput)
    ' Remove other characters that you don't want
    sInput = Replace(sInput, Chr(10), "")
    sInput = Replace(sInput, Chr(13), "")
    sInput = Replace(sInput, Chr(9), "")
    cleanMyString = sInput
End Function

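Answer 5: (score: -2)

I would call Trim after replacing all the other characters. That way, any spaces that end up next to those characters are removed as well.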
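
The ordering suggested here could look roughly like the sketch below. It is an illustration rather than code from the thread: the name CleanThenTrim is hypothetical, and like the Replace-based answers it removes tabs and line breaks everywhere in the string, not only at the ends.

```vba
' Hypothetical variant: replace first, Trim last.
Function CleanThenTrim(ByVal sInput As String) As String
    Dim sResult As String
    ' Remove line feeds, carriage returns and tabs anywhere in the string
    sResult = Replace(sInput, vbLf, "")
    sResult = Replace(sResult, vbCr, "")
    sResult = Replace(sResult, vbTab, "")
    ' Trim last, so spaces that were hidden behind those characters go too
    CleanThenTrim = Trim(sResult)
End Function
```

For an input such as "abc " & vbTab, calling Trim first would leave the space before the tab in place, whereas this order should return "abc".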