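I have looked through a number of suggestions for trimming leading and trailing whitespace in VBA (Excel, by the way).
I found this solution, but it also strips å ä ö (uppercase too), and I am too weak at regular expressions to understand why: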
Function MultilineTrim(ByVal TextData)
    Dim textRegExp
    Set textRegExp = New RegExp
    textRegExp.Pattern = "\s{0,}(\S{1}[\s,\S]*\S{1})\s{0,}"
    textRegExp.Global = False
    textRegExp.IgnoreCase = True
    textRegExp.Multiline = True
    If textRegExp.Test(TextData) Then
        MultilineTrim = textRegExp.Replace(TextData, "$1")
    Else
        MultilineTrim = ""
    End If
End Function
(This comes from an answer on SO whose user account no longer seems to be active:
https://stackoverflow.com/a/1606433/3701019 )
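So I would be glad if someone could help with either (a) an alternative solution to the problem, or (b) a version of the regex/code that does not strip (single) å ä ö characters.
Thanks for any help!
Details: the problem
My context is an XML parser in VBA that receives chunks of XML to parse. Sometimes it gets just a single character from the stream, which may be å, ä or ö, and this function then strips it completely.
I am of course happy to clarify or edit the question.
FYI: I have shared what I ended up doing, based on the answers; see below.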
Answer 0 (score: 4)
For the regular expression, I would use:
^[\s\xA0]+|[\s\xA0]+$
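This matches the "usual" whitespace characters as well as the NBSP that is commonly found in HTML documents.
The VBA code would look like this, where S is the string to trim: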
Dim RE as Object, ResultString as String
Set RE = CreateObject("vbscript.regexp")
RE.MultiLine = True
RE.Global = True
RE.Pattern = "^[\s\xA0]+|[\s\xA0]+$"
ResultString = RE.Replace(S, "")
Explanation of the regex:
Trim whitespace at the start and the end of each line
-----------------------------------------------------
^[\s\xA0]+|[\s\xA0]+$
Options: ^$ match at line breaks

Match this alternative (attempting the next alternative only if this one fails) «^[\s\xA0]+»
   Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
   Match a single character present in the list below «[\s\xA0]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      A “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) «\s»
      The character with position 0xA0 (160 decimal) in the character set «\xA0»
Or match this alternative (the entire match attempt fails if this one fails to match) «[\s\xA0]+$»
   Match a single character present in the list below «[\s\xA0]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      A “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) «\s»
      The character with position 0xA0 (160 decimal) in the character set «\xA0»
   Assert position at the end of a line (at the end of the string or before a line break character) «$»
Created with RegexBuddy
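As a rough sketch, the snippet above could be wrapped in a reusable function. The names TrimLines and DemoTrimLines are my own placeholders, not part of the original code, and late binding is assumed so that no library reference is needed.

```vba
' Hypothetical wrapper around the pattern above (the name TrimLines is an assumption).
Public Function TrimLines(ByVal S As String) As String
    Dim RE As Object
    Set RE = CreateObject("VBScript.RegExp")   ' late binding, no reference required
    RE.MultiLine = True                        ' ^ and $ also match at line breaks
    RE.Global = True                           ' replace every match, not just the first
    RE.Pattern = "^[\s\xA0]+|[\s\xA0]+$"
    TrimLines = RE.Replace(S, "")
End Function

' Demo: a lone non-ASCII letter padded with spaces and an NBSP.
Public Sub DemoTrimLines()
    Dim sample As String
    sample = "  " & ChrW(229) & ChrW(160) & " "   ' ChrW(229) is å, ChrW(160) is NBSP
    Debug.Print Len(sample)              ' 5
    Debug.Print Len(TrimLines(sample))   ' 1, only the å is left
End Sub
```

Unlike the pattern in the question, this one has no group that needs two non-whitespace characters (`\S{1}[\s,\S]*\S{1}`). As far as I can tell, that requirement is why a lone character made the original Test fail and return an empty string, and it would affect any single character, not only å ä ö.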
Answer 1 (score: 1)
You can create a custom function that strips out the specific characters you do not want.
Private Function CleanMyString(sInput As String) As String
    Dim sResult As String
    ' Remove leading and trailing spaces
    sResult = Trim(sInput)
    ' Remove other characters that you don't want
    sResult = Replace(sResult, Chr(10), "")   ' line feed
    sResult = Replace(sResult, Chr(13), "")   ' carriage return
    sResult = Replace(sResult, Chr(9), "")    ' tab
    ' Return the cleaned string (without this assignment the function returns "")
    CleanMyString = sResult
End Function
This does not use a regex. Not sure whether it fits your requirements?
Answer 2 (score: 1)
Try this:
Function MultilineTrim(ByVal TextData)
    Dim textRegExp
    Set textRegExp = New RegExp
    textRegExp.Pattern = "(^[ \t]+|[ \t]+$)"
    textRegExp.Global = True
    textRegExp.IgnoreCase = True
    textRegExp.Multiline = True
    MultilineTrim = textRegExp.Replace(TextData, "")
End Function
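One thing to be aware of, shown as a small sketch: `[ \t]` only covers spaces and tabs, so a line break at the very start or end of the text stays in place. The routine name DemoSpaceTabTrim is made up for illustration, and late binding is assumed.

```vba
' Hypothetical demo routine (not part of the answer's code).
Public Sub DemoSpaceTabTrim()
    Dim re As Object, sample As String, result As String
    Set re = CreateObject("VBScript.RegExp")   ' late binding, no reference required
    re.Pattern = "(^[ \t]+|[ \t]+$)"
    re.Global = True
    re.MultiLine = True

    sample = vbLf & "  abc  "          ' a line feed, then padded text
    result = re.Replace(sample, "")

    Debug.Print Len(result)                 ' 4: the line feed plus "abc"
    Debug.Print (Left$(result, 1) = vbLf)   ' True, the leading line break is kept
End Sub
```

If leading or trailing line breaks should go as well, a character class that includes `\s` would be needed instead of `[ \t]`.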
Answer 3 (score: 1)
After consulting with stackexchange people on how to do this, I am adding the edit to my question as an answer of my own. Here it is:
Thanks to the answers, this is what I am going to use:
' Assumed module-level declaration; needs a reference to "Microsoft VBScript Regular Expressions 5.5"
Private textRegExp As RegExp

Function MultilineTrim(ByVal TextData)
    ' Create the shared RegExp on first use
    If textRegExp Is Nothing Then InitRegExp
    MultilineTrim = textRegExp.Replace(TextData, "")
    ' If textRegExp.Test(TextData) Then
    '     MultilineTrim = textRegExp.Replace(TextData, "$1")
    ' Else
    '     MultilineTrim = "" ' ??
    ' End If
End Function

Private Sub InitRegExp()
    Set textRegExp = New RegExp
    'textRegExp.Pattern = "\s{0,}(\S{1}[\s,\S]*\S{1})\s{0,}" 'this removes å ä ö - bug!
    'textRegExp.Global = False
    'textRegExp.Pattern = "(^[ \t]+|[ \t]+$)" ' leaves a line break at start
    textRegExp.Pattern = "^[\s\xA0]+|[\s\xA0]+$" ' works! Ron Rosenfeld's submission
    textRegExp.Global = True
    textRegExp.IgnoreCase = True
    textRegExp.MultiLine = True
End Sub
Thanks again, everyone! (A nod to Ron Rosenfeld.)
Answer 4 (score: 0)
A refactored and improved version of Richard Vivian's code:
Function cleanMyString(ByVal sInput)
    ' ByVal keeps the caller's variable untouched (VBA passes ByRef by default)
    ' Remove leading and trailing spaces
    sInput = Trim(sInput)
    ' Remove other characters that you don't want
    sInput = Replace(sInput, Chr(10), "")   ' line feed
    sInput = Replace(sInput, Chr(13), "")   ' carriage return
    sInput = Replace(sInput, Chr(9), "")    ' tab
    cleanMyString = sInput
End Function
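Answer 5 (score: -2)
I would call Trim after replacing all the other characters. That way, any spaces that sit next to those other characters are removed as well.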
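A minimal sketch of that ordering, reusing the Replace calls from the other answers; the names CleanThenTrim and DemoCleanThenTrim are just illustrative choices.

```vba
' Hypothetical variant: strip control characters first, trim last.
Public Function CleanThenTrim(ByVal sInput As String) As String
    Dim sResult As String
    ' Remove line feeds, carriage returns and tabs first...
    sResult = Replace(sInput, vbLf, "")
    sResult = Replace(sResult, vbCr, "")
    sResult = Replace(sResult, vbTab, "")
    ' ...then trim, so spaces that were shielded by them are caught too.
    CleanThenTrim = Trim$(sResult)
End Function

Public Sub DemoCleanThenTrim()
    Debug.Print "[" & CleanThenTrim(vbCrLf & "  abc  " & vbCrLf) & "]"   ' [abc]
End Sub
```

With Trim called first, the same input would come back as "  abc  ", because VBA's Trim only removes spaces and the string starts and ends with a line break at that point.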