How to extract a substring in parentheses using a regular expression pattern

Asked: 2012-06-05 19:10:43

Tags: regex vba substring

This is probably a simple question, but unfortunately I can't get the result I want...

Say I have the following line:

"Wouldn't It Be Nice" (B. Wilson/Asher/Love)

I have to look for this pattern:

" (<any string>)

in order to retrieve:

B. Wilson/Asher/Love

I tried something like "" (([^))]*)), but it doesn't seem to work. I would also like to use Match.Submatches(0), which may complicate things because it relies on parentheses...

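6 Answers:

Answer 0 (score: 18)

Edit: After checking the document, the problem is that the parenthesis is preceded by a non-breaking space, not a regular space. So this regex should work: ""[ \xA0]*\(([^)]+)\)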

""       'quote (twice to escape)
[ \xA0]* 'zero or more non-breaking (\xA0) or regular spaces
\(       'left parenthesis
(        'open capturing group
[^)]+    'anything not a right parenthesis
)        'close capturing group
\)       'right parenthesis

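In a function:

Public Function GetStringInParens(search_str As String) As String
' Early binding: requires a reference to "Microsoft VBScript Regular Expressions 5.5"
Dim regEx As New VBScript_RegExp_55.RegExp
Dim matches As VBScript_RegExp_55.MatchCollection
    GetStringInParens = ""
    regEx.Pattern = """[ \xA0]*\(([^)]+)\)"
    regEx.Global = True
    If regEx.Test(search_str) Then
        Set matches = regEx.Execute(search_str)
        ' SubMatches(0) is the first (and only) capturing group
        GetStringInParens = matches(0).SubMatches(0)
    End If
End Function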
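
To try the pattern without setting a project reference, a late-bound variant can be sketched as follows. The names GetStringInParensLateBound and DemoGetStringInParens are hypothetical, and the second test string is built with ChrW(&HA0) to simulate the non-breaking space found in the document; this is only an illustration under those assumptions.

```vba
' Hypothetical late-bound variant: no reference to the RegExp library is needed.
Public Function GetStringInParensLateBound(search_str As String) As String
    Dim regEx As Object
    Dim matches As Object

    Set regEx = CreateObject("VBScript.RegExp")
    ' Quote, optional regular/non-breaking spaces, then the parenthesized part
    regEx.Pattern = """[ \xA0]*\(([^)]+)\)"

    If regEx.Test(search_str) Then
        Set matches = regEx.Execute(search_str)
        GetStringInParensLateBound = matches(0).SubMatches(0)
    End If
End Function

Sub DemoGetStringInParens()
    Dim nbsp As String
    nbsp = ChrW(&HA0)   ' non-breaking space (assumed to be what the document contains)

    ' Regular space between the closing quote and the parenthesis
    Debug.Print GetStringInParensLateBound("""Wouldn't It Be Nice"" (B. Wilson/Asher/Love)")
    ' Non-breaking space between the closing quote and the parenthesis
    Debug.Print GetStringInParensLateBound("""Wouldn't It Be Nice""" & nbsp & "(B. Wilson/Asher/Love)")
End Sub
```

Both calls should print the text between the parentheses, since the character class [ \xA0] accepts either kind of space.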

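Answer 1 (score: 3)

Not strictly an answer to your question, but sometimes, for something this simple, good ol' string functions are less confusing and more concise than regex.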

Function BetweenParentheses(s As String) As String
    BetweenParentheses = Mid(s, InStr(s, "(") + 1, _
        InStr(s, ")") - InStr(s, "(") - 1)
End Function

Usage:

Debug.Print BetweenParentheses("""Wouldn't It Be Nice"" (B. Wilson/Asher/Love)")
'B. Wilson/Asher/Love

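EDIT: @alan points out that this will falsely match the contents of parentheses in the song title. This is easily circumvented with a small modification:

Function BetweenParentheses(s As String) As String
    Dim iEndQuote As Long
    Dim iLeftParenthesis As Long
    Dim iRightParenthesis As Long

    ' Only look for parentheses after the closing quote of the title
    iEndQuote = InStrRev(s, """")
    ' InStr raises an error for a start position of 0 (no quote found)
    If iEndQuote = 0 Then iEndQuote = 1
    iLeftParenthesis = InStr(iEndQuote, s, "(")
    ' Search for ")" only after the "(" so the length passed to Mid is never negative
    If iLeftParenthesis <> 0 Then
        iRightParenthesis = InStr(iLeftParenthesis, s, ")")
    End If

    If iLeftParenthesis <> 0 And iRightParenthesis <> 0 Then
        BetweenParentheses = Mid(s, iLeftParenthesis + 1, _
            iRightParenthesis - iLeftParenthesis - 1)
    End If
End Function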

Usage:

Debug.Print BetweenParentheses("""Wouldn't It Be Nice"" (B. Wilson/Asher/Love)")
'B. Wilson/Asher/Love
Debug.Print BetweenParentheses("""Don't talk (yell)""")
' returns empty string

Of course, this is less concise than before!

Answer 2 (score: 2)

Here is a nice regex:

".*\(([^)]*)

In VBA/VBScript:

' Assumes a reference to "Microsoft VBScript Regular Expressions 5.5"
Dim myRegExp As RegExp
Dim myMatches As MatchCollection
Dim myMatch As Match
Dim ResultString As String
Set myRegExp = New RegExp
myRegExp.Pattern = """.*\(([^)]*)"
Set myMatches = myRegExp.Execute(SubjectString)
If myMatches.Count >= 1 Then
    Set myMatch = myMatches(0)
    ' The pattern has a single capturing group, exposed as SubMatches(0)
    If myMatch.SubMatches.Count >= 1 Then
        ResultString = myMatch.SubMatches(0)
    Else
        ResultString = ""
    End If
Else
    ResultString = ""
End If

This matches

Put Your Head on My Shoulder

in

"Don't Talk (Put Your Head on My Shoulder)"  

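Update 1

I let the regex loose on your doc file, and it matches as requested. I'm quite sure the regex is fine. I'm not fluent in VBA/VBScript, but my guess is that's where it goes wrong.

If you want to discuss the regex further, that's fine with me. I'm not eager to start digging into this arcane VBScript API.

Given the new input, the regex is tweaked to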

".*".*\(([^)]*)

so that it won't falsely match (Put Your Head on My Shoulder) when it appears inside the quotes.
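
A minimal VBA sketch of the adjusted pattern follows. The function name AfterQuotedTitle and the sample credit "(Some Writer)" are hypothetical, and late binding via CreateObject is an assumption made to keep the example self-contained.

```vba
' Hypothetical helper: returns the parenthesized text that follows the quoted title.
Function AfterQuotedTitle(subject As String) As String
    Dim re As Object
    Dim found As Object

    Set re = CreateObject("VBScript.RegExp")
    ' Opening quote, title, closing quote, then "(" and everything up to the next ")"
    re.Pattern = """.*"".*\(([^)]*)"

    Set found = re.Execute(subject)
    If found.Count >= 1 Then AfterQuotedTitle = found(0).SubMatches(0)
End Function

Sub DemoAfterQuotedTitle()
    ' Parentheses inside the quoted title are skipped; "(Some Writer)" is made-up sample data
    Debug.Print AfterQuotedTitle("""Don't Talk (Put Your Head on My Shoulder)"" (Some Writer)")
    ' No parenthesized part after the closing quote: prints an empty line
    Debug.Print AfterQuotedTitle("""Don't Talk (Put Your Head on My Shoulder)""")
End Sub
```

Because the pattern requires both quotes before the opening parenthesis, a line whose only parentheses sit inside the title produces no match.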


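Answer 3 (score: 1)

This function works for your example string:

Function GetArtist(songMeta As String) As String
  Dim artist As String
  ' split string by "(" and take last portion
  artist = Split(songMeta, "(")(UBound(Split(songMeta, "(")))
  ' remove closing parenthesis
  artist = Replace(artist, ")", "")
  ' return the result (without this assignment the function returns an empty string)
  GetArtist = artist
End Function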

For example:

Sub Test()

  Dim songMeta As String

  songMeta = """Wouldn't It Be Nice"" (B. Wilson/Asher/Love)"

  Debug.Print GetArtist(songMeta)

End Sub

prints "B. Wilson/Asher/Love" to the Immediate window.

It also solves the problem alan mentioned. For example:

Sub Test()

  Dim songMeta As String

  songMeta = """Wouldn't (It Be) Nice"" (B. Wilson/Asher/Love)"

  Debug.Print GetArtist(songMeta)

End Sub

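also prints "B. Wilson/Asher/Love" to the Immediate window. Unless, of course, the artist names also include parentheses.

Answer 4 (score: 0)

I think you need a better data file ;) You might want to consider pre-processing the file into a temporary file, modifying the outliers that don't fit your pattern so that they do. It is a bit time-consuming, but things are always difficult when a data file lacks consistency.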
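
One possible pre-processing step, sketched under the assumption that the main outlier is the non-breaking space identified in Answer 0, is to normalize each line before matching. The function name NormalizeLine is hypothetical, and other kinds of inconsistency would need their own rules.

```vba
' Hypothetical helper: replaces non-breaking spaces with regular spaces and trims the line.
Function NormalizeLine(rawLine As String) As String
    ' ChrW(&HA0) is the non-breaking space (U+00A0)
    NormalizeLine = Trim$(Replace(rawLine, ChrW(&HA0), " "))
End Function

Sub DemoNormalizeLine()
    Dim rawLine As String
    ' Simulated input with a non-breaking space before the parenthesis
    rawLine = """Wouldn't It Be Nice""" & ChrW(&HA0) & "(B. Wilson/Asher/Love)"
    Debug.Print NormalizeLine(rawLine)
End Sub
```

After normalization, a pattern written with a plain space between the quote and the parenthesis can be applied to every line.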

Answer 5 (score: 0)

Here is another regex, tested with VBScript: (?:\()(.*)(?:\))

Data = """Wouldn't It Be Nice"" (B. Wilson/Asher/Love)"
wscript.echo Extract(Data)
'---------------------------------------------------------------
Function Extract(Data)
Dim strPattern,oRegExp,Matches
strPattern = "(?:\()(.*)(?:\))"
Set oRegExp = New RegExp
oRegExp.IgnoreCase = True 
oRegExp.Pattern = strPattern
set Matches = oRegExp.Execute(Data) 
If Matches.Count > 0 Then Extract = Matches(0).SubMatches(0)
End Function
'---------------------------------------------------------------
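
One caveat with the pattern above: (.*) is greedy, so when the title itself contains parentheses the capture runs from the first "(" to the last ")". A sketch of a stricter alternative, which anchors the match to the end of the line, is shown below. The name LastParenGroup is hypothetical, and the sketch assumes the credit is always the last parenthesized group on the line.

```vba
' Hypothetical helper: returns the content of the last parenthesized group on the line.
Function LastParenGroup(songLine As String) As String
    Dim re As Object
    Dim found As Object

    Set re = CreateObject("VBScript.RegExp")
    ' A group without nested parentheses, followed only by optional
    ' regular or non-breaking spaces up to the end of the string
    re.Pattern = "\(([^()]*)\)[ \xA0]*$"

    Set found = re.Execute(songLine)
    If found.Count > 0 Then LastParenGroup = found(0).SubMatches(0)
End Function

Sub DemoLastParenGroup()
    ' The parentheses inside the title are ignored because of the end anchor
    Debug.Print LastParenGroup("""Wouldn't (It Be) Nice"" (B. Wilson/Asher/Love)")
End Sub
```

Like the Split-based approach in Answer 3, this still breaks down if the credit itself contains parentheses.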