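How to use an Excel regular expression to extract an ad size from a string

Asked: 2018-01-24 16:31:59

Tags: regex excel vba excel-vba

I am trying to extract an ad size from a string. The ad sizes are all standard sizes. So while I would prefer a regex that looks for a pattern, i.e. 3 digits followed by 2 or 3 digits (separated by an x), hard-coding would also work, since we know what the sizes will be. Here are some examples of the ad sizes: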

300x250

728x90

320x50

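I was able to find some VBScript that I modified and that almost works, but because the strings I am searching are inconsistent, it pulls too much in some cases. For example:

[image]

You can see that it does not match correctly in every instance.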

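The VB code I found actually matches everything except the ad size. I don't know VBScript well enough to make it look only for the ad size and pull it out, so instead it looks for all the other text and removes it.

The code is below. Is there a way to fix the regex so that it returns only the ad size?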

Function getAdSize(Myrange As Range) As String
    Dim regEx As New RegExp
    Dim strPattern As String
    Dim strInput As String
    Dim strReplace As String
    Dim strOutput As String

    strPattern = "([^300x250|728x90])"

    If strPattern <> "" Then
        strInput = Myrange.Value
        strReplace = ""

        With regEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = True
            .Pattern = strPattern
        End With

        If regEx.Test(strInput) Then
            getAdSize = regEx.Replace(strInput, strReplace)
        Else
            getAdSize = "Not matched"
        End If
    End If
End Function

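Note that the size is not always preceded by an underscore; sometimes there is a dash or a space before and after it.

3 Answers:

Answer 0 (score: 2)

I've managed to get about 95% of the required answer - the RegEx below removes the DDDxDD size and returns the rest.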

Option Explicit

Public Function regExSampler(s As String) As String

    Dim regEx           As Object
    Dim inputMatches    As Object
    Dim regExString     As String

    Set regEx = CreateObject("VBScript.RegExp")

    With regEx
        .Pattern = "(([0-9]+)x([0-9]+))"
        .IgnoreCase = True
        .Global = True

        Set inputMatches = .Execute(s)

        If regEx.test(s) Then
            regExSampler = .Replace(s, vbNullString)
        Else
            regExSampler = s
        End If

    End With

End Function

Public Sub TestMe()

    Debug.Print regExSampler("uni3uios3_300x250_ASDF.html")
    Debug.Print regExSampler("uni3uios3_34300x25_ASDF.html")
    Debug.Print regExSampler("uni3uios3_8x4_ASDF.html")

End Sub

E.g., you would get:

uni3uios3__ASDF.html
uni3uios3__ASDF.html
uni3uios3__ASDF.html

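From here, you can go on to find a way to invert the result.

Edit: To go from 95% to 100%, I have asked a question here, and it turned out that the conditional block should be changed to the following: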

If regEx.test(s) Then
    regExSampler = InputMatches(0)
Else
    regExSampler = s
End If
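
Putting the edit together, a self-contained version might look like the sketch below. The function name GetAdSizeByRegex, the tighter pattern \d{3}x\d{2,3} (taken from the question's description of three digits, an x, then two or three digits), and the "Not matched" fallback borrowed from the question's code are assumptions for illustration, not part of the original answer.

```vba
Option Explicit

' Hypothetical helper: returns the first ad size found in s, or "Not matched".
Public Function GetAdSizeByRegex(ByVal s As String) As String

    Dim regEx   As Object
    Dim matches As Object

    ' Late binding, so no reference to the VBScript Regular Expressions library is needed
    Set regEx = CreateObject("VBScript.RegExp")

    With regEx
        .Pattern = "\d{3}x\d{2,3}"   ' assumed: 3 digits, "x", then 2 or 3 digits
        .IgnoreCase = True
        .Global = False              ' only the first match is needed
        Set matches = .Execute(s)
    End With

    If matches.Count > 0 Then
        GetAdSizeByRegex = matches(0).Value   ' return the matched text itself
    Else
        GetAdSizeByRegex = "Not matched"
    End If

End Function
```

Called from a worksheet cell as =GetAdSizeByRegex(A1), this should return 300x250 for uni3uios3_300x250_ASDF.html. Because the pattern does not depend on the surrounding characters, it should work the same way whether the size is set off by underscores, dashes, or spaces.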

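Answer 1 (score: 2)

Edit: Since the size actually isn't always set off by underscores, we can't use Split. However, we can iterate over the string and extract the "#x#" part manually. I've updated the code to reflect this and verified that it works.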

Public Function ExtractAdSize(ByVal arg_Text As String) As String

    Dim i As Long
    Dim Temp As String
    Dim Ad As String

    If arg_Text Like "*#x#*" Then
        For i = 1 To Len(arg_Text) + 1
            Temp = Mid(arg_Text & " ", i, 1)
            If IsNumeric(Temp) Then
                Ad = Ad & Temp
            Else
                If Temp = "x" Then
                    Ad = Ad & Temp
                Else
                    If Ad Like "*#x#*" Then
                        ExtractAdSize = Ad
                        Exit Function
                    Else
                        Ad = vbNullString
                    End If
                End If
            End If
        Next i
    End If

End Function

An alternate version of the same function that uses Select Case boolean logic instead of nested If statements:

Public Function ExtractAdSize(ByVal arg_Text As String) As String

    Dim i As Long
    Dim Temp As String
    Dim Ad As String

    If arg_Text Like "*#x#*" Then
        For i = 1 To Len(arg_Text) + 1
            Temp = Mid(arg_Text & " ", i, 1)

            Select Case Abs(IsNumeric(Temp)) + Abs((Temp = "x")) * 2 + Abs((Ad Like "*#x#*")) * 4
                Case 0: Ad = vbNullString       'Temp is not a number, not an "x", and Ad is not valid
                Case 1, 2, 5: Ad = Ad & Temp    'Temp is a number or an "x"
                Case 4, 6: ExtractAdSize = Ad   'Temp is not a number, Ad is valid
                           Exit Function
            End Select
        Next i
    End If

End Function

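Answer 2 (score: 1)

If the size always starts with 3 characters followed by an x, and it is always between underscores, you can use this formula - adjust it as needed.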

=iferror(mid(A1,search("_???x*_",A1)+1,search("_",A1,search("_???x*_",A1)+1)-(search("_???x*_",A1)+1)),"No match")