将字符串的一部分从一组字符删除为另一字符

时间:2019-06-10 14:06:19

标签: vba excel-vba

我有一个Excel工作表,我想删除一些HTML标记。我的问题是,有些标签不仅是简单的<div>标签,而且还有<div class="ExternalClassEA74AB3F178E48EDAD3BDE4FC90B1182">之类的多余字符,请从<div进行替换,直到到达标签的结尾>。如何用“”替换这样的字符串部分。谢谢。

2 个答案:

答案 0 :(得分:2)

Sub RemoveDivs()
    Dim html$
    html = "Some other text<div class=""ExternalClassEA74AB3F178E48EDAD3BDE4FC90B1182""> and here too"
    With CreateObject("VBScript.RegExp")
        .Pattern = "<div.*?>": .Global = True
        html = .Replace(html, "")
    End With
    MsgBox html
End Sub

答案 1 :(得分:0)

这不是一个完美的解决方案,如果您将<>作为要保留的实际文本,很可能无法实现您期望的效果。

它应该为您提供一个起点,并且应该能够进行一些更改以实现“完美”。

查看评论以获取更多详细信息:

Sub htmlStrip()

Dim ws As Worksheet: Set ws = ActiveWorkbook.Sheets("Sheet Name Here") '<- set sheet name
Dim lRow As Long: lRow = ws.Cells(Rows.Count, 1).End(xlUp).Row 'get last row
Dim lCol As Long: lCol = ws.Cells(1, Columns.Count).End(xlToLeft).Column 'get last column

Dim arrData As Variant: arrData = ws.Range("A1").Resize(lRow, lCol) 'get the data into an array
Dim R As Long, C As Long, X As Long, Z As Long

Const sS As String = "<" 'html tag start
Const sE As String = ">" 'html tag end

Dim lS As Long, lE As Long, lCnt As Long, lECnt As Long
Dim strTmpS As String, strTmpE As String, strTemp As String, strReplace As String


For R = LBound(arrData) To UBound(arrData) 'iterate through all rows of data
    For C = LBound(arrData, 2) To UBound(arrData, 2) 'iterate through all columns of data
        lS = InStr(1, arrData(R, C), sS) 'get the location of the first tag
        lE = InStr(1, arrData(R, C), sE) 'get the location of the last tag

        If lS > 0 And lE > 0 Then 'if at least one of each found
            If lE < lS Then lE = InStr(lS + 1, arrData(R, C), sE) 'prevent a case when the first tag is the ending one

            lSCnt = Len(arrData(R, C)) - Len(Replace(arrData(R, C), sS, "")) 'check how many times we have the first tag
            lECnt = Len(arrData(R, C)) - Len(Replace(arrData(R, C), sE, "")) 'check how many times we have the last tag

            Z = WorksheetFunction.Min(lSCnt, lECnt) 'avoid a situation when we have some opening or closing tags, but not the matching ones

            For X = 1 To Z 'iterate through the number of times at least both tags are in
                strReplace = Mid(arrData(R, C), lS, lE - lS + 1) 'get the string to replace
                arrData(R, C) = Replace(arrData(R, C), strReplace, "") 'remove the tag found

                lS = InStr(1, arrData(R, C), sS) 'get the location of the first tag (again)
                lE = InStr(1, arrData(R, C), sE) 'get the location of the last tag (again)

                If lS = 0 Or lE = 0 Then Exit For 'either we reached the end of the loop, or did a multi replace - so exit here
            Next X
        End If
    Next C
Next R

ws.Range("A1").Resize(lRow, lCol).Offset(0, lCol) = arrData 'put the data back on the spreadsheet, at the right of the original data

End Sub