Formatting and modifying strings in Excel using VBA / Python

Date: 2018-10-30 14:20:02

Tags: python excel vba python-3.x openpyxl

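I am trying to write a VBA script that walks through a column of cells, underlines the text between the HTML <u></u> tags, and then removes those tags from the text. A cell may contain several tag pairs, other text after them, or no tags at all.

So far I have been able to get the script to underline the text between the tags, but when I try to remove the tags nothing works reliably (sometimes nothing changes, sometimes the tags themselves end up underlined, and so on). I have left out input/output examples for brevity, hoping there is an obvious problem in my code, but I can provide them on request.

I originally turned to VBA because I could not do this in Python: the object model there only goes down as far as the cell, not the contents of the cell. Any solution that does this in Python would also be much appreciated!

Thank you very much for your help! Let me know if there is anything else I can provide.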

Sub PleaseUnderline()
'Holds the content between the tags
Dim s As String
'Holds the row number of the active cell
Dim a As Integer
'Holds the location of the beginning of the open tag
Dim b As Integer
'Holds the location of the beginning of the close tag
Dim e As Integer
Dim holder As String
    'Select the last cell in column A and make it the active cell
    Range("A" & ActiveCell.SpecialCells(xlLastCell).Row).Select
    For a = ActiveCell.Row To 1 Step -1
        Range("A" & a).Select
        holder = Range("A" & a).Value
        s = ""
        b = 1
        e = 1
        Do
            b = InStr(b, ActiveCell, "<u>")
            If b = 0 Then Exit Do
            e = b + 1
            e = InStr(e, ActiveCell, "</u>")
            If e = 0 Then
                Exit Do
            Else
                s = Mid(ActiveCell, b + 3, e - b - 3)
            End If
            holder = Replace(holder, "<u>", "", 1, 1)
            holder = Replace(holder, "</u>", "", 1, 1)
            Worksheets("Sheet").Range("A" & a).Value = holder
            ActiveCell.Characters(b, Len(s)).Font.Underline = True
            b = e + 1
        Loop
    Next a
End Sub

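2 answers:

Answer 0 (score: 0)

Slightly modified, but this works for me. I think the problem is that you add 3 to the start position (b + 3): because the <u> tag has already been removed by that point, there is no need to offset by 3 characters.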

Sub PleaseUnderline()

Dim i As Long, j As Long
Dim startpoint As Long, endpoint As Long
Dim holder As String

For i = 2 To Cells(Rows.Count, 1).End(xlUp).Row

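    If InStr(Cells(i, 1).Value, "<u>") > 0 Then
        'Reset so positions from a previous row are never reused
        startpoint = 0
        endpoint = 0

        'Record where each tag starts (this version assumes at most one <u>...</u> pair per cell)
        For j = 1 To Len(Cells(i, 1).Value)
            If Mid(Cells(i, 1).Value, j, 3) = "<u>" Then
                startpoint = j
            End If

            If Mid(Cells(i, 1).Value, j, 4) = "</u>" Then
                endpoint = j
            End If
        Next j

        'Strip the tags first; the text then begins at startpoint, so no +3 offset is needed
        holder = Cells(i, 1).Value
        holder = Replace(holder, "<u>", "")
        holder = Replace(holder, "</u>", "")
        Cells(i, 1).Value = holder

        'Underline only when a closing tag was found after the opening tag
        If endpoint > startpoint Then
            Cells(i, 1).Characters(startpoint, endpoint - startpoint - 3).Font.Underline = True
        End If

    End If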

Next i

End Sub
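
To sanity-check the offset arithmetic outside Excel, here is a minimal Python sketch (not part of the original answer). It strips every <u>...</u> pair and reports where each underlined run lands in the cleaned text, using the same 1-based (start, length) convention as Characters(start, length). The names U_TAG and underline_spans are made up for illustration, and the sketch assumes the tags are not nested.

```python
import re

# One <u>...</u> pair; non-greedy so several pairs in one cell stay separate.
U_TAG = re.compile(r"<u>(.*?)</u>", re.IGNORECASE | re.DOTALL)


def underline_spans(text):
    """Return (clean_text, spans).

    spans holds 1-based (start, length) pairs that index into clean_text,
    i.e. the text after all <u> and </u> tags have been removed.
    """
    pieces = []      # fragments of the cleaned text, in order
    spans = []       # where each underlined run ends up
    clean_len = 0    # length of the cleaned text built so far
    last = 0         # index in `text` just past the previous match

    for match in U_TAG.finditer(text):
        before = text[last:match.start()]
        pieces.append(before)
        clean_len += len(before)

        inner = match.group(1)
        if inner:
            spans.append((clean_len + 1, len(inner)))
        pieces.append(inner)
        clean_len += len(inner)
        last = match.end()

    pieces.append(text[last:])
    return "".join(pieces), spans


clean, spans = underline_spans("ab<u>cd</u>ef<u>gh</u>")
print(clean)  # abcdefgh
print(spans)  # [(3, 2), (7, 2)]
```

The first run starts exactly where its <u> tag stood (position 3), which is the point about not adding 3 once the tags are gone. Each later run shifts left by 7 characters for every pair removed before it, so positions measured on the tagged string cannot be reused as-is when a cell holds more than one pair.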


Answer 1 (score: 0)

This works for me:

Sub Tester()
    DoTags ActiveSheet.Range("A1")
End Sub

Sub DoTags(c As Range)

    Dim s As Long, e As Long, l As Long, arrTags, tag

    arrTags = Array("b", "i", "u")

    For Each tag In arrTags

        Positions c.Value, tag, s, e

        Do While s > 0 And e > 0
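            'The length e - s runs a few characters into the closing tag; harmless, since that tag is deleted below
            With c.Characters(s + Len(tag) + 2, e - s).Font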
                Select Case LCase(tag)
                    Case "u": .Underline = True
                    Case "b": .Bold = True
                    Case "i": .Italic = True
                End Select
            End With
            c.Characters(e, Len(tag) + 3).Delete '<<delete end tag first...
            c.Characters(s, Len(tag) + 2).Delete
            Positions c.Value, tag, s, e
        Loop

    Next tag
End Sub

'set start and end positions of a tag in a string
Sub Positions(txt As String, tag, ByRef s As Long, ByRef e As Long)
    e = 0: s = 0
    s = InStr(1, txt, "<" & tag & ">", vbTextCompare)
    If s > 0 Then e = InStr(s, txt, "</" & tag & ">", vbTextCompare)
End Sub
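
For comparison, the same b/i/u bookkeeping can be sketched in Python with the standard library's html.parser (an illustration only, not the answerer's code). RunCollector and html_to_runs are hypothetical names. The sketch assumes reasonably well-formed tags and turns a string into (text, styles) runs that could then be applied one at a time.

```python
from html.parser import HTMLParser

STYLE_TAGS = {"b", "i", "u"}


class RunCollector(HTMLParser):
    """Collect (text, styles) runs, where styles is a sorted tuple of open tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.active = []  # stack of currently open style tags
        self.runs = []    # list of (text, styles) tuples

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <U> and <u> are treated alike.
        if tag in STYLE_TAGS:
            self.active.append(tag)

    def handle_endtag(self, tag):
        # Close the most recently opened tag with this name, if there is one.
        for idx in range(len(self.active) - 1, -1, -1):
            if self.active[idx] == tag:
                del self.active[idx]
                break

    def handle_data(self, data):
        self.runs.append((data, tuple(sorted(set(self.active)))))


def html_to_runs(text):
    """Split tagged text into runs; the tags themselves are dropped."""
    parser = RunCollector()
    parser.feed(text)
    parser.close()
    return parser.runs


for run in html_to_runs("plain <b>bold <u>both</u></b> tail"):
    print(run)
```

Because the runs carry their own style set, nested tags such as <b> around <u> need no position arithmetic at all; the character offsets only matter at the moment the runs are written back to a cell.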

EDIT: Since some of your content appears to be too long for the method above, here is another approach (a generic HTML-to-formatted-text conversion):

Sub Tester()
    Dim c As Range
    For Each c In ActiveSheet.Range("A2:C2").Cells
        HTMLtoFormattedText c
    Next c
End Sub

Private Sub HTMLtoFormattedText(c As Range)

    Dim objData As DataObject 'reference to "Microsoft Forms 2.0 Object Library"
    Set objData = New DataObject

    objData.SetText "<HTML>" & c.Text & "</HTML>"
    objData.PutInClipboard

    c.Parent.Activate
    c.Offset(1, 0).Select
    c.Parent.PasteSpecial Format:="Unicode Text"

End Sub
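
On the Python side of the question: a rough sketch, assuming a recent openpyxl with rich-text support (3.1 or later, as far as I know), which exposes CellRichText, TextBlock and InlineFont for formatting part of a cell. The helper names split_underlined and underline_column, and the file paths passed to them, are hypothetical. I have only outlined the workbook part here, so treat it as a starting point rather than a tested solution.

```python
import re

# One <u>...</u> pair; the capture group makes re.split keep the inner text.
U_SPLIT = re.compile(r"<u>(.*?)</u>", re.IGNORECASE | re.DOTALL)


def split_underlined(text):
    """Split text into (segment, is_underlined) pairs, dropping the tags."""
    parts = U_SPLIT.split(text)
    # With one capture group the result alternates: outside, inside, outside, ...
    return [(seg, idx % 2 == 1) for idx, seg in enumerate(parts) if seg]


def underline_column(src_path, dst_path, column="A"):
    """Rewrite every tagged string in `column` as rich text and save a copy."""
    # Imported here so split_underlined stays usable without openpyxl installed.
    from openpyxl import load_workbook
    from openpyxl.cell.rich_text import CellRichText, TextBlock
    from openpyxl.cell.text import InlineFont

    wb = load_workbook(src_path)
    ws = wb.active
    for cell in ws[column]:
        value = cell.value
        # Skip numbers, empty cells and strings without a complete tag pair.
        if not isinstance(value, str) or not U_SPLIT.search(value):
            continue
        blocks = [
            TextBlock(InlineFont(u="single"), seg) if underlined else seg
            for seg, underlined in split_underlined(value)
        ]
        cell.value = CellRichText(blocks)
    wb.save(dst_path)


print(split_underlined("ab<u>cd</u>ef"))
# [('ab', False), ('cd', True), ('ef', False)]
```

Calling underline_column("input.xlsx", "output.xlsx") would then process column A of the active sheet and write the result to a new file, leaving the original untouched.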