单词评论提取:帮助获得编号标题

时间:2016-09-27 09:47:01

标签: excel vba ms-word header comments

两周前我决定学习VBA,而且它已经相当顺利了。然而,现在,我遇到了一个我自己似乎无法解决的问题。  我已经设置了一个包含各种模块的excel文档。其中一个模块从word文档中提取注释到excel表 - 它按预期工作。

问题是,我无法提取每个评论上方的第一个编号标题,我非常喜欢。目前,我必须在提取注释后手动执行此操作。作为一个例子,我想提取每个评论上方的第一个标题和数字,例如“2.1.1标题”。如果注释突出显示标题本身,则它应该是提取的标题。

我根据我在网上找到的东西尝试了各种各样的东西,但每次遇到各种各样的错误我都无法解决。我还没有找到一些甚至可以分类的东西。我确实尝试了一种显然应该在Word VBA中工作的方法,但我无法在Excel中使用它。

有谁知道如何提取编号标题?任何提示或提示将不胜感激。

这是我对模块的代码:

Sub ImportCommentsDOCX()
    Dim wdDoc As Object
    Dim wdFileName As Variant
    Dim i As Integer

    wdFileName = Application.GetOpenFilename("Word files (*.docx),*.docx", , _
        "Browse for file to be imported")
    If wdFileName = False Then Exit Sub '(user cancelled import file browser)
    Set wdDoc = GetObject(wdFileName) 'open Word file
    '1: if no comments'
    With wdDoc
        If wdDoc.Comments.Count = 0 Then
            MsgBox ("No comments")
        End If
        '2; Set excel headers'
        Range("B" & 1).Value = "Number"
        Range("B" & 1).Font.Bold = True
        Range("C" & 1).Value = "Comment"
        Range("C" & 1).Font.Bold = True
        Range("D" & 1).Value = "Highlighted text"
        Range("D" & 1).Font.Bold = True
        Range("E" & 1).Value = "Initials"
        Range("B" & 1).Font.Bold = True
        Range("F" & 1).Value = "Date (*Imprecise)"
        Range("F" & 1).Font.Bold = True

        '3: Extract comments and meta data'
        For i = 1 To wdDoc.Comments.Count
            Range("B" & 1 + i).Value = wdDoc.Comments(i).Index
            Range("C" & 1 + i).Value = wdDoc.Comments(i).Range
            Range("D" & 1 + i).Value = wdDoc.Comments(i).Scope.FormattedText
            Range("E" & 1 + i).Value = wdDoc.Comments(i).Initial
            Range("F" & 1 + i).Value = Format(wdDoc.Comments(i).Date, "dd/MM/yyyy") 'Unreliable: Sometimes gives wrong date'
            'Range("G" & 3 + i).Value = wdDoc.Comments(i).Range.ListFormat.ListString 'Returns empty'
        Next i
    End With
    Set wdDoc = Nothing
    MsgBox ("Extraction has completed")
End Sub

2 个答案:

答案 0 :(得分:0)

以下是您的代码,并进行了一些调整:

Sub ImportCommentsDOCX()
    Dim wdDoc As Object
    Dim wdFileName As Variant
    Dim i As Integer

    wdFileName = Application.GetOpenFilename("Word files (*.docx),*.docx", , _
        "Browse for file to be imported")
    If wdFileName = False Then Exit Sub '(user cancelled import file browser)
    Set wdDoc = GetObject(wdFileName) 'open Word file
    '1: if no comments'
    With wdDoc
        wdDoc.Activate ' Added
        If wdDoc.Comments.Count = 0 Then
            MsgBox ("No comments")
        End If
        '2; Set excel headers'
        Range("B" & 1).Value = "Number"
        Range("B" & 1).Font.Bold = True
        Range("C" & 1).Value = "Comment"
        Range("C" & 1).Font.Bold = True
        Range("D" & 1).Value = "Highlighted text"
        Range("D" & 1).Font.Bold = True
        Range("E" & 1).Value = "Initials"
        Range("E" & 1).Font.Bold = True ' Modified
        Range("F" & 1).Value = "Date (*Imprecise)"
        Range("F" & 1).Font.Bold = True

        '3: Extract comments and meta data'
        For i = 1 To wdDoc.Comments.Count
            Range("B" & 1 + i).Value = wdDoc.Comments(i).Index
            Range("C" & 1 + i).Value = wdDoc.Comments(i).Range
            Range("D" & 1 + i).Value = wdDoc.Comments(i).Scope.FormattedText
            Range("E" & 1 + i).Value = wdDoc.Comments(i).Initial
            Range("F" & 1 + i).Value = Format(wdDoc.Comments(i).Date, "dd/MM/yyyy") 'Unreliable: Sometimes gives wrong date'
            'Range("G" & 1 + i).Value = wdDoc.Comments(i).Scope.ListFormat.ListString 'Returns empty'    ' Modified ' Updated
            Dim wp As Word.Paragraph: Set wp = wdDoc.Comments(i).Scope.Paragraphs(1) ' Updated
            Do While wp.Range.ListFormat.ListString = "" ' Updated
                Set wp = wp.Previous ' Updated
            Loop ' Updated
            Range("G" & 1 + i).Value = wp.Range.ListFormat.ListString ' Updated
        Next i
    End With
    Set wdDoc = Nothing
    MsgBox ("Extraction has completed")
End Sub

请注意我的评论:已添加和修改

  1. wdDoc.Activate至少在我的电脑上是必需的,否则就是 范围属性为空。
  2. 在首字母缩写后,错误的列以粗体显示
  3. 原始文本由Range属性引用,而不是Scope(注释的内容),因此应使用其ListFormat属性
  4. 行索引不正确(3而不是1)

答案 1 :(得分:0)

看起来对我有用:

这需要 Microsoft VBScript 正则表达式 5.5

Sub commentaires()

   Dim regexOne As Object
   Set regexOne = New RegExp

   regexOne.Pattern = "^\d+\."

   Dim s As String, s1 As String
   Dim cmt As Word.Comment
   Dim doc As Word.Document

   For Each cmt In ActiveDocument.Comments

      Dim wp As Word.Paragraph
      Set wp = cmt.Scope.Paragraphs(1) ' Updated
      Do While Not regexOne.Test(wp.Range.ListFormat.ListString)
         Set wp = wp.Previous ' Updated
      Loop ' Updated
        
      s = s & _
        wp.Range.ListFormat.ListString & ";" & _
        cmt.Reference.Information(wdActiveEndAdjustedPageNumber) & ";""" & _
        cmt.Scope & """;""" & _
        cmt.Range.Text & """ " & vbCr

    Next

    Dim f As Integer
    f = FreeFile   
    Open "c:\comments.csv" For Output As #f
    Print #f, s
    Close #f

End Sub