VBA在“从任何文件中恢复文本”模式下打开.doc

时间:2012-11-27 16:13:35

标签: vba ms-word word-vba .doc

我正在尝试将许多旧的.DOC文件转换为PDF格式或RTF格式。到目前为止,我找到了一个完成后者(转换为RTF),但旧Word应用程序的格式仍然存在于文档中。如果您打开Microsoft Word(我使用2010)并单击文件>打开,有一个下拉菜单,允许您选择“从任何文件中恢复文本()”。是否可以在转换过程中使用它来过滤掉.DOC文档中的格式数据?以下是我目前正在尝试修改的脚本的几个示例:

虽然它似乎只是将.rtf附加到文件的末尾而不是更改格式,但这个有用了:

Sub SaveAllAsDOCX()
Dim strFilename As String
Dim strDocName As String
Dim strPath As String
Dim oDoc As Document
Dim fDialog As FileDialog
Dim intPos As Integer
Set fDialog = Application.FileDialog(msoFileDialogFolderPicker)
With fDialog
    .Title = "Select folder and click OK"
    .AllowMultiSelect = False
    ..InitialView = msoFileDialogViewList
    If .Show <> -1 Then
        MsgBox "Cancelled By User", , "List Folder Contents"
        Exit Sub
    End If
    strPath = fDialog.SelectedItems.Item(1)
    If Right(strPath, 1) <> "\" Then strPath = strPath + "\"
End With
If Documents.Count > 0 Then
    Documents.Close SaveChanges:=wdPromptToSaveChanges
End If
If Left(strPath, 1) = Chr(34) Then
    strPath = Mid(strPath, 2, Len(strPath) - 2)
End If
strFilename = Dir$(strPath & "*.doc")
While Len(strFilename) <> 0
    Set oDoc = Documents.Open(strPath & strFilename)
    strDocName = ActiveDocument.FullName
    intPos = InStrRev(strDocName, ".")
    strDocName = Left(strDocName, intPos - 1)
    strDocName = strDocName & ".docx"
    oDoc.SaveAs FileName:=strDocName, _
        FileFormat:=wdFormatDocumentDefault
    oDoc.Close SaveChanges:=wdDoNotSaveChanges
    strFilename = Dir$()
Wend
End Sub

到目前为止,在任何转换中都没有成功:

Option Explicit
Sub ChangeDocsToTxtOrRTFOrHTML()
'with export to PDF in Word 2007
    Dim fs As Object
    Dim oFolder As Object
    Dim tFolder As Object
    Dim oFile As Object
    Dim strDocName As String
    Dim intPos As Integer
    Dim locFolder As String
    Dim fileType As String
    On Error Resume Next
    locFolder = InputBox("Enter the folder path to DOCs", "File Conversion", "C:\myDocs")
    Select Case Application.Version
        Case Is < 12
            Do
                fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML", "File Conversion", "TXT"))
            Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML")
        Case Is >= 12
            Do
                fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML or PDF(2007+ only)", "File Conversion", "TXT"))
            Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML" Or fileType = "PDF")
    End Select
    Application.ScreenUpdating = False
    Set fs = CreateObject("Scripting.FileSystemObject")
    Set oFolder = fs.GetFolder(locFolder)
    Set tFolder = fs.CreateFolder(locFolder & "Converted")
    Set tFolder = fs.GetFolder(locFolder & "Converted")
    For Each oFile In oFolder.Files
        Dim d As Document
        Set d = Application.Documents.Open(oFile.Path)
        strDocName = ActiveDocument.Name
        intPos = InStrRev(strDocName, ".")
        strDocName = Left(strDocName, intPos - 1)
        ChangeFileOpenDirectory tFolder
        Select Case fileType
        Case Is = "TXT"
            strDocName = strDocName & ".txt"
            ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatText
        Case Is = "RTF"
            strDocName = strDocName & ".rtf"
            ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatRTF
        Case Is = "HTML"
            strDocName = strDocName & ".html"
            ActiveDocument.SaveAs FileName:=strDocName, FileFormat:=wdFormatFilteredHTML
        Case Is = "PDF"
            strDocName = strDocName & ".pdf"

            ' *** Word 2007 users - remove the apostrophe at the start of the next line ***
            'ActiveDocument.ExportAsFixedFormat OutputFileName:=strDocName, ExportFormat:=wdExportFormatPDF

        End Select
        d.Close
        ChangeFileOpenDirectory oFolder
    Next oFile
    Application.ScreenUpdating = True
End Sub

1 个答案:

答案 0 :(得分:1)

我将使用VBA脚本覆盖一种方式来执行您想要的操作,而无需使用Word内置的“从任何文件中恢复文本”模式功能。

它将一个目录中的每个.doc / .docx转换为.txt,但可以用于转换为父应用程序支持的任何其他格式(我使用Word 2010测试过)。如下:

'------------ VBA script start -------------
Sub one1()
Set fs = CreateObject("Scripting.FileSystemObject")
Set list1 = fs.GetFolder(ActiveDocument.Path)
For Each fl In list1.files
  If InStr(fl.Type, "Word") >= 1 And Not fl.Path = ActiveDocument.Path & "\" & ActiveDocument.Name Then
    Set wordapp = CreateObject("word.Application")
    Set Doc1 = wordapp.Documents.Open(fl.Path)
    'wordapp.Visible = True
    Doc1.SaveAs2 FileName:=fl.Name & ".txt", fileformat:=wdFormatText
    wordapp.Quit
  End If
Next
End Sub
'------------ VBA script start -------------

要另存为PDF,请使用

Doc1.SaveAs2 FileName:=fl.Name & ".pdf", fileformat:=wdFormatPDF

代替

保存为RTF,请使用

Doc1.SaveAs2 FileName:=fl.Name & ".rtf", fileformat:=wdFormatRTF 

代替

或者说,HTML:

Doc1.SaveAs2 FileName:=fl.Name & ".html", fileformat:=wdFormatHTML

等等。

我没有费心检查的一些缺点,因为它们是无害的:

  • 在执行结束时会弹出一条错误消息,但不会产生任何后果。

  • 它试图打开自己,因为它是文档本身内部的VBA脚本,而且它是一个文档开启脚本。然后,当弹出一条消息时,你必须指示“他”以只读方式打开它。

  • 它会将所有文档保存到C:\ users \ username \ Documents中,而不是从中执行的文档,在大多数情况下会更好。

  • 过程缓慢,预计大多数普通个人电脑的速度为2-3档/秒。