Basic PowerShell - Batch convert Word Docx to PDF

Time: 2013-05-14 02:44:24

Tags: powershell powershell-v2.0 powershell-v3.0

I am trying to batch convert Word Docx files to PDF with PowerShell, using a script from this site: http://blogs.technet.com/b/heyscriptingguy/archive/2013/03/24/weekend-scripter-convert-word-documents-to-pdf-files-with-powershell.aspx

# Acquire a list of DOCX files in a folder
$Files=GET-CHILDITEM "C:\docx2pdf\*.DOCX"
$Word=NEW-OBJECT –COMOBJECT WORD.APPLICATION

Foreach ($File in $Files) {
    # open a Word document, filename from the directory
    $Doc=$Word.Documents.Open($File.fullname)

    # Swap out DOCX with PDF in the Filename
    $Name=($Doc.Fullname).replace("docx","pdf")

    # Save this File as a PDF in Word 2010/2013
    $Doc.saveas([ref] $Name, [ref] 17)  
    $Doc.close()
}

I keep getting this error and can't figure out why:

PS C:\docx2pdf> .\docx2pdf.ps1
Exception calling "SaveAs" with "16" argument(s): "Command failed"
At C:\docx2pdf\docx2pdf.ps1:13 char:13
+     $Doc.saveas <<<< ([ref] $Name, [ref] 17)
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : DotNetMethodException

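Any ideas?

Also, how would I change it to convert doc (not docx) files, and to use local files (files in the same location as the script)?

Sorry, I have never done any PowerShell scripting...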

4 Answers:

Answer 0: (score: 49)

This works for both doc and docx files.

$documents_path = 'c:\doc2pdf'

$word_app = New-Object -ComObject Word.Application

# This filter will find .doc as well as .docx documents
Get-ChildItem -Path $documents_path -Filter *.doc? | ForEach-Object {

    $document = $word_app.Documents.Open($_.FullName)

    $pdf_filename = "$($_.DirectoryName)\$($_.BaseName).pdf"

    $document.SaveAs([ref] $pdf_filename, [ref] 17)

    $document.Close()
}

$word_app.Quit()
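
A side note on why building the PDF name from DirectoryName and BaseName is safer than the replace("docx","pdf") call in the question. This is only a guess at the cause of the "Command failed" error, since nothing in the thread confirms it: String.Replace rewrites every occurrence of "docx" in the full path, including the folder name C:\docx2pdf, so Word may be asked to save into a folder that does not exist. It is also case-sensitive, so a file ending in .DOCX would keep its name unchanged. The path below is a made-up example, and [System.IO.Path]::ChangeExtension is shown as one alternative, not as what any of the scripts here use.

```powershell
# Hypothetical input path, mirroring the folder name used in the question.
$source = 'C:\docx2pdf\report.docx'

# String.Replace swaps every occurrence, so the folder name changes too.
$viaReplace = $source.Replace('docx', 'pdf')

# ChangeExtension only touches the part after the final dot.
$viaChangeExtension = [System.IO.Path]::ChangeExtension($source, '.pdf')

$viaReplace          # C:\pdf2pdf\report.pdf
$viaChangeExtension  # C:\docx2pdf\report.pdf
```

This part is plain string handling, so it can be tried in a console without Word installed.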

Answer 1: (score: 3)

This works for me (Word 2007):

$wdFormatPDF = 17
$word = New-Object -ComObject Word.Application
$word.visible = $false

$folderpath = Split-Path -parent $MyInvocation.MyCommand.Path

Get-ChildItem -path $folderpath -recurse -include "*.doc" | % {
    $path =  ($_.fullname).substring(0,($_.FullName).lastindexOf("."))
    $doc = $word.documents.open($_.fullname)
    $doc.saveas($path, $wdFormatPDF) 
    $doc.close()
}

$word.Quit()
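
For the "files in the same location as the script" part of the question, here is a minimal sketch of just the folder lookup, separate from the Word automation. It assumes PowerShell 3.0 or later may be available, where the automatic variable $PSScriptRoot is set inside a running script; on 2.0 it falls back to the Split-Path call used above. The final fallback to the current directory and the variable names are my own additions for illustration.

```powershell
# Work out the folder that contains the running script.
if ($PSScriptRoot) {
    # PowerShell 3.0+: set automatically when running from a script file.
    $folderPath = $PSScriptRoot
} elseif ($MyInvocation.MyCommand.Path) {
    # PowerShell 2.0: derive the folder from the script's own path.
    $folderPath = Split-Path -Parent $MyInvocation.MyCommand.Path
} else {
    # Not running from a script file (e.g. pasted into a console): use the current directory.
    $folderPath = (Get-Location).Path
}

# List .doc and .docx files in that folder only (no recursion).
# -contains compares case-insensitively, so .DOCX is matched as well.
$candidates = Get-ChildItem -Path $folderPath |
    Where-Object { -not $_.PSIsContainer -and ('.doc', '.docx' -contains $_.Extension) }

$candidates | ForEach-Object { $_.FullName }
```

Each item in $candidates could then be passed to $word.documents.open in the same way as in the loop above.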

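Answer 2: (score: 3)

None of the answers above quite worked for me, because I was running a batch job converting around 70,000 Word documents this way. It turns out that doing this repeatedly eventually makes Word crash, probably because of memory problems (the error was some COMException that I didn't know how to parse). So to keep it going, I kill and restart Word every 100 documents (an arbitrarily chosen number).

Also, when it did occasionally crash, it left behind malformed PDFs, each typically 1-2 kb in size. So when skipping PDFs that have already been generated, I make sure they are larger than 3 kb. If you don't want to skip already generated PDFs, you can remove that if statement.

Sorry if my code doesn't look great. I don't generally use Windows and this was a one-off hack. So, here is the resulting code: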

$Files=Get-ChildItem -path '.\path\to\docs' -recurse -include "*.doc*"

$counter = 0
$filesProcessed = 0
$Word = New-Object -ComObject Word.Application

Foreach ($File in $Files) {
    $Name="$(($File.FullName).substring(0, $File.FullName.lastIndexOf("."))).pdf"
    if ((Test-Path $Name) -And (Get-Item $Name).length -gt 3kb) {
        echo "skipping $($Name), already exists"
        continue
    }

    echo "$($filesProcessed): processing $($File.FullName)"
    $Doc = $Word.Documents.Open($File.FullName)
    $Doc.SaveAs($Name, 17)
    $Doc.Close()
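    $counter = $counter + 1
    $filesProcessed = $filesProcessed + 1
    if ($counter -ge 100) {
        # Restart Word every 100 documents to avoid the crashes described above
        $counter = 0
        $Word.Quit()
        [System.Runtime.Interopservices.Marshal]::ReleaseComObject($Word) | Out-Null
        $Word = New-Object -ComObject Word.Application
    }
}

# Close the last Word instance once the batch is done
$Word.Quit()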

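Answer 3: (score: 1)

None of the solutions posted here worked for me on Windows 8.1 (I'm using Office 365, by the way). My PowerShell somehow doesn't like the [ref] arguments (I don't know why, I rarely use PowerShell).

This is the solution that worked for me: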

$Files=Get-ChildItem 'C:\path\to\files\*.docx'

$Word = New-Object -ComObject Word.Application

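Foreach ($File in $Files) {
    $Doc = $Word.Documents.Open($File.FullName)
    # Change only the extension; replace('docx', 'pdf') would also rewrite "docx" anywhere else in the path
    $Name = [System.IO.Path]::ChangeExtension($File.FullName, '.pdf')
    $Doc.SaveAs($Name, 17)
    $Doc.Close()
}

# Quit Word so no WINWORD.EXE process is left running
$Word.Quit()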