I'm using the following PowerShell script to open a few thousand HTML files and "Save As..." Word documents.
param([string]$htmpath,[string]$docpath = $docpath)
$srcfiles = Get-ChildItem $htmPath -filter "*.htm*"
$saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat], "wdFormatDocument");
$word = new-object -comobject word.application
$word.Visible = $False
function saveas-document
{
$opendoc = $word.documents.open($doc.FullName);
$opendoc.saveas([ref]"$docpath\$doc.FullName.doc", [ref]$saveFormat);
$opendoc.close();
}
ForEach ($doc in $srcfiles)
{
Write-Host "Processing :" $doc.FullName
saveas-document
$doc = $null
}
$word.quit();
The content comes out great, but my file names are not what I expect.
$opendoc.saveas([ref]"$docpath\$doc.FullName.doc", [ref]$saveFormat);
results in foo.htm being saved as foo.htm.FullName.doc instead of foo.doc.
$opendoc.saveas([ref]"$docpath\$doc.BaseName.doc", [ref]$saveFormat);
yields foo.htm.BaseName.doc.
How do I set the Save As... file name variable to the concatenation of BaseName and .doc?
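The file names described above are consistent with how PowerShell expands variables inside double-quoted strings: only the variable itself is expanded, and a trailing `.FullName` or `.BaseName` is kept as literal text. A minimal sketch of wrapping the property access in a subexpression `$( )` follows. It assumes a stand-in object: `$doc` here is a hypothetical `pscustomobject` rather than a real file, and `C:\out` is a made-up folder.

```powershell
# Hypothetical stand-in for a file object returned by Get-ChildItem.
$doc = [pscustomobject]@{ BaseName = 'foo'; Name = 'foo.htm' }
$docpath = 'C:\out'   # made-up output folder

# Only $doc is expanded here; ".BaseName.doc" is appended as literal text.
$wrong = "$docpath\$doc.BaseName.doc"

# $( ) evaluates the property access before it is placed in the string.
$right = "$docpath\$($doc.BaseName).doc"

# Plain concatenation avoids string expansion altogether.
$alsoRight = $docpath + '\' + $doc.BaseName + '.doc'

Write-Host $wrong
Write-Host $right
Write-Host $alsoRight
```

In the script from the question, the same idea would mean passing `"$docpath\$($doc.BaseName).doc"` to `SaveAs`.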
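Answer 0 (score: 0)
Based on our comments above, it seems you just need to move the files. The following works for me. In the current directory, it replaces the .txt extension with the .py extension. I found the command here.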
PS C:\testing> dir *.txt | Move-Item -Destination {[IO.Path]::ChangeExtension( $_.Name, "py")}
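You can also change *.txt to C:\path\to\file\*.txt so that you don't have to run this line from the files' location. You should be able to define the destination in a similar way, so I'll report back if I find a simple way to do it.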
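As a small illustration of what `[IO.Path]::ChangeExtension` returns, here is a sketch using made-up file names; no files are touched. Keep in mind that it only computes a new name, so combined with `Move-Item` it renames a file without converting its contents.

```powershell
# Made-up file names, used only to show the string that comes back.
$newName = [IO.Path]::ChangeExtension('foo.htm', 'doc')

# The leading dot on the new extension is optional.
$withDot = [IO.Path]::ChangeExtension('report.html', '.docx')

# Equivalent by hand: strip the old extension, then append the new one.
$manual = [IO.Path]::GetFileNameWithoutExtension('foo.htm') + '.doc'

Write-Host $newName
Write-Host $withDot
Write-Host $manual
```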
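Also, while searching I found Microsoft's TechNet Library. It has a lot of tutorials on scripting with PowerShell. Files and Folders, Part 3: Windows PowerShell should help you find more information on copying and moving files.
Answer 1 (score: 0)
I had trouble converting files from .html to .docx. I changed your code above to: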
function Convert-HTMLtoDocx {
    param([string]$htmpath)
    $srcfiles = Get-ChildItem $htmpath -Filter "*.htm*"
    # .docx needs wdFormatXMLDocument rather than wdFormatDocument
    $saveFormat = [Microsoft.Office.Interop.Word.WdSaveFormat]::wdFormatXMLDocument
    $word = New-Object -ComObject Word.Application
    $word.Visible = $False
    ForEach ($doc in $srcfiles) {
        Write-Host "Processing :" $doc.FullName
        # Build <source folder>\<BaseName>.docx
        $name = Join-Path -Path $doc.DirectoryName -ChildPath ($doc.BaseName + ".docx")
        $opendoc = $word.Documents.Open($doc.FullName)
        # $name is a plain string, so pass it directly ($name.Value would be $null)
        $opendoc.SaveAs([ref]$name, [ref]$saveFormat)
        $opendoc.Close()
        $doc = $null
    } #End ForEach
    $word.Quit()
} #End Function
The problem was the save format. For whatever reason, to save a document as .docx you need to specify the format as wdFormatXMLDocument rather than wdFormatDocument.
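One way to keep the file extension and the save format in step is a small lookup table. The sketch below is only an illustration under stated assumptions: the function name `Get-WordSaveFormat` and the variable `$WordSaveFormats` are hypothetical, and the numbers are the documented `WdSaveFormat` values (wdFormatDocument = 0, wdFormatDOSText = 4, wdFormatFilteredHTML = 10, wdFormatXMLDocument = 12), which are worth checking against the enumeration reference for your Office version.

```powershell
# Hypothetical lookup table: target extension -> WdSaveFormat numeric value.
$WordSaveFormats = @{
    '.doc'  = 0    # wdFormatDocument
    '.txt'  = 4    # wdFormatDOSText
    '.htm'  = 10   # wdFormatFilteredHTML
    '.docx' = 12   # wdFormatXMLDocument
}

# Hypothetical helper: returns the numeric format for a target file name.
function Get-WordSaveFormat {
    param([Parameter(Mandatory)][string]$Path)

    $ext = [IO.Path]::GetExtension($Path).ToLowerInvariant()
    if (-not $WordSaveFormats.ContainsKey($ext)) {
        throw "No save format known for extension '$ext'"
    }
    return $WordSaveFormats[$ext]
}

Write-Host (Get-WordSaveFormat 'foo.docx')
Write-Host (Get-WordSaveFormat 'foo.doc')
```

The returned value could then stand in for the enum member in the `SaveAs` call, assuming Word accepts the plain integer on your setup.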
Answer 2 (score: 0)
$docpath = "\\sf-xyz-serverabc01\ChangeTheseDocuments"
$WdTypes = Add-Type -AssemblyName 'Microsoft.Office.Interop.Word, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c' -Passthru
$srcfiles = get-childitem $docpath -filter "*.doc" -rec | where {!$_.PSIsContainer} | select-object FullName
$saveFormat = $WdTypes | Where {$_.Name -eq 'WdSaveFormat'}
$word = new-object -comobject word.application
$word.Visible = $False
function saveas-filteredhtml
{
$opendoc = $word.documents.open($doc.FullName);
$Name = [IO.Path]::ChangeExtension($doc.FullName, "htm")  # .replace("doc","htm") would also rewrite any other "doc" in the path
$opendoc.saveas([ref]$Name, [ref]$saveFormat::wdFormatFilteredHTML);
$opendoc.close();
}
ForEach ($doc in $srcfiles)
{
Write-Host "Processing :" $doc.FullName
saveas-filteredhtml
$doc = $null
}
$word.quit();
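When the output name is derived from the input path, it helps to remember that `String.Replace` is case-sensitive and rewrites every occurrence of the search text, not just the extension. A minimal sketch comparing it with `[IO.Path]::ChangeExtension`, using a made-up path (`C:\docs\report.doc` is hypothetical and no file is touched):

```powershell
# Made-up input path that contains "doc" in a folder name as well.
$source = 'C:\docs\report.doc'

# Replace changes every "doc" in the string, including the folder name.
$viaReplace = $source.Replace('doc', 'htm')

# ChangeExtension only touches the part after the last dot.
$viaChangeExtension = [IO.Path]::ChangeExtension($source, 'htm')

Write-Host $viaReplace
Write-Host $viaChangeExtension
```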
Answer 3 (score: 0)
I know this is an older post, but I'm posting this code here so that it can be found in the future.
Here is a LINK to the different formats you can save to.
$docpath = "C:\Temp"
$WdTypes = Add-Type -AssemblyName 'Microsoft.Office.Interop.Word, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c' -Passthru
$srcfiles = get-childitem $docpath -filter "*.doc" -rec | where {!$_.PSIsContainer} | select-object FullName
$saveFormat = $WdTypes | Where {$_.Name -eq 'WdSaveFormat'}
$word = new-object -comobject word.application
$word.Visible = $False
function saveas-filteredhtml
{
$opendoc = $word.documents.open($doc.FullName);
$Name = [IO.Path]::ChangeExtension($doc.FullName, ".txt")  # covers both .doc and .docx without touching the rest of the path
$opendoc.saveas([ref]$Name, [ref]$saveFormat::wdFormatDOSText); ##wdFormatDocument
$opendoc.close();
}
ForEach ($doc in $srcfiles)
{
Write-Host "Processing :" $doc.FullName
saveas-filteredhtml
$doc = $null
}
$word.quit();