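I have a script that splits a Word document into separate Word documents. Each split document is 1 page long.
What changes do I need to make so that the split documents are named sequentially, like Dream_File01.docx, Dream_File02.docx, Dream_File03.docx, Dream_File04.docx, Dream_File05.docx, Dream_File06.docx, and so on?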
## -- Settings --
#$fileNamePattern = "ID #:\s+(\d+)"
$fileNamePattern = "Student ID #:\s+# (\d+)"
$pageLength = 1
$inputFile = "Dream_File.docx"
$outputPath = "outputDir\" #End the path with a slash
## -- End Settings
[ref]$SaveFormat = "microsoft.office.interop.word.WdSaveFormat" -as [type]
$word = New-Object -ComObject word.application
$word.Visible = $true
$doc = $word.Documents.Open($inputFile)
$pages = $doc.ComputeStatistics([Microsoft.Office.Interop.Word.WdStatistic]::wdStatisticPages)
$rngPage = $doc.Range()
for($i=1;$i -le $pages; $i+=$pageLength)
{
[Void]$word.Selection.GoTo([Microsoft.Office.Interop.Word.WdGoToItem]::wdGoToPage,
[Microsoft.Office.Interop.Word.WdGoToDirection]::wdGoToAbsolute,
$i #Starting Page
)
$rngPage.Start = $word.Selection.Start
[Void]$word.Selection.GoTo([Microsoft.Office.Interop.Word.WdGoToItem]::wdGoToPage,
[Microsoft.Office.Interop.Word.WdGoToDirection]::wdGoToAbsolute,
$i+$pageLength #Next page Number
)
$rngPage.End = $word.Selection.Start
$marginTop = $word.Selection.PageSetup.TopMargin
$marginBottom = $word.Selection.PageSetup.BottomMargin
$marginLeft = $word.Selection.PageSetup.LeftMargin
$marginRight = $word.Selection.PageSetup.RightMargin
$rngPage.Copy()
$newDoc = $word.Documents.Add()
$word.Selection.PageSetup.TopMargin = $marginTop
$word.Selection.PageSetup.BottomMargin = $marginBottom
$word.Selection.PageSetup.LeftMargin = $marginLeft
$word.Selection.PageSetup.RightMargin = $marginRight
$word.Selection.Paste() # Now we have our new page on a new doc
$word.Selection.EndKey(6,0) #Move to the end of the file
$word.Selection.TypeBackspace() #Seems to grab an extra section/page break
$word.Selection.Delete() #Now we have our doc down to size
#Get Name
$regex = [Regex]::Match($rngPage.Text, $fileNamePattern)
if($regex.Success)
{
$id = $regex.Groups[1].Value
}
else
{
$id = "patternNotFound_" + $i
}
$path = $outputPath + $id + ".docx"
$newDoc.saveas([ref] $path, [ref]$SaveFormat::wdFormatDocumentDefault)
$newDoc.close()
Remove-Variable(regex)
Remove-Variable(id)
}
[gc]::collect()
[gc]::WaitForPendingFinalizers()
Answer 0 (score: 1)
If the output you want is a series of Word documents named Dream_File_01.docx, Dream_File_02.docx, and so on, then I really don't understand why you are using a regex to get the StudentID.
As Lee_Dailey said, the $i variable already holds the page counter you need for sequential naming.
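As a minimal sketch of that idea, the -f format operator can turn the counter into a zero-padded file name. The fixed range 1..3 and the $baseName and $names variables are assumptions for illustration only; in the script, $i comes from the for loop and no Word automation is needed to try this out.

```powershell
# Illustration only: build sequential names from a counter, without touching Word.
$inputFile = 'Dream_File.docx'

# Strip the extension so the base name can be reused for every output file.
$baseName = [System.IO.Path]::GetFileNameWithoutExtension($inputFile)

# 1..3 stands in for the page counter $i of the real loop (assumed range).
$names = foreach ($i in 1..3) {
    # {1:00} pads the counter to at least two digits.
    '{0}{1:00}.docx' -f $baseName, $i
}

$names
```

This yields names in the Dream_File01.docx style from the question. Putting an underscore between the two placeholders ('{0}_{1:00}.docx') gives the Dream_File_01.docx style instead.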
Anyway, here is my revision of your script. Please read the comments I added to it so you can choose whether or not to use the StudentID:
## -- Settings --
#$fileNamePattern = "ID #:\s+(\d+)"
$fileNamePattern = "Student ID #:\s+# (\d+)"
$pageLength = 1
$inputFile = "Dream_File.docx" # Word resolves relative paths against its own working directory, so a full path is safer.
$outputPath = "outputDir" # Join-Path is used below, so the path does not need to end with a backslash.
## -- End Settings
[ref]$SaveFormat = "Microsoft.Office.Interop.Word.WdSaveFormat" -as [type]
$word = New-Object -ComObject Word.Application
$word.Visible = $true
$doc = $word.Documents.Open($inputFile)
$pages = $doc.ComputeStatistics([Microsoft.Office.Interop.Word.WdStatistic]::wdStatisticPages)
$rngPage = $doc.Range()
for($i = 1; $i -le $pages; $i += $pageLength) {
[void]$word.Selection.GoTo([Microsoft.Office.Interop.Word.WdGoToItem]::wdGoToPage,
[Microsoft.Office.Interop.Word.WdGoToDirection]::wdGoToAbsolute,
$i #Starting Page
)
$rngPage.Start = $word.Selection.Start
[void]$word.Selection.GoTo([Microsoft.Office.Interop.Word.WdGoToItem]::wdGoToPage,
[Microsoft.Office.Interop.Word.WdGoToDirection]::wdGoToAbsolute,
$i+$pageLength #Next page Number
)
$rngPage.End = $word.Selection.Start
$marginTop = $word.Selection.PageSetup.TopMargin
$marginBottom = $word.Selection.PageSetup.BottomMargin
$marginLeft = $word.Selection.PageSetup.LeftMargin
$marginRight = $word.Selection.PageSetup.RightMargin
$rngPage.Copy()
$newDoc = $word.Documents.Add()
$word.Selection.PageSetup.TopMargin = $marginTop
$word.Selection.PageSetup.BottomMargin = $marginBottom
$word.Selection.PageSetup.LeftMargin = $marginLeft
$word.Selection.PageSetup.RightMargin = $marginRight
$word.Selection.Paste() # Now we have our new page on a new doc
$word.Selection.EndKey(6,0) # Move to the end of the file
$word.Selection.TypeBackspace() # Seems to grab an extra section/page break
$word.Selection.Delete() # Now we have our doc down to size
# I don't fully understand this part.
# Why do you want to get the Student ID number here? Should that be part of the filename?
# Can you be sure that on every page this number can be found?
$regex = [Regex]::Match($rngPage.Text, $fileNamePattern)
if($regex.Success) {
# Get the filename without extension from the $inputFile string and append the student ID, the pagecounter '$i' and '.docx' extension to it
$newFileName = '{0}_{1}_{2:00}.docx' -f [System.IO.Path]::GetFileNameWithoutExtension($inputFile), $regex.Groups[1].Value, $i
}
else {
# Get the filename without extension from the $inputFile string and append 'patternNotFound', the pagecounter '$i' and '.docx' extension to it
$newFileName = '{0}_patternNotFound_{1:00}.docx' -f [System.IO.Path]::GetFileNameWithoutExtension($inputFile), $i
}
# If the Student ID is not needed in the filename, use this instead:
# Get the filename without extension from the $inputFile string and append the pagecounter '$i' and '.docx' extension to it
# $newFileName = '{0}_{1:00}.docx' -f [System.IO.Path]::GetFileNameWithoutExtension($inputFile), $i
# Next, combine the output path with the new filename for SaveAs()
$path = Join-Path -Path $outputPath -ChildPath $newFileName
$newDoc.SaveAs([ref] $path, [ref]$SaveFormat::wdFormatDocumentDefault)
$newDoc.Close()
}
# You're done: exit Word and clean up the COM object
$word.Quit()
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($word) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
Hope that helps.