我需要为特定文本检查大量的word文档(doc& docx),并找到了脚本专家的精彩教程和脚本;
脚本读取目录中的所有文档并提供以下输出;
这就是我所需要的,但是他们的代码似乎并没有真正检查任何文档的标题,顺便提一下,我正在查找的特定文本所在的位置。任何提示&使脚本读取标题文本的技巧会让我非常高兴。
另一种解决方案可能是删除格式,以便标题文本成为文档其余部分的一部分?这可能吗?
编辑:忘记链接脚本:
[cmdletBinding()]
Param(
$Path = "C:\Users\use\Desktop\"
) #end param
$matchCase = $false
$matchWholeWord = $true
$matchWildCards = $false
$matchSoundsLike = $false
$matchAllWordForms = $false
$forward = $true
$wrap = 1
$application = New-Object -comobject word.application
$application.visible = $False
$docs = Get-childitem -path $Path -Recurse -Include *.docx
$findText = "specific text"
$i = 1
$totalwords = 0
$totaldocs = 0
Foreach ($doc in $docs)
{
Write-Progress -Activity "Processing files" -status "Processing $($doc.FullName)" -PercentComplete ($i /$docs.Count * 100)
$document = $application.documents.open($doc.FullName)
$range = $document.content
$null = $range.movestart()
$wordFound = $range.find.execute($findText,$matchCase,
$matchWholeWord,$matchWildCards,$matchSoundsLike,
$matchAllWordForms,$forward,$wrap)
if($wordFound)
{
$doc.fullname
$document.Words.count
$totaldocs ++
$totalwords += $document.Words.count
} #end if $wordFound
$document.close()
$i++
} #end foreach $doc
$application.quit()
"There are $totaldocs and $($totalwords.tostring('N')) words"
#clean up stuff
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($range) | Out-Null
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($document) | Out-Null
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($application) | Out-Null
Remove-Variable -Name application
[gc]::collect()
[gc]::WaitForPendingFinalizers()
编辑2:我的同事决定调用节标题;
Foreach ($doc in $docs)
{
Write-Progress -Activity "Processing files" -status "Processing $($doc.FullName)" -PercentComplete ($i /$docs.Count * 100)
$document = $application.documents.open($doc.FullName)
# Load first section of the document
$section = $doc.sections.item(1);
# Load header
$header = $section.headers.Item(1);
# Set the range to be searched to only Header
$range = $header.content
$null = $range.movestart()
$wordFound = $range.find.execute($findText,$matchCase,
$matchWholeWord,$matchWildCards,$matchSoundsLike,
$matchAllWordForms,$forward,$wrap,$Format)
if($wordFound) [script continues as above]
但这会遇到以下错误:
You cannot call a method on a null-valued expression.
At C:\Users\user\Desktop\count_mod.ps1:27 char:31
+ $section = $doc.sections.item <<<< (1);
+ CategoryInfo : InvalidOperation: (item:String) [], RuntimeException
+ FullyQualifiedErrorId : InvokeMethodOnNull
You cannot call a method on a null-valued expression.
At C:\Users\user\Desktop\count_mod.ps1:29 char:33
+ $header = $section.headers.Item <<<< (1);
+ CategoryInfo : InvalidOperation: (Item:String) [], RuntimeException
+ FullyQualifiedErrorId : InvokeMethodOnNull
You cannot call a method on a null-valued expression.
At C:\Users\user\Desktop\count_mod.ps1:33 char:26
+ $null = $range.movestart <<<< ()
+ CategoryInfo : InvalidOperation: (movestart:String) [], RuntimeException
+ FullyQualifiedErrorId : InvokeMethodOnNull
You cannot call a method on a null-valued expression.
At C:\Users\user\Desktop\count_mod.ps1:35 char:34
+ $wordFound = $range.find.execute <<<< ($findText,$matchCase,
+ CategoryInfo : InvalidOperation: (execute:String) [], RuntimeException
+ FullyQualifiedErrorId : InvokeMethodOnNull
这是正确的方法还是死路一条?
答案 0 :(得分:1)
如果您需要标题文字,可以尝试以下方法:
$document.content.Sections.First.Headers.Item(1).range.text
答案 1 :(得分:0)
对于任何在将来看这个问题的人:某些东西并不适合我上面的代码。它似乎返回一个误报并放置$ wordFound = 1,无论文档的内容如何,因此列出了在$ path下找到的所有文档。
在Find.Execute中编辑变量似乎并没有改变$ wordFound的结果。我相信问题可能出现在我的$ range中,因为它是我在逐步完成代码时遇到错误的唯一地方。
列出的错误;
You cannot call a method on a null-valued expression.
At C:\Users\user\Desktop\Powershell\count.ps1:24 char:58
+ $range = $document.content.Structures.First.Headers.Item <<<< (1).range.Text
+ CategoryInfo : InvalidOperation: (Item:String) [], RuntimeException
+ FullyQualifiedErrorId : InvokeMethodOnNull
Exception calling "MoveStart" with "0" argument(s): "The RPC server is unavailable. (Exception from HRESULT: 0x800706BA)"
At C:\Users\user\Desktop\Powershell\count.ps1:25 char:26
+ $null = $range.MoveStart <<<< ()
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : ComMethodCOMException
You cannot call a method on a null-valued expression.
At C:\Users\user\Desktop\Powershell\count.ps1:26 char:34
+ $wordFound = $range.Find.Execute <<<< ($findText,$matchCase,
+ CategoryInfo : InvalidOperation: (Execute:String) [], RuntimeException
+ FullyQualifiedErrorId : InvokeMethodOnNull