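I am trying to return the highest 4-digit number found in a string pattern across a set of documents.
String pattern: 3 letters, a dash, 4 digits
The Word documents contain a document identifier code, as shown below.
Sample files: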
Car Parts.docx> CPW - 2345
CarHandles.docx> CPW - 8723
CarList.docx> CPA - 9083
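I've included the sample code I would like to adapt. I'm not a VBA or PowerShell programmer, so I may be going about this the wrong way.
I'd be happy to see alternative approaches on the Windows platform.
I used these references to get started: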
http://chris-nullpayload.rhcloud.com/2012/07/find-and-replace-string-in-all-docx-files-recursively/
PowerShell: return the number of instances find in a file for a search pattern
Powershell: return filename with highest number
$list = gci "C:\Users\WP\Desktop\SearchFiles" -Include *.docx -Force -recurse
foreach ($foo in $list) {
$objWord = New-Object -ComObject word.application
$objWord.Visible = $False
$objDoc = $objWord.Documents.Open("$foo")
$objSelection = $objWord.Selection
$Pat1 = [regex]'[A-Z]{3}-[0-9]{4}' # Find the regex match 3 letters followed by 4 numbers eg HGW - 1024
$findtext= "$Pat1"
$highestNumber =
# Find the highest occurrence of this pattern found in the documents searched - output to text file or on screen
Sort-Object | # This may also be wrong -I added it for when I find the pattern
Select-Object -Last 1 -ExpandProperty Name
<# The below may not be needed - ?
$ReplaceText = ""
$ReplaceAll = 2
$FindContinue = 1
$MatchFuzzy = $False
$MatchCase = $False
$MatchPhrase = $false
$MatchWholeWord = $True
$MatchWildcards = $True
$MatchSoundsLike = $False
$MatchAllWordForms = $False
$Forward = $True
$Wrap = $FindContinue
$Format = $False
$objSelection.Find.execute(
$FindText,
$MatchCase,
$MatchWholeWord,
$MatchWildcards,
$MatchSoundsLike,
$MatchAllWordForms,
$Forward,
$Wrap,
$Format,
$ReplaceText,
$ReplaceAll
}
}
#>
I'd appreciate any advice on how to proceed.
Answer 0 (score: 2)
Try this:
template <typename EnumType> class LazyInitSingleton
{
public:
static EnumWithString<EnumType>& getInstance(const EnumWithString<EnumType>::BMEnumType& arg)
{
static EnumWithString<EnumType> theInstance(arg);
return theInstance;
}
};
LazyInitSingleton<ColorType>::getInstance(...);
The main idea here is not to go through Word's COM API at all, but to extract the text from the documents manually.
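A minimal sketch of that idea, not a definitive implementation: a .docx file is a ZIP archive, and this assumes the identifier sits in the main body part, `word/document.xml` (headers and footers live in other parts and are not read here). The helper names `Get-DocxText` and `Get-HighestDocCode` are hypothetical, and the pattern allows optional spaces around the dash, as in the sample data.

```powershell
Add-Type -AssemblyName System.IO.Compression
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: returns the body text of a .docx with all XML tags removed.
function Get-DocxText {
    param([Parameter(Mandatory)][string]$Path)

    # .NET resolves relative paths differently from PowerShell, so use the full path.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $zip = [System.IO.Compression.ZipFile]::OpenRead($fullPath)
    try {
        $entry = $zip.GetEntry('word/document.xml')
        if ($null -eq $entry) { return '' }
        $reader = New-Object System.IO.StreamReader($entry.Open())
        try { $xml = $reader.ReadToEnd() } finally { $reader.Dispose() }
    }
    finally { $zip.Dispose() }

    # Removing the tags also rejoins text that Word split across several runs.
    return ($xml -replace '<[^>]+>', '')
}

# Hypothetical helper: scans a folder and returns the match with the highest number.
function Get-HighestDocCode {
    param([Parameter(Mandatory)][string]$Folder)

    $found = foreach ($file in Get-ChildItem -Path $Folder -Filter *.docx -Recurse -File) {
        $text = Get-DocxText -Path $file.FullName
        foreach ($m in [regex]::Matches($text, '[A-Z]{3}\s*-\s*(\d{4})')) {
            [pscustomobject]@{
                File   = $file.Name
                Code   = $m.Value
                Number = [int]$m.Groups[1].Value
            }
        }
    }
    $found | Sort-Object Number -Descending | Select-Object -First 1
}
```

With the folder from the question, this could be called as `Get-HighestDocCode -Folder 'C:\Users\WP\Desktop\SearchFiles'`. Because Word is never started, there are no leftover `WINWORD.EXE` processes to clean up.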
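Answer 1 (score: 0)
The way to get the highest number is to first isolate it with a regular expression, then sort in descending order and select the first item. Like this: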
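# Run inside the question's loop, where $objDoc is the open document.
# Search the document text; passing the Selection COM object itself would not search its contents.
[regex]::Matches($objDoc.Content.Text, '(?<=[A-Z]{3}\s*-\s*)\d{4}') |
    Select-Object -ExpandProperty Value |
    Sort-Object -Descending |
    Select-Object -First 1 |
    Add-Content outfile.txt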
I think the problem with your regex is that the codes in your sample data have spaces around the dash, which your pattern does not allow.
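To illustrate the difference, here is a small sketch that runs both patterns against the three codes from the question. The variable names are only illustrative, and `-cmatch` is used so that the comparison is case-sensitive.

```powershell
# The three identifier codes from the question's sample files.
$samples  = 'CPW - 2345', 'CPW - 8723', 'CPA - 9083'

$strict   = '[A-Z]{3}-[0-9]{4}'            # the question's pattern: no spaces allowed
$tolerant = '[A-Z]{3}\s*-\s*([0-9]{4})'    # optional whitespace around the dash

$strictHits   = @($samples | Where-Object { $_ -cmatch $strict })
$tolerantHits = @($samples | Where-Object { $_ -cmatch $tolerant })
"Strict pattern matched $($strictHits.Count) of $($samples.Count) codes"
"Tolerant pattern matched $($tolerantHits.Count) of $($samples.Count) codes"

# Collect every code with its numeric part, then keep the largest.
$codes = foreach ($s in $samples) {
    foreach ($m in [regex]::Matches($s, $tolerant)) {
        [pscustomobject]@{ Code = $m.Value; Number = [int]$m.Groups[1].Value }
    }
}
$highest = $codes | Sort-Object Number -Descending | Select-Object -First 1
$highest.Code
```

The strict pattern matches none of the sample codes, while the tolerant one matches all three and the last line prints `CPA - 9083`. Casting the captured digits to `[int]` keeps the sort numeric even if the number of digits were ever to vary.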