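Please see this previous question of mine. I'm trying to achieve something similar, but this time with more advanced criteria.
In short, I need to add a new node (XML tag) directly beneath each <NETTOTAL> element. The text content of the new node is an 8-digit number extracted from the same XML file. These numbers are extracted and stored in an array for later processing, as you will see in the script below.
The existing script runs, but I suspect the loop logic is wrong. I need it to place one XML tag under each <NETTOTAL>, containing the 8-digit number that belongs to that record, rather than picking one number, looping, and placing the same tag everywhere.
Input XML: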
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<EXPORT>
<IMPORTMODEL>NEX</IMPORTMODEL>
<SESSION>1000061</SESSION>
<CUSTORDERS>
<RECORD CODE="NX0100103">
<VATMODE>X</VATMODE>
<INPUTDATE>26/07/2017</INPUTDATE>
<NETTOTAL>97.40</NETTOTAL>
<DOCLINES>
<LINE>
<LINETYPE>M</LINETYPE>
<ITEMDESC>Salesperson: firstName1 lastName1 (43700006)</ITEMDESC>
</LINE>
</DOCLINES>
</RECORD>
<RECORD CODE="NX0100104">
<VATMODE>X</VATMODE>
<INPUTDATE>26/07/2017</INPUTDATE>
<NETTOTAL>38.20</NETTOTAL>
<DOCLINES>
<LINE>
<LINETYPE>M</LINETYPE>
<ITEMDESC>Salesperson: firstName2 lastName2 (43100015)</ITEMDESC>
</LINE>
</DOCLINES>
</RECORD>
<RECORD CODE="NX0100105">
<VATMODE>X</VATMODE>
<INPUTDATE>26/07/2017</INPUTDATE>
<NETTOTAL>63.00</NETTOTAL>
<DOCLINES>
<LINE>
<LINETYPE>M</LINETYPE>
<ITEMDESC>Salesperson: firstName3 lastName3 (43100014)</ITEMDESC>
</LINE>
</DOCLINES>
</RECORD>
<RECORD CODE="NX0100106">
<VATMODE>X</VATMODE>
<INPUTDATE>26/07/2017</INPUTDATE>
<NETTOTAL>55.00</NETTOTAL>
<DOCLINES>
<LINE>
<LINETYPE>M</LINETYPE>
<ITEMDESC>Salesperson: firstName2 lastName2 (43100015)</ITEMDESC>
</LINE>
</DOCLINES>
</RECORD>
</CUSTORDERS>
</EXPORT>
Desired output:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<EXPORT>
<IMPORTMODEL>NEX</IMPORTMODEL>
<SESSION>1000061</SESSION>
<CUSTORDERS>
<RECORD CODE="NX0100103">
<VATMODE>X</VATMODE>
<INPUTDATE>26/07/2017</INPUTDATE>
<NETTOTAL>97.40</NETTOTAL>
<SALESMAN>43700006</SALESMAN>
<DOCLINES>
<LINE>
<LINETYPE>M</LINETYPE>
<ITEMDESC>Salesperson: firstName1 lastName1 (43700006)</ITEMDESC>
</LINE>
</DOCLINES>
</RECORD>
<RECORD CODE="NX0100104">
<VATMODE>X</VATMODE>
<INPUTDATE>26/07/2017</INPUTDATE>
<NETTOTAL>38.20</NETTOTAL>
<SALESMAN>43100015</SALESMAN>
<DOCLINES>
<LINE>
<LINETYPE>M</LINETYPE>
<ITEMDESC>Salesperson: firstName2 lastName2 (43100015)</ITEMDESC>
</LINE>
</DOCLINES>
</RECORD>
<RECORD CODE="NX0100105">
<VATMODE>X</VATMODE>
<INPUTDATE>26/07/2017</INPUTDATE>
<NETTOTAL>63.00</NETTOTAL>
<SALESMAN>43100014</SALESMAN>
<DOCLINES>
<LINE>
<LINETYPE>M</LINETYPE>
<ITEMDESC>Salesperson: firstName3 lastName3 (43100014)</ITEMDESC>
</LINE>
</DOCLINES>
</RECORD>
<RECORD CODE="NX0100106">
<VATMODE>X</VATMODE>
<INPUTDATE>26/07/2017</INPUTDATE>
<NETTOTAL>55.00</NETTOTAL>
<SALESMAN>43100015</SALESMAN>
<DOCLINES>
<LINE>
<LINETYPE>M</LINETYPE>
<ITEMDESC>Salesperson: firstName2 lastName2 (43100015)</ITEMDESC>
</LINE>
</DOCLINES>
</RECORD>
</CUSTORDERS>
</EXPORT>
Current script:
$xmlFilesLocation = "C:\XML_dumping"
cd $xmlFilesLocation
$netTotalRegEx = "(<NETTOTAL>\d{1,30}\.\d{1,2}<\/NETTOTAL>)"
$salesManRegEx = "(<SALESMAN>\d{8}<\/SALESMAN>)"
$beginTag = "`t`t`t<SALESMAN>"
$endTag = "</SALESMAN>"
$files = Get-ChildItem -Path $xmlFilesLocation -Filter *.xml
$numberOfFiles = (Get-ChildItem -Path $xmlFilesLocation -Filter *.xml | Measure-Object).Count

# First, loop through all files separately to check if <SALESMAN>[code]</SALESMAN> exists, and skip if true
for ($i=1; $i -le $numberOfFiles; $i++) {
    $content = (Get-Content $files[$i - 1] -Raw)
    # Skip file if <SALESMAN>[code]</SALESMAN> is detected in it
    if ($content -match $salesManRegEx) { break }
}

# Then, loop through all files (again) separately to check if <SALESMAN>[code]</SALESMAN> is missing, and process if true
for ($j=1; $j -le $numberOfFiles; $j++) {
    $content = (Get-Content $files[$j - 1] -Raw)
    # If <SALESMAN>[code]</SALESMAN> is missing in the file
    if ($content -notmatch $salesManRegEx) {
        $contentArray = @()
        # Hold all the content, but split from the brackets
        $contentArray = $content
        $contentArray = $contentArray.Split("()")
        # Now split by line to extract the salesman codes into an array.
        # Example: [43700006, 43100015, 43100014, 43100015]
        $contentArray = $contentArray.Split("")
        for ($k=1; $k -le $contentArray.Length; $k++) {
            # if the salesman code is found...
            if ($contentArray[$k] -match "^\d{8}$") {
                if ($content -notmatch $salesManRegEx) {
                    # Construct the full tag
                    $fullSalesManTag = $beginTag + $contentArray[$k] + $endTag
                    # ...then replace in $content the regular expression with $fullSalesManTag and insert it directly underneath NETTOTAL line
                    $content = [regex]::Replace($content, $netTotalRegEx, ('$1' + "`n" + "$fullSalesManTag"))
                    $content | Out-File -Encoding UTF8 $files[$j - 1]
                }
            }
        }
    }
}
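The actual output shows that only one element of the array (43700006 in this run) is added, and it is placed under every <NETTOTAL> instead of each record getting its own code. I understand why this happens, but I can't work out how to correct the logic.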
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<EXPORT>
<IMPORTMODEL>NEX</IMPORTMODEL>
<SESSION>1000061</SESSION>
<CUSTORDERS>
<RECORD CODE="NX0100103">
<VATMODE>X</VATMODE>
<INPUTDATE>26/07/2017</INPUTDATE>
<NETTOTAL>97.40</NETTOTAL>
<SALESMAN>43700006</SALESMAN>
<DOCLINES>
<LINE>
<LINETYPE>M</LINETYPE>
<ITEMDESC>Salesperson: firstName1 lastName1 (43700006)</ITEMDESC>
</LINE>
</DOCLINES>
</RECORD>
<RECORD CODE="NX0100104">
<VATMODE>X</VATMODE>
<INPUTDATE>26/07/2017</INPUTDATE>
<NETTOTAL>38.20</NETTOTAL>
<SALESMAN>43700006</SALESMAN>
<DOCLINES>
<LINE>
<LINETYPE>M</LINETYPE>
<ITEMDESC>Salesperson: firstName2 lastName2 (43100015)</ITEMDESC>
</LINE>
</DOCLINES>
</RECORD>
<RECORD CODE="NX0100105">
<VATMODE>X</VATMODE>
<INPUTDATE>26/07/2017</INPUTDATE>
<NETTOTAL>63.00</NETTOTAL>
<SALESMAN>43700006</SALESMAN>
<DOCLINES>
<LINE>
<LINETYPE>M</LINETYPE>
<ITEMDESC>Salesperson: firstName3 lastName3 (43100014)</ITEMDESC>
</LINE>
</DOCLINES>
</RECORD>
<RECORD CODE="NX0100106">
<VATMODE>X</VATMODE>
<INPUTDATE>26/07/2017</INPUTDATE>
<NETTOTAL>55.00</NETTOTAL>
<SALESMAN>43700006</SALESMAN>
<DOCLINES>
<LINE>
<LINETYPE>M</LINETYPE>
<ITEMDESC>Salesperson: firstName2 lastName2 (43100015)</ITEMDESC>
</LINE>
</DOCLINES>
</RECORD>
</CUSTORDERS>
</EXPORT>
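Answer 0 (score: 4)
Do not parse XML with regex. Every time you do, a rainbow unicorn dies.
But seriously, in most cases regular expressions are the wrong tool for working with XML files. If you're interested, the answers to this question (thanks to kjhughes for the link) go into detail about the problems with the regex approach.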
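As a concrete illustration of what goes wrong in the script from the question: a single call to [regex]::Replace rewrites every match of the pattern, so whichever code is processed first lands under every <NETTOTAL>. The following is only a minimal sketch; the two-record string in $sample is made up for the demonstration, and the pattern is the one from the question.

```powershell
# Hypothetical two-record sample, reduced to the relevant tags.
$sample = @'
<RECORD><NETTOTAL>97.40</NETTOTAL><ITEMDESC>A (43700006)</ITEMDESC></RECORD>
<RECORD><NETTOTAL>38.20</NETTOTAL><ITEMDESC>B (43100015)</ITEMDESC></RECORD>
'@

$netTotalRegEx = '(<NETTOTAL>\d{1,30}\.\d{1,2}</NETTOTAL>)'

# One call replaces ALL matches, not just the record the code came from.
$result = [regex]::Replace($sample, $netTotalRegEx, '$1<SALESMAN>43700006</SALESMAN>')

# Count how often the same code was inserted: once per NETTOTAL, i.e. twice here.
$count = ([regex]::Matches($result, '<SALESMAN>43700006</SALESMAN>')).Count
$count
```

The regex has no notion of which <RECORD> a match belongs to, which is exactly what a parser gives you for free.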
Use a proper XML parser and a couple of XPath expressions to extract the salesperson ID and add it as a new node:
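$xmlfile = 'C:\path\to\your.xml'
[xml]$xml = Get-Content $xmlfile

$xml.SelectNodes('//RECORD') | ForEach-Object {
    # Extract the digits between the brackets in the record's ITEMDESC text
    $id = $_.SelectSingleNode('.//ITEMDESC').'#text' -replace '.*\((\d+)\).*', '$1'
    # The new node goes directly after NETTOTAL
    $sibling = $_.SelectSingleNode('./NETTOTAL')
    $node = $xml.CreateElement('SALESMAN')
    $node.InnerText = $id
    # Discard the return value so the inserted node is not echoed to the console
    [void]$_.InsertAfter($node, $sibling)
}

$xml.Save($xmlfile)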