PowerShell: advanced insert into XML files only if the VATMODE tag is missing

Date: 2017-08-11 16:38:53

Tags: xml powershell

Complete solution

# Description: Adds <VATMODE>X</VATMODE> XML tags to files arriving from server, underneath each RECORD CODE line.
# Script tested and works using:
#   - Powershell v5.1 on Windows 10 Pro
#   - Powershell v4.0 on Windows Server 2008 R2.
#   - Does NOT work on Powershell v2.0

# References
# My own question: https://stackoverflow.com/questions/45639945/powershell-advanced-insert-into-xml-files-only-if-vatmode-tag-is-missing
# https://stackoverflow.com/questions/31678072/insert-content-into-specific-place-in-text-file-in-powershell
# https://stackoverflow.com/questions/1875617/insert-content-into-text-file-in-powershell
# https://social.technet.microsoft.com/wiki/contents/articles/4310.powershell-working-with-regular-expressions-regex.aspx
# http://blog.danskingdom.com/fix-problem-where-windows-powershell-cannot-run-script-whose-path-contains-spaces/
# https://community.spiceworks.com/topic/857690-automatically-and-silently-bypass-execution-policy-for-a-powershell-script
# http://leelusoft.blogspot.com.ng/p/watch-4-folder-25.html
# References

# Assign the directory where the XML files arrive from the server
$xmlFilesLocation = "C:\XML_dumping\"

# Change directory. Without this, the script will run in the same directory that the script is located at, and that's wrong
cd $xmlFilesLocation

# Show the directory so we can easily look at what's going on. Comment this out if it becomes annoying.
Invoke-Item $xmlFilesLocation

# Regular expression to match RECORD CODE lines
$regEx = "(\W\w{6}\s\w{4}\W.+)"

# A String variable which contains the VATMODE XML tag
$vatModeExists = "<VATMODE>X</VATMODE>"

# Assign the VATMODE tag, preceding it with three tabs for proper indentation
$vatModeTag = "`t`t`t<VATMODE>X</VATMODE>"

# Get all XML files in the directory
$files = Get-ChildItem -Path $xmlFilesLocation -Filter *.xml

# Loop through the files once; files that already contain <VATMODE>X</VATMODE> are skipped
foreach ($file in $files) {

    # Read the whole file as a single string (-Raw needs PowerShell v3.0 or later)
    $content = Get-Content -Path $file.FullName -Raw

    # If <VATMODE>X</VATMODE> is missing in the file...
    if ($content -notmatch $vatModeExists) {

        # ...then replace each RECORD CODE match with itself ($1) followed by $vatModeTag on a new line,
        # which inserts the tag directly underneath the RECORD CODE line
        $content = [regex]::Replace($content, $regEx, ('$1' + "`n" + $vatModeTag))

        # Save the file that now has the new $vatModeTag
        $content | Out-File -Encoding utf8 $file.FullName
    }
}
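
Because the files are XML, the same insertion can also be sketched with the .NET XML DOM instead of a regular expression. This is only an illustrative alternative and not part of the script above: the function name Add-VatMode is hypothetical, and the sketch assumes that every RECORD element should receive VATMODE as its first child. It does not try to reproduce the three-tab indentation.

```powershell
# Hypothetical helper (not part of the original script): adds <VATMODE>X</VATMODE>
# as the first child of every RECORD element that does not already have one.
function Add-VatMode {
    param([Parameter(Mandatory)][string]$XmlText)

    $doc = New-Object System.Xml.XmlDocument
    $doc.PreserveWhitespace = $true   # keep whatever whitespace the input already has
    $doc.LoadXml($XmlText)

    # Copy the matches into an array first so the document is not modified while it is being enumerated
    $records = @($doc.SelectNodes('//RECORD'))

    foreach ($record in $records) {
        # Skip records that already carry a VATMODE child
        if ($null -eq $record.SelectSingleNode('VATMODE')) {
            $vat = $doc.CreateElement('VATMODE')
            $vat.InnerText = 'X'
            [void]$record.PrependChild($vat)
        }
    }
    return $doc.OuterXml
}

# Example on an in-memory sample modelled on the files in the question
$sample = '<EXPORT><CUSTORDERS><RECORD CODE="NX0100096"><INPUTDATE>19/07/2017</INPUTDATE></RECORD></CUSTORDERS></EXPORT>'
$result = Add-VatMode -XmlText $sample
$result
```

Running the function a second time on its own output leaves the text unchanged, which mirrors the "only if VATMODE is missing" requirement at the level of individual records rather than whole files.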

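Problem statement

I am trying to achieve something similar to this question, but with added complexity. These are XML files that arrive daily from a server and are dumped into a single folder for import into an accounting system. The accounting system will not import a file unless every RECORD CODE parent has a <VATMODE>X</VATMODE> child underneath it. The XML files can arrive in two ways: one by one or in batches. They have different names, with sequentially increasing numbers and different prefixes, for example NX1000060.xml, NX1000061.xml, ABN000028.xml, and so on.

PowerShell script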

# Regex to match RECORD CODE lines
$regEx = "\W\w{6}\s\w{4}\W.+"

#Regex to match exactly <VATMODE>X</VATMODE>
$vatModeExists = "\W\w{7}.\w\W{2}\w{7}."

# Assign the VATMODE tag, preceding it with three tabs for proper indentation
$vatModeTag = "`t`t`t<VATMODE>X</VATMODE>"

# Get all XML files in the directory
$files = Get-ChildItem -Path "C:\XML_dumping" -Filter *.xml

# Get the number of XML files in the directory
$numberOfFiles = (Get-ChildItem -Path "C:\XML_dumping" -Filter *.xml | Measure-Object).Count

for ($i=1; $i -lt $numberOfFiles; $i++) { # Loop through each file separately
    $content = (Get-Content $files[$i - 1]) # Scan the contents of each file
    if ($content -match $vatModeExists) { # If <VATMODE>X</VATMODE> is detected in the file...
        break # ...then do not process the file (skip it)
    }

    # Get the matched RECORD CODE lines
    $found = $content -match $regEx
    for ($j=0; $j -lt $found.Length; $j++ ) { # Loop through each matched RECORD CODE line
        echo  $found[$j] $vatModeTag # Insert <VATMODE>X</VATMODE> right under RECORD CODE line
        # save the files that now have VATMODE inserted into them, but how?
    }
}

The script above is supposed to append the VATMODE tag under each RECORD CODE line, as shown in the output below.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<EXPORT>
    <IMPORTMODEL>NEX</IMPORTMODEL>
    <SESSION>1000060</SESSION>
    <CUSTORDERS>
        <RECORD CODE="NX0100096">
        <VATMODE>X</VATMODE>
        <INPUTDATE>19/07/2017</INPUTDATE>
        <!--...and so on...-->

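In PowerShell ISE the script runs fine (for my visual inspection), but how do I actually insert VATMODE and save the files I have added VATMODE to?

Pseudocode

  1. Assign the regular expressions
  2. Assign the VATMODE tag
  3. Get the list of files
  4. Get the number of files
  5. Get the contents of each file separately
  6. Check whether VATMODE already exists and, if so, skip the file
  7. Otherwise append VATMODE
  8. Save the files that got the new VATMODE

2 answers:

Answer 0: (score: 1)

I used [regex]::replace and it works for me. The parentheses in the regular expression capture the matched value so that it can be retrieved as $1. I also replaced -lt with -le in the for loop of your code, so that the last file is processed too.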

# Regex to match RECORD CODE lines
$regEx = "\W\w{6}\s\w{4}\W.+"
$regExParen = "(\W\w{6}\s\w{4}\W.+)"
#Regex to match exactly <VATMODE>X</VATMODE>
$vatModeExists = "\W\w{7}.\w\W{2}\w{7}."

# Assign the VATMODE tag, preceding it with three tabs for proper indentation
$vatModeTag = "`t`t`t<VATMODE>X</VATMODE>"

# Get all XML files in the directory
$files = Get-ChildItem -Path "C:\Users\user1\Documents\XML_dumping" -Filter *.xml

# Get the number of XML files in the directory
$numberOfFiles = (Get-ChildItem -Path "C:\Users\user1\Documents\XML_dumping" -Filter *.xml | Measure-Object).Count
for ($i=1; $i -le $numberOfFiles; $i++) { # Loop through each file separately
    $path = $files[$i - 1].FullName
    $content = (Get-Content $path -Raw) # Read the whole file as a single string

    if ($content -match $vatModeExists) {
        # If <VATMODE>X</VATMODE> is detected in the file...
        echo "<VATMODE>X</VATMODE>"
        continue # ...then do not process the file (skip it and move on to the next one)
    }

    # Replace each RECORD CODE match with itself ($1) followed by the VATMODE tag on a new line,
    # which inserts <VATMODE>X</VATMODE> right under the RECORD CODE line
    $content = [regex]::Replace($content, $regExParen, ('$1' + "`r`n" + $vatModeTag))
    echo $content

    # Save the file that now has VATMODE inserted into it
    $content | Out-File -Encoding utf8 $path
}
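
The capture-group substitution this answer relies on can be seen in isolation on a single line. This is a minimal sketch: the sample line is modelled on the XML in the question, and the variable names $line, $pattern and $replaced are only illustrative.

```powershell
# A single sample line modelled on the question's XML
$line = "`t`t<RECORD CODE=`"NX0100096`">"

# The parentheses capture the whole RECORD CODE match as group 1
$pattern = '(\W\w{6}\s\w{4}\W.+)'

# '$1' is single-quoted so PowerShell does not expand it as a variable;
# the .NET regex engine then replaces it with the captured text.
$replaced = [regex]::Replace($line, $pattern, '$1' + "`r`n" + "`t`t`t<VATMODE>X</VATMODE>")
$replaced
```

Writing "$1" in double quotes instead would let PowerShell expand it as a (most likely empty) variable before the regex engine ever sees it, so the matched line would be lost from the output.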

Answer 1: (score: 0)

So I tried to come up with something, but it is a bit different from what you asked for.

# Assign the wildcard for the XML files dumped by the server (resolved against the current directory)
$fileName = "*.xml"

# Assign the regular expression patterns
$regEx = "\W\w{6}\s\w{4}\W.+"
$vatModeExists = "\W\w{7}.\w\W{2}\w{7}."

# Assign the VATMODE tag, preceding it with a line break and three tabs for proper indentation
$vatModeTag = "`n`t`t`t<VATMODE>X</VATMODE>"

# Collect the processed lines of every file that does not contain VATMODE yet
$output = @()
foreach ($file in Get-ChildItem -Filter $fileName) {
    $lines = Get-Content $file.FullName
    if (-not ($lines -match $vatModeExists)) { # skip files that already contain VATMODE
        foreach ($line in $lines) {
            if ($line -match $regEx) { # if RECORD CODE line is found
                $line += $vatModeTag # append VATMODE after each RECORD CODE line
            }
            $output += $line
        }
    }
}
# Note: all processed files end up combined in this single output file
Set-Content -Path SomeFile.xml -Value $output