My PowerShell script's output contains strange characters (encoding error)

Time: 2019-12-19 15:12:15

Tags: xml powershell encoding

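I wrote a script that reads XML files, edits a few specific nodes, and then writes the files back.

The problem is that the output file has extra characters added to nodes I did not edit.

I think this is an encoding problem.

The relevant code from my script is: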

function getAssigneeID ($assigneeName) {
    #param($assigneeName)
    #write($assigneeName)
    $assigneeID = $nameIDHash[$assigneeName]
    if ($null -eq $assigneeID -or $assigneeID -eq "") {
        return 'Not Found'
    } else {
        return $assigneeID
    }
}
function ValidateAssigneeField ($assignee, $fileContent, $fileURI, $assignees) {
    If (($assignee.InnerText.Length -le 2) -or ($assignee.InnerText.Length -ne 6) -or ($assignee.InnerText[1] -ne 'Z')) {
        write("`tAssignee " + $assignee.InnerText + " is invalid.") >> $Output_Log_File
        #find assignee's ID in nameIDHash 
        $assigneeID = getAssigneeID -assigneeName $assignee.InnerText
        if ($assigneeID -eq 'Not Found' -or $assigneeID -eq $null){
            write("`t`tThe ID for the invalid user " + $assignee.InnerText + " is Not Found.") >> $Output_Log_File
        } else {
            #if the assigneeID is in the list of assignees, remove the name, otherwise replace the name with the ID and save the file.
            write("`t`tThe ID for the invalid user " + $assignee.InnerText + " is " + $assigneeID) >> $Output_Log_File
            $assigneeIdAlreadyInList = $false
            foreach ($user in $assignees){
                #write("user = " + $user.InnerText + ", ID = " + $assigneeID)
                If ($user.InnerText -eq $assigneeID){
                    $assigneeIdAlreadyInList = $true
                } else {
                }
            }
            #write ($assigneeIdAlreadyInList)
            if ($assigneeIdAlreadyInList){
                write("`t`t" + $assigneeID + " already exists in the assignee list, removing " + $assignee.InnerText) >> $Output_Log_File
                [void]$assignee.ParentNode.RemoveChild($assignee)
            } else {
                write("`t`tReplacing " + $assignee.InnerText + " with " + $assigneeID + ".") >> $Output_Log_File
                $assignee.InnerText = $assigneeID
            }
            write("`t`tSaving the file " + $fileURI + ".") >> $Output_Log_File
            #$fileContent.save($fileURI)
            #$file | out-file -Encoding "UTF8" -FilePath $fileURI
            #$MyXML | out-file -Encoding "UTF8" -FilePath $fileURI
            #$fileContent | out-file -Encoding "UTF8" -FilePath $fileURI
        }
    } else {
        write("`tAssignee " + $assignee.InnerText + " is OK.") >> $Output_Log_File
    }
}
$workitemBasePath = "C:\temp\dev\workitems\Dev_ECH\"
$Output_Log_File = "C:\ALM\Reports\Dev_ECH - Correct All Assignees.txt"
$NameIDHash = @{
"Jade West" = "zzzzzz"
"Tonya Killebrew" = "AZCJNZ"}
$today = Get-Date -format s
write($today + " - Running Correct invalid Assignees.ps1.") > $Output_Log_File
$files = Get-ChildItem -Path $workitemBasePath -include workitem.xml -Recurse | % { $_.FullName }
foreach ($file in $files){
    write("Evaluating " + $file) >> $Output_Log_File
    [xml]$MyXML = Get-Content $file
    $assigneeList = $MyXML.SelectNodes('//work-item/field[@id="assignee"]/list/item')
    if ($assigneeList.count -eq 0) {
        $assigneeList  = $MyXML.SelectNodes('//work-item/field[@id="assignee"]')
    }
    foreach ($assignee in $assigneeList) {
        ValidateAssigneeField -assignee $assignee -fileContent $MyXML -fileURI $file -assignees $assigneeList
    }

}

Then, in ValidateAssigneeField, some edits are made to the assignee node and the file is saved using

    $fileContent.save($fileURI)

In the output XML file, I see extra characters like the following added to some text fields.

  <field id="description" text-type="text/plain">​Navistar has reported that the transmission remains in Drive when the operator selects a fast sequence from Drive to Reverse to Manual mode. When selecting a similar sequence from Reverse to Drive to Manual mode, the transmission drive as expected.</field>

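– and Â are added at seemingly random places.

I assume I need to find out the encoding of the original XML and then write the edited XML in the same encoding.

How do I change the output encoding of the $fileContent.save($fileURI) command?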

<?xml version="1.0" encoding="UTF-8"?>
<work-item>
    <field id="assignee">Jade West</field>
    <field id="author">RZPRRK</field>
    <field id="created">2019-08-08 10:41:39.163 -0400</field>
    <field id="description" text-type="text/html">Tst</field>
    <field id="dueDate">2019-08-05</field>
    <field id="nextReviewDate" type="date">2019-08-15</field>
    <field id="osNumber" type="string">23457</field>
    <field id="osOpenDate" type="date">2019-07-30</field>
    <field id="previousStatus">toBeScreened</field>
    <field id="priority">2.0</field>
    <field id="rational" text-type="text/html" type="text/html">Test</field>
    <field id="release" type="enum:release">na</field>
    <field id="resolution">duplicate</field>
    <field id="resolvedOn">2019-08-08 10:42:22.987 -0400</field>
    <field id="severity">normal</field>
    <field id="status">inProcess</field>
    <field id="title">HWCR - Reject</field>
    <field id="type">hardwareChangeRequest</field>
</work-item>

<?xml version="1.0" encoding="UTF-8"?>
<work-item>
  <field id="assignee">
    <list>
      <item>XZM030</item>
    </list>
  </field>
  <field id="author">XZM030</field>
  <field id="automatedTestAffected" type="enum:productDocumentAffected">notRequired</field>
  <field id="created">2019-06-06 13:59:27.726 -0400</field>
  <field id="customerImpact" type="enum:productGenricYesNo">yes</field>
  <field id="customerImpactNotes" text-type="text/plain" type="text/html">See description</field>
  <field id="cyberSecurityAffected" type="enum:productGenricYesNo">no</field>
  <field id="datalinkTechData" type="enum:productDocumentAffected">notRequired</field>
  <field id="description" text-type="text/plain">​Navistar has reported that the transmission remains in Drive when the operator selects a fast sequence from Drive to Reverse to Manual mode. When selecting a similar sequence from Reverse to Drive to Manual mode, the transmission drive as expected.</field>
  <field id="designReviewComments" text-type="text/plain" type="text/html">​3/4/19-accepted with addtions to test plan</field>
  <field id="designReviewRequired" type="enum:productDocumentCompleted">completed</field>
  <field id="designedDate" type="date">2019-03-26</field>
  <field id="diagAffected" type="enum:productDiagAffected">no</field>
  <field id="fmeaRequired" type="enum:productDocumentAffected">notRequired</field>
  <field id="functionalSafetyAffected" type="enum:productGenricYesNo">no</field>
  <field id="linkedWorkItems">
    <list>
      <struct>
        <item id="role">affected_by</item>
        <item id="workItem">COMM-47223</item>
      </struct>
    </list>
  </field>
  <field id="priority">4.0</field>
  <field id="release" type="enum:release">na</field>
  <field id="requirementsAffected" type="enum:productDocumentAffected">notRequired</field>
  <field id="rootCauseDescription" text-type="text/plain" type="text/html">​The TCM logic that controls express preselect for the hold postion looks at if forward is attained but not the currently selected postion.   Therefore with a quick transistion from D-R-H the transmission does not have time to actually make a shift to Reverse and there for the forward attined is still true when hold is recieved. </field>
  <field id="screenedDate" type="date">2019-03-26</field>
  <field id="serviceImpact" type="enum:productGenricYesNo">yes</field>
  <field id="serviceImpactNotes" text-type="text/plain" type="text/html">affects OEMs using the non-ATI standard selector interface only.  OEMs using the non- ATI basic selector interface are not effected.</field>
  <field id="severity">normal</field>
  <field id="sharePointID" type="string">2884</field>
  <field id="simToolAffected" type="enum:productDocumentAffected">notRequired</field>
  <field id="softwareCRIsRequired" type="boolean">true</field>
  <field id="solutionDescription" text-type="text/plain" type="text/html">​The TCM logic that controls express preselect for the hold postion needs to look at the selected position and if forward is attined.</field>
  <field id="status">na</field>
  <field id="synergyCRNumber" type="string">,10516,</field>
  <field id="syscrType" type="string">Incident</field>
  <field id="techData(Regular)" type="enum:productDocumentAffected">notRequired</field>
  <field id="tempStatus" type="string">n/a</field>
  <field id="testPlanAffected" type="enum:productDocumentAffected">notRequired</field>
  <field id="testRunWhereValidated" type="string">BCD 191 PC</field>
  <field id="title">Other: OEM Standard Shift Selector D-to-R-to-Manual Transition Complaint</field>
  <field id="type">other</field>
  <field id="typeForDependencyOnly" type="enum:otherProductDependencyType">other</field>
  <field id="vepsqaAffected" type="enum:vepsAffected">no</field>
</work-item>
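
Characters such as Â typically appear when UTF-8 bytes are decoded with a single-byte code page. The following is a minimal sketch of that effect, not a diagnosis of the files above. It assumes the text is UTF-8 and is being misread as Windows-1252, and it assumes the fields contain a zero-width space (U+200B) or a no-break space (U+00A0). The variable names are made up for illustration.

```powershell
# Assumption: UTF-8 bytes are decoded with the Windows-1252 code page.
$utf8   = New-Object System.Text.UTF8Encoding($false)
$cp1252 = [System.Text.Encoding]::GetEncoding(1252)

$zeroWidthSpace = [string][char]0x200B   # UTF-8 bytes: E2 80 8B
$noBreakSpace   = [string][char]0x00A0   # UTF-8 bytes: C2 A0

# Encode correctly, then decode with the wrong encoding.
$misreadZws  = $cp1252.GetString($utf8.GetBytes($zeroWidthSpace))
$misreadNbsp = $cp1252.GetString($utf8.GetBytes($noBreakSpace))

# List the resulting code points instead of relying on console rendering.
$zwsCodePoints  = $misreadZws.ToCharArray()  | ForEach-Object { 'U+{0:X4}' -f [int]$_ }
$nbspCodePoints = $misreadNbsp.ToCharArray() | ForEach-Object { 'U+{0:X4}' -f [int]$_ }

"Zero-width space becomes: $($zwsCodePoints -join ' ')"    # U+00E2 U+20AC U+2039
"No-break space becomes:   $($nbspCodePoints -join ' ')"   # U+00C2 U+00A0
```

One invisible character turns into two or three visible ones. Once the misread text is saved back as UTF-8, the damage is permanent in the file.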

2 Answers:

Answer 0: (score: 0)

Without the input file to test against, try the following change:

[xml]$MyXML = Get-Content $file -Raw
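
Note that -Raw only controls whether the text is returned as one string. It does not change how the bytes are decoded. In Windows PowerShell 5.1, Get-Content decodes a file without a BOM using the system's ANSI code page unless -Encoding is given. An alternative is to let XmlDocument.Load() read the file itself, so the XML parser applies the encoding named in the XML declaration. Below is a sketch of that approach; the temp file path and the sample content are assumptions for illustration only.

```powershell
# Hypothetical sample file: UTF-8 without a BOM, containing a no-break space (U+00A0).
$path   = Join-Path ([System.IO.Path]::GetTempPath()) 'workitem-load-sample.xml'
$nbsp   = [char]0x00A0
$sample = '<?xml version="1.0" encoding="UTF-8"?><work-item><field id="title">non-{0}ATI</field></work-item>' -f $nbsp
[System.IO.File]::WriteAllText($path, $sample, (New-Object System.Text.UTF8Encoding($false)))

# Let the XML parser decode the bytes instead of Get-Content.
$doc = New-Object System.Xml.XmlDocument
$doc.Load($path)

$title = $doc.SelectSingleNode('//field[@id="title"]').InnerText

# Save() uses the encoding from the XML declaration (UTF-8 here).
$doc.Save($path)

# Reload to confirm that the text survived the round trip.
$reloaded = New-Object System.Xml.XmlDocument
$reloaded.Load($path)
$titleAfterSave = $reloaded.SelectSingleNode('//field[@id="title"]').InnerText
```

In the question's loop this would amount to replacing the `[xml]$MyXML = Get-Content $file` line with the two lines that create the document and call Load($file). If Get-Content is kept, passing -Encoding UTF8 alongside -Raw should have a similar effect for UTF-8 input.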

Edit: You can also write the output with

$file | Out-File -Encoding "UTF8"

Edit 2:

What about

$newfile = ValidateAssigneeField -assignee $assignee -fileContent $MyXML -fileURI $file -assignees $assigneeList 

$newfile | Out-File -Encoding "UTF8" -FilePath "DESTINATION"
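
One caveat: piping an [xml] object to Out-File writes PowerShell's formatted view of the object, not the XML markup. If the goal is to control the encoding of the saved XML, the document can instead be saved through an XmlWriter whose settings name the encoding explicitly. A minimal sketch follows, using an in-memory sample document and a temp output path that are assumptions for illustration.

```powershell
# Hypothetical document standing in for the edited work item.
$doc = New-Object System.Xml.XmlDocument
$doc.LoadXml('<work-item><field id="assignee">AZCJNZ</field></work-item>')

# Name the output encoding explicitly: UTF-8 without a BOM.
$settings = New-Object System.Xml.XmlWriterSettings
$settings.Encoding = New-Object System.Text.UTF8Encoding($false)
$settings.Indent   = $true

# Use a full path: .NET's current directory can differ from PowerShell's location.
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'workitem-save-sample.xml'

$writer = [System.Xml.XmlWriter]::Create($outPath, $settings)
try {
    $doc.Save($writer)
}
finally {
    # Disposing the writer flushes and closes the file.
    $writer.Dispose()
}

$firstBytes = [System.IO.File]::ReadAllBytes($outPath)[0..2]
```

Passing `$true` to the UTF8Encoding constructor would emit a BOM instead. Which one is wanted depends on what the consuming tool expects.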

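Answer 1: (score: 0)

I recommend not using ">>" or "Out-File -Append". They can mix different encodings within the same file, particularly because in Windows PowerShell Out-File defaults to Unicode (UTF-16). "Add-Content" works better. Bug report: https://github.com/PowerShell/PowerShell/issues/9423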
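
A minimal sketch of logging this way is shown below. The log path and messages are placeholders, and passing -Encoding on every call is one way to keep the file consistent, not the only one.

```powershell
# Hypothetical log file; the question's script uses $Output_Log_File instead.
$logFile = Join-Path ([System.IO.Path]::GetTempPath()) 'assignee-log-sample.txt'

# Create (or overwrite) the log with an explicit encoding ...
Set-Content -Path $logFile -Value "$(Get-Date -Format s) - Running." -Encoding UTF8

# ... and append later lines with the same encoding.
Add-Content -Path $logFile -Value "`tAssignee Jade West is invalid." -Encoding UTF8
Add-Content -Path $logFile -Value "`t`tReplacing Jade West with zzzzzz." -Encoding UTF8

$logLines = @(Get-Content -Path $logFile -Encoding UTF8)
```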
