How to convert repeated XML nodes into a comma-separated string using PowerShell

Date: 2013-01-10 17:56:03

Tags: xml powershell nodes

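I have about 13,000 log files in XML format, and I need to convert them all into a single spreadsheet/CSV file.

As you'll see, I'm not a programmer, but I have had a go. I've written a PowerShell script that reads the first few nodes and builds a comma-separated string, but I'm stuck on the last node, which can contain anything from no entries to dozens of nodes.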

Sample XML file:

<?xml version="1.0" encoding="utf-8"?>
<MigrationUserStatus>
  <User>username@domain.com</User>
  <StoreList>
    <EmailMigrationStatus>
      <MigrationStatus value="Success" />
      <FolderList>
        <TotalCount value="6" />
        <SuccessCount value="3" />
        <FailCount value="3" />
        <FailedMessages>
          <ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
          <SentTime>1601-01-01T00:00:00.000Z</SentTime>
          <ReceiveTime>1601-01-01T00:00:00.000Z</ReceiveTime>
        </FailedMessages>
        <FailedMessages>
          <ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
          <SentTime>1601-01-01T00:00:00.000Z</SentTime>
          <ReceiveTime>1601-01-01T00:00:00.000Z</ReceiveTime>
        </FailedMessages>
        <FailedMessages>
          <MessageSubject>Hey</MessageSubject>
          <ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
          <SentTime>2013-01-07T02:51:17.000Z</SentTime>
          <ReceiveTime>2013-01-07T02:51:17.000Z</ReceiveTime>
          <MessageSize value="2881" />
        </FailedMessages>
        <StartTime>2013-01-07T01:52:46.000Z</StartTime>
        <EndTime>2013-01-07T04:41:59.000Z</EndTime>
      </FolderList>
      <StartTime>2013-01-07T01:52:43.000Z</StartTime>
      <EndTime>2013-01-07T04:41:59.000Z</EndTime>
    </EmailMigrationStatus>
    <StartTime>2013-01-07T01:52:43.000Z</StartTime>
    <EndTime>2013-01-07T04:41:59.000Z</EndTime>
  </StoreList>
</MigrationUserStatus>

With this code I can easily build the first part of the CSV row:

$folder = "C:\temp"
$outfile = [IO.File]::OpenWrite("alluserslogs.csv")
$csv = "User,Total Emails, Successful emails,Failed emails,Failures`r`n"

dir Status-*.log | foreach ( $_) {
[xml]$Status = Get-Content $_
$csvpt1 +=$Status.MigrationUserStatus.User + "," + $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.TotalCount.value + "," + $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.SuccessCount.value + "," + $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailCount.value

The next part is where I come unstuck. I want to read each FailedMessages node and build those into another comma-separated string:

foreach ($FMessage in $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailedMessages) {
$csvpt2 +=$FMessage + ","
}

Desired output:

GDSTATUS_BAD_REQUEST:Permanent failu... 1601-01-01T00:00:00.000Z                1601-01-01T00:00:00.000Z,GDSTATUS_BAD_REQUEST:Permanent failu... 1601-01-01T00:00:00.000Z                1601-01-01T00:00:00.000Z,.......

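I either get a blank in $FMessage, or the method call fails because of the trailing + ",", so I need to fix that.

Then I concatenate everything into one final string and write it to the file: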

$csv +=$csvpt1 + "," + $csvpt2
$outfile.WriteLine($csv)
}
$outfile.Close()

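As a wish-list extra, it would also be great to create CSV column headers for n failure columns, where n is the largest number of FailedMessages nodes found in any file.

Many thanks for your help.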

1 Answer:

Answer 0: (score: 1)

PowerShell has native support for XML; maybe this will help you get started?

It also has a native CSV exporter in Export-Csv :)

[xml]$XMLfile = gc C:\Temp\migration.xml

$MasterArray = @()
$MasterArray = "" | Select User, Result, TotalEmails, SuccessfulEmails, FailedEmails, Failures

$MasterArray.User = $XMLfile.MigrationUserStatus.user
$MasterArray.Result = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.MigrationStatus.value
$MasterArray.TotalEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.TotalCount.value
$MasterArray.SuccessfulEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.SuccessCount.value
$MasterArray.FailedEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailCount.value

$Failures = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailedMessages
$ConcatFailures = @()
foreach ($Failure in $Failures)
{
    # The element in the log is named ReceiveTime, not ReceivedTime
    $ConcatFailures += $Failure.ErrorMessage + "," + $Failure.SentTime + "," + $Failure.ReceiveTime
}

$MasterArray.Failures = $ConcatFailures -Join "|"
$MasterArray
$MasterArray | Export-Csv -NoType "C:\Temp\export.csv"

For the other fields, you can check whether they exist and add them only if they do, which is fairly easy:

foreach ($Failure in $Failures)
{
    if ($Failure.ErrorMessage) { $ConcatFailures += $Failure.ErrorMessage }
    if ($Failure.SentTime) { $ConcatFailures += $Failure.SentTime }
    if ($Failure.ReceiveTime) { $ConcatFailures += $Failure.ReceiveTime }
    if ($Failure.MessageSubject) { $ConcatFailures += $Failure.MessageSubject }
    if ($Failure.MessageSize) { $ConcatFailures += $Failure.MessageSize.value }
}
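
To see how the optional fields behave, here is a small self-contained sketch that runs against an inline sample modelled on the log in the question (the names `$sample`, `$fields` and `$rows` are only for illustration). With strict mode off, a missing element simply comes back as `$null`, so the `if` test skips it, and `MessageSize` keeps its number in a `value` attribute rather than in the element text:

```powershell
# Inline sample modelled on the question's log; $sample, $fields and $rows are illustrative names.
[xml]$sample = @'
<FolderList>
  <FailedMessages>
    <ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
    <SentTime>1601-01-01T00:00:00.000Z</SentTime>
    <ReceiveTime>1601-01-01T00:00:00.000Z</ReceiveTime>
  </FailedMessages>
  <FailedMessages>
    <MessageSubject>Hey</MessageSubject>
    <ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
    <SentTime>2013-01-07T02:51:17.000Z</SentTime>
    <ReceiveTime>2013-01-07T02:51:17.000Z</ReceiveTime>
    <MessageSize value="2881" />
  </FailedMessages>
</FolderList>
'@

# Build one comma-separated string per FailedMessages node.
$rows = foreach ($Failure in $sample.FolderList.FailedMessages)
{
    $fields = @()
    if ($Failure.ErrorMessage)   { $fields += $Failure.ErrorMessage }
    if ($Failure.SentTime)       { $fields += $Failure.SentTime }
    if ($Failure.ReceiveTime)    { $fields += $Failure.ReceiveTime }
    if ($Failure.MessageSubject) { $fields += $Failure.MessageSubject }
    # MessageSize carries its number in the "value" attribute.
    if ($Failure.MessageSize)    { $fields += $Failure.MessageSize.value }
    $fields -join ","
}

# Separate the per-message strings with a pipe so they fit in a single CSV cell.
$rows -join "|"
```

Joining the fields of each message with commas and the messages with pipes keeps it clear where one failure ends and the next begins.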

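To process multiple XML files, add an outer loop that iterates over all of them and appends each file's data to the array you are building. This should do what you want, with some adjustment to the paths used: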

$XMLFiles = gci "C:\Temp\" -Filter "*.xml"
$MasterArray = @()

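# Give the loop variable its own name: PowerShell variable names are case-insensitive,
# so $XMLFile and $XMLfile would be the same variable.
foreach ($File in $XMLFiles)
{
    [xml]$XMLfile = gc $File.FullName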

    $TempArray = @()
    $TempArray = "" | Select User, Result, TotalEmails, SuccessfulEmails, FailedEmails, Failures

    $TempArray.User = $XMLfile.MigrationUserStatus.user
    $TempArray.Result = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.MigrationStatus.value
    $TempArray.TotalEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.TotalCount.value
    $TempArray.SuccessfulEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.SuccessCount.value
    $TempArray.FailedEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailCount.value

    $Failures = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailedMessages
    $ConcatFailures = @()

    foreach ($Failure in $Failures)
    {
        if ($Failure.ErrorMessage) { $ConcatFailures += $Failure.ErrorMessage }
        if ($Failure.SentTime) { $ConcatFailures += $Failure.SentTime }
        if ($Failure.ReceiveTime) { $ConcatFailures += $Failure.ReceiveTime }
        if ($Failure.MessageSubject) { $ConcatFailures += $Failure.MessageSubject }
        if ($Failure.MessageSize) { $ConcatFailures += $Failure.MessageSize.value }
    }
    $TempArray.Failures = $ConcatFailures -Join "|"

    $MasterArray += $TempArray
}

$MasterArray
$MasterArray | Export-Csv -NoType "C:\Temp\export.csv"
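
For the wish-list item in the question (one CSV column per failure, up to the largest number of failures in any file), one possible approach is to find that maximum first and then pad every row to the same width. The sketch below uses made-up in-memory data rather than real log files; `$Parsed`, `$MaxFailures`, `$Rows` and the `Failure1..FailureN` column names are assumptions for illustration. The padding matters because Export-Csv and ConvertTo-Csv take their column headers from the first object they receive:

```powershell
# Illustrative in-memory data; in practice each entry would come from one parsed log file.
$Parsed = @(
    @{ User = "a@domain.com"; Failures = @("err1", "err2", "err3") },
    @{ User = "b@domain.com"; Failures = @() },
    @{ User = "c@domain.com"; Failures = @("err4") }
)

# The widest row decides how many Failure columns the CSV needs.
$MaxFailures = ($Parsed | ForEach-Object { $_.Failures.Count } | Measure-Object -Maximum).Maximum

$Rows = foreach ($Entry in $Parsed)
{
    $Row = New-Object PSObject
    $Row | Add-Member -MemberType NoteProperty -Name User -Value $Entry.User
    for ($i = 0; $i -lt $MaxFailures; $i++)
    {
        # Pad short rows with empty strings so every object has the same columns.
        $Value = if ($i -lt $Entry.Failures.Count) { $Entry.Failures[$i] } else { "" }
        $Row | Add-Member -MemberType NoteProperty -Name ("Failure" + ($i + 1)) -Value $Value
    }
    $Row
}

# Show the CSV text; swap in Export-Csv with a path to write a file instead.
$Rows | ConvertTo-Csv -NoTypeInformation
```

This needs two passes over the data (one to find the maximum, one to build the rows), so with thousands of files it may be simpler to collect the parsed results in an array first, as the loop above already does with $MasterArray.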