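I have about 13,000 log files formatted as XML, and I need to convert them all into a spreadsheet/CSV file.
As you will see, I am not a programmer, but I have had a go at it. I have written a PowerShell script that reads the first few nodes and builds a comma-separated string, but I am stuck on the last node, which can contain anything from no entries to dozens of nodes.
Sample XML file: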
<?xml version="1.0" encoding="utf-8"?>
<MigrationUserStatus>
<User>username@domain.com</User>
<StoreList>
<EmailMigrationStatus>
<MigrationStatus value="Success" />
<FolderList>
<TotalCount value="6" />
<SuccessCount value="3" />
<FailCount value="3" />
<FailedMessages>
<ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
<SentTime>1601-01-01T00:00:00.000Z</SentTime>
<ReceiveTime>1601-01-01T00:00:00.000Z</ReceiveTime>
</FailedMessages>
<FailedMessages>
<ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
<SentTime>1601-01-01T00:00:00.000Z</SentTime>
<ReceiveTime>1601-01-01T00:00:00.000Z</ReceiveTime>
</FailedMessages>
<FailedMessages>
<MessageSubject>Hey</MessageSubject>
<ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
<SentTime>2013-01-07T02:51:17.000Z</SentTime>
<ReceiveTime>2013-01-07T02:51:17.000Z</ReceiveTime>
<MessageSize value="2881" />
</FailedMessages>
<StartTime>2013-01-07T01:52:46.000Z</StartTime>
<EndTime>2013-01-07T04:41:59.000Z</EndTime>
</FolderList>
<StartTime>2013-01-07T01:52:43.000Z</StartTime>
<EndTime>2013-01-07T04:41:59.000Z</EndTime>
</EmailMigrationStatus>
<StartTime>2013-01-07T01:52:43.000Z</StartTime>
<EndTime>2013-01-07T04:41:59.000Z</EndTime>
</StoreList>
</MigrationUserStatus>
With this code I can easily get the first part of each CSV row:
$folder = "C:\temp"
# CreateText returns a StreamWriter, which has WriteLine (the FileStream from OpenWrite does not)
$outfile = [IO.File]::CreateText("alluserslogs.csv")
$csv = "User,Total Emails,Successful emails,Failed emails,Failures`r`n"
dir Status-*.log | foreach {
[xml]$Status = Get-Content $_
# Plain assignment, so values from the previous file are not carried over
$csvpt1 = $Status.MigrationUserStatus.User + "," + $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.TotalCount.value + "," + $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.SuccessCount.value + "," + $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailCount.value
The next part is where I come unstuck. I want to read each FailedMessages node and build it into another comma-separated string:
foreach ($FMessage in $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailedMessages) {
$csvpt2 +=$FMessage + ","
}
Desired output:
GDSTATUS_BAD_REQUEST:Permanent failu... 1601-01-01T00:00:00.000Z 1601-01-01T00:00:00.000Z,GDSTATUS_BAD_REQUEST:Permanent failu... 1601-01-01T00:00:00.000Z 1601-01-01T00:00:00.000Z,.......
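I either get blanks in $FMessage, or the method invocation fails because of the trailing + ",", so I need to fix that.
I would then concatenate everything into one final string and write it to the file: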
$csv += $csvpt1 + "," + $csvpt2 + "`r`n"
}
# Write the accumulated text once, after the loop, rather than once per file
$outfile.Write($csv)
$outfile.Close()
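As an added wish-list item, it would also be great to generate the CSV column headers for n failure columns, where n is the largest number of FailedMessages nodes found in any file.
Many thanks for your assistance.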
Answer 0 (score: 1)
PowerShell has native support for XML; maybe this will help you get started?
It also has a native CSV exporter, Export-Csv :)
[xml]$XMLfile = gc C:\Temp\migration.xml
$MasterArray = @()
$MasterArray = "" | Select User, Result, TotalEmails, SuccessfulEmails, FailedEmails, Failures
$MasterArray.User = $XMLfile.MigrationUserStatus.user
$MasterArray.Result = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.MigrationStatus.value
$MasterArray.TotalEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.TotalCount.value
$MasterArray.SuccessfulEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.SuccessCount.value
$MasterArray.FailedEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailCount.value
$Failures = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailedMessages
$ConcatFailures = @()
foreach ($Failure in $Failures)
{
# The element in the log is named ReceiveTime (not ReceivedTime)
$ConcatFailures += $Failure.ErrorMessage + "," + $Failure.SentTime + "," + $Failure.ReceiveTime
}
$MasterArray.Failures = $ConcatFailures -Join "|"
$MasterArray
$MasterArray | Export-Csv -NoType "C:\Temp\export.csv"
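As for why appending $FMessage + "," in the question gives no usable text: each FailedMessages entry comes back as an XML element object rather than a string, so its values have to be read by name. Below is a minimal sketch of that behaviour on a cut-down inline sample; the variable names ($sample, $failures and so on) are made up for illustration, and only the element names are taken from the log in the question.

```powershell
# Cut-down inline sample using the element names from the log in the question.
[xml]$sample = @'
<FolderList>
  <FailCount value="2" />
  <FailedMessages>
    <ErrorMessage>BadAttachment</ErrorMessage>
    <SentTime>1601-01-01T00:00:00.000Z</SentTime>
    <ReceiveTime>1601-01-01T00:00:00.000Z</ReceiveTime>
  </FailedMessages>
  <FailedMessages>
    <MessageSubject>Hey</MessageSubject>
    <ErrorMessage>BadAttachment</ErrorMessage>
    <SentTime>2013-01-07T02:51:17.000Z</SentTime>
    <ReceiveTime>2013-01-07T02:51:17.000Z</ReceiveTime>
    <MessageSize value="2881" />
  </FailedMessages>
</FolderList>
'@

# Repeated child elements come back as a collection of XML element objects;
# @() keeps it an array even when there is only one (or none).
$failures = @($sample.FolderList.FailedMessages)

# Element text is read by element name, attributes by attribute name.
$firstError = $failures[0].ErrorMessage
$secondSize = $failures[1].MessageSize.value
$failCount  = $sample.FolderList.FailCount.value

# An element that is absent yields $null, so it can be tested with if ().
$missingSubject = $failures[0].MessageSubject
```

Wrapping the property in @() is a small design choice: without it, a log with exactly one FailedMessages node would give a single element instead of an array, and indexing would behave differently.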
For the other fields, you can check whether they exist and add them if they do, which is fairly easy. This should work:
foreach ($Failure in $Failures)
{
if ($Failure.ErrorMessage) { $ConcatFailures += $Failure.ErrorMessage }
if ($Failure.SentTime) { $ConcatFailures += $Failure.SentTime }
if ($Failure.ReceiveTime) { $ConcatFailures += $Failure.ReceiveTime }
if ($Failure.MessageSubject) { $ConcatFailures += $Failure.MessageSubject }
# MessageSize keeps its number in a "value" attribute
if ($Failure.MessageSize) { $ConcatFailures += $Failure.MessageSize.value }
}
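One thing to be aware of with the loop above: every field is added to $ConcatFailures as its own entry, so a later -Join "|" puts the same separator between fields and between failures. If the boundary between failures matters, one possible variation is to join the fields of each failure first. This is only a sketch; Format-Failure is a hypothetical helper name, and the sample values (E1, S1 and so on) are placeholders rather than real log data.

```powershell
# Hypothetical helper: turn one FailedMessages element into a single string.
function Format-Failure {
    param($Failure)
    $fields = @()
    if ($Failure.ErrorMessage)   { $fields += $Failure.ErrorMessage }
    if ($Failure.SentTime)       { $fields += $Failure.SentTime }
    if ($Failure.ReceiveTime)    { $fields += $Failure.ReceiveTime }
    if ($Failure.MessageSubject) { $fields += $Failure.MessageSubject }
    if ($Failure.MessageSize)    { $fields += $Failure.MessageSize.value }
    # Fields that belong to the same failure are separated by ","
    $fields -join ','
}

# Placeholder data with the same element names as the real log.
[xml]$sample = @'
<FolderList>
  <FailedMessages>
    <ErrorMessage>E1</ErrorMessage>
    <SentTime>S1</SentTime>
    <ReceiveTime>R1</ReceiveTime>
  </FailedMessages>
  <FailedMessages>
    <MessageSubject>Hey</MessageSubject>
    <ErrorMessage>E2</ErrorMessage>
    <SentTime>S2</SentTime>
    <ReceiveTime>R2</ReceiveTime>
    <MessageSize value="2881" />
  </FailedMessages>
</FolderList>
'@

# Whole failures are separated by "|", so each failure stays recognisable.
$joined = (@($sample.FolderList.FailedMessages) |
    ForEach-Object { Format-Failure $_ }) -join '|'
```

With the placeholder data, $joined holds two groups separated by a single "|". Commas inside the value are not a problem for Export-Csv, because it quotes each field.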
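To process all of the XML files, add an outer loop that iterates over them and appends each file's data to the array you build up. This should do what you want, with some adjustments to the paths used: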
$XMLFiles = gci "C:\Temp\" -Filter "*.xml"
$MasterArray = @()
foreach ($XMLFile in $XMLFiles)
{
# Variable names are case-insensitive, so $XMLfile would be the loop variable itself; use a separate name
[xml]$Xml = gc $XMLFile.FullName
$TempArray = @()
$TempArray = "" | Select User, Result, TotalEmails, SuccessfulEmails, FailedEmails, Failures
$TempArray.User = $Xml.MigrationUserStatus.user
$TempArray.Result = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.MigrationStatus.value
$TempArray.TotalEmails = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.TotalCount.value
$TempArray.SuccessfulEmails = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.SuccessCount.value
$TempArray.FailedEmails = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailCount.value
$Failures = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailedMessages
$ConcatFailures = @()
foreach ($Failure in $Failures)
{
if ($Failure.ErrorMessage) { $ConcatFailures += $Failure.ErrorMessage }
if ($Failure.SentTime) { $ConcatFailures += $Failure.SentTime }
if ($Failure.ReceiveTime) { $ConcatFailures += $Failure.ReceiveTime }
if ($Failure.MessageSubject) { $ConcatFailures += $Failure.MessageSubject }
# MessageSize keeps its number in a "value" attribute
if ($Failure.MessageSize) { $ConcatFailures += $Failure.MessageSize.value }
}
$TempArray.Failures = $ConcatFailures -Join "|"
$MasterArray += $TempArray
}
$MasterArray
$MasterArray | Export-Csv -NoType "C:\Temp\export.csv"
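For the wish-list item in the question (one column per failure, with as many columns as the largest number of FailedMessages nodes), something along these lines might work. Export-Csv and ConvertTo-Csv take the column headers from the first object they receive, so every record needs the same set of properties, padded with empty values. The sketch below is an assumption-laden illustration: $rows is hypothetical stand-in data (one entry per user, with that user's failures already turned into strings), and the Failure1..FailureN column names are made up.

```powershell
# Hypothetical stand-in data: one entry per user with a list of failure strings.
$rows = @(
    @{ User = 'a@domain.com'; Failures = @('err1', 'err2', 'err3') },
    @{ User = 'b@domain.com'; Failures = @('err4') },
    @{ User = 'c@domain.com'; Failures = @() }
)

# The row with the most failures decides how many Failure columns are needed.
$max = ($rows | ForEach-Object { $_.Failures.Count } |
    Measure-Object -Maximum).Maximum

$records = foreach ($row in $rows) {
    $record = New-Object PSObject
    $record | Add-Member -MemberType NoteProperty -Name User -Value $row.User
    for ($i = 0; $i -lt $max; $i++) {
        # Pad with empty strings so every record has the same columns.
        $value = if ($i -lt $row.Failures.Count) { $row.Failures[$i] } else { '' }
        $record | Add-Member -MemberType NoteProperty -Name ("Failure" + ($i + 1)) -Value $value
    }
    $record
}

# ConvertTo-Csv is used here so the result can be inspected in memory.
$csvLines = $records | ConvertTo-Csv -NoTypeInformation
```

To write a file instead, the last line can be swapped for Export-Csv -NoType with a path, as in the script above. Note that this needs two passes over the data, since the maximum is only known once every file has been read.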