Parsing an XML-like log file

Asked: 2014-01-10 23:31:04

Tags: xml powershell

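I have a log file that records events, as shown below. I want to convert each event into a PSCustomObject. The file looks like XML, but casting the output of Get-Content to [xml] gives me an error: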

Cannot convert value "System.Object[]" to type "System.Xml.XmlDocument". Error: "This document already has a 'DocumentElement' node."

<event date='Jan 06 01:46:16' severity='4' hostName='ABC' source='CSMFAgentPolicyManager' module='smfagent.dll' process='AeXNSAgent.exe' pid='1580' thread='1940' tickCount='306700046' >
  <![CDATA[Setting wakeup time to 3600000 ms (Invalid DateTime) for policy: DefaultWakeup]]>
</event>

Here is the code I have so far:

   <#
.EXAMPLE    
source    : MaintenanceWindowMgr
process   : AeXNSAgent.exe
thread    : 8500
hostName  : ABC
severity  : 4
tickCount : 717008140
date      : Jan 10 19:45:00
module    : PatchMgmtAgents.dll
pid       : 11984
CData     : isAbidingByMaintenanceWindows() - yes
#>
$logpath = Join-Path $env:ProgramData 'Symantec\Symantec Agent\logs\Agent.log'
$hash=[ordered]@{};
$log = get-content $logpath | % {

    ## handle Event start
    ## sample: <event date='Jan 10 18:45:00' severity='4' hostName='ABC' source='MaintenanceWindowMgr' module='PatchMgmtAgents.dll' process='AeXNSAgent.exe' pid='11984' thread='8500' tickCount='713408140' >
    if ($_ -match '^<event') {

        if ($hash) {                
            ## Convert the hashtable to PSCustomObject before clearing it
            New-Object PSObject -Property $hash
            $hash.Clear()
        }

        $line = $_ -replace '<event ' -replace ' >' -split "'\s" -replace "'"               
        $line | % { 

            $name,$value=$_ -split '='                
            $hash.$name=$value
        }        
    }

    ## handle CData
    ## Sample: <![CDATA[Schedule Software Update Application Task ({A1939DC8-DA4A-4E46-9629-0500C2383ECA}) triggered at 2014-01-10 18:50:00 -5:00]]>
    if ($_ -match '<!') {
        $hash.'CData' = ($_ -replace '<!\[CDATA\[' -replace '\]\]>$').ToString().Trim()
    }
}
  $log 

Unfortunately, the resulting object is not in the form I want.

$log|gm


   TypeName: System.Management.Automation.PSCustomObject

Name        MemberType Definition                    
----        ---------- ----------                    
Equals      Method     bool Equals(System.Object obj)
GetHashCode Method     int GetHashCode()             
GetType     Method     type GetType()                
ToString    Method     string ToString()   

When I try to collect all the objects from the output, I lose the NoteProperties that were generated when the hash was converted to a PSCustomObject:

   TypeName: System.Management.Automation.PSCustomObject

Name        MemberType   Definition                                                                                                                                     
----        ----------   ----------                                                                                                                                     
Equals      Method       bool Equals(System.Object obj)                                                                                                                 
GetHashCode Method       int GetHashCode()                                                                                                                              
GetType     Method       type GetType()                                                                                                                                 
ToString    Method       string ToString()                                                                                                                              
Equals      Method       bool Equals(System.Object obj)                                                                                                                 
GetHashCode Method       int GetHashCode()                                                                                                                              
GetType     Method       type GetType()                                                                                                                                 
ToString    Method       string ToString()                                                                                                                              
CData       NoteProperty System.String CData=isAbidingByMaintenanceWindows() - yes                                                                                      
date        NoteProperty System.String date=Jan 10 18:45:00                                                                                                             
hostName    NoteProperty System.String hostName=ABC                                                                                                             
module      NoteProperty System.String module=PatchMgmtAgents.dll                                                                                                       
pid         NoteProperty System.String pid=11984                                                                                                                        
process     NoteProperty System.String process=AeXNSAgent.exe                                                                                                           
severity    NoteProperty System.String severity=4                                                                                                                       
source      NoteProperty System.String source=MaintenanceWindowMgr                                                                                                      
thread      NoteProperty System.String thread=8500                                                                                                                      
tickCount   NoteProperty System.String tickCount=713408140 

What am I missing here?

3 Answers:

Answer 0: (score: 4)

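A well-formed XML document must have exactly one root (documentElement) node. Since your log file appears to contain multiple <event> tags without a common root element, you can add the missing documentElement like this: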

$logpath  = Join-Path $env:ProgramData 'Symantec\Symantec Agent\logs\Agent.log'
[xml]$log = "<logroot>$(Get-Content $logpath)</logroot>"

After that, you can process the log using the usual methods, for example:

$fmt = 'MMM dd HH:mm:ss'

$log.SelectNodes('//event') |
  select @{n='date';e={[DateTime]::ParseExact($_.date, $fmt, $null)}},
         severity, hostname, @{n='message';e={$_.'#cdata-section'}}

If you prefer custom objects, you can easily create them:

$fmt = 'MMM dd HH:mm:ss'

$log.SelectNodes('//event') | % {
  New-Object -Type PSObject -Property @{
    'Date'     = [DateTime]::ParseExact($_.date, $fmt, $null)
    'Severity' = $_.severity
    'Hostname' = $_.hostname
    'Message'  = $_.'#cdata-section'
  }
}
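
To try the root-wrapping idea without the real Agent.log, here is a minimal sketch that uses two sample events from the question as an inline string. The `<logroot>` name is arbitrary, and reading the message through `InnerText` rather than `'#cdata-section'` is a substitution made here for illustration, not part of the answer above.

```powershell
# Two sample events standing in for the log file contents (no common root element).
$sample = @'
<event date='Jan 06 01:46:16' severity='4' hostName='ABC' source='CSMFAgentPolicyManager' module='smfagent.dll' process='AeXNSAgent.exe' pid='1580' thread='1940' tickCount='306700046' >
  <![CDATA[Setting wakeup time to 3600000 ms (Invalid DateTime) for policy: DefaultWakeup]]>
</event>
<event date='Jan 10 18:45:00' severity='4' hostName='ABC' source='MaintenanceWindowMgr' module='PatchMgmtAgents.dll' process='AeXNSAgent.exe' pid='11984' thread='8500' tickCount='713408140' >
  <![CDATA[isAbidingByMaintenanceWindows() - yes]]>
</event>
'@

# Wrapping the fragments in a single root element lets the [xml] cast succeed.
[xml]$doc = "<logroot>$sample</logroot>"

# Build one custom object per <event>; attributes are exposed as properties.
$events = foreach ($node in $doc.SelectNodes('//event')) {
    [PSCustomObject]@{
        Date    = $node.date
        Source  = $node.source
        Message = $node.InnerText.Trim()   # text of the CDATA child
    }
}
$events
```

Because every object is created with the same set of properties, piping `$events` to Get-Member should list the NoteProperties as expected.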

Answer 1: (score: 1)

Using the split approach:

$hash = [ordered]@{}
$regex = '^<event (.+) >$'
$lines = (gc $file) -match $regex -replace $regex,'$1'
foreach ($line in $lines)
{
    $hash.Clear()
    $line -split "'\s" -replace "'" |
    foreach {
        $name,$value=$_ -split '='
        $hash.$name=$value
    }

    [PSCustomObject]$hash
}
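
Splitting on `=` assumes that no attribute value contains an equals sign. As a hedged alternative, the sketch below pulls each name='value' pair out with a regular expression instead; the pattern and the variable names (`$props`, `$obj`) are illustrative choices and not part of the answer above.

```powershell
# One sample <event> start line taken from the question.
$line = "<event date='Jan 10 18:45:00' severity='4' hostName='ABC' source='MaintenanceWindowMgr' module='PatchMgmtAgents.dll' process='AeXNSAgent.exe' pid='11984' thread='8500' tickCount='713408140' >"

$props = [ordered]@{}

# Each match is one name='value' pair: group 1 is the name, group 2 the value.
foreach ($m in [regex]::Matches($line, "(\w+)='([^']*)'")) {
    $props[$m.Groups[1].Value] = $m.Groups[2].Value
}

# Casting the ordered dictionary yields an object with one NoteProperty per attribute.
$obj = [PSCustomObject]$props
$obj
```

Like the split version, this only reads the attributes of the start tag; the CDATA message would still need to be handled separately.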

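Answer 2: (score: 0)

I initially thought my problem was that the original hash was not ordered, but then I found where the real problem was. The following check caused the very first PSCustomObject to be created without any NoteProperty: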

  if ($hash) { .... }

The condition is true even for a freshly initialized, empty hash, as shown here:

PS H:\> $myhash=[ordered]@{}
PS H:\> if ($myhash) {"yay"}
yay

So to fix it, I just changed the check:

# CData is the last record, if hash has it, it's ready to convert to PSCustomObject
if ($hash.CData) { ... }  
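
The behaviour comes from how PowerShell converts objects to Boolean: a dictionary object is non-null and therefore counts as true even when it holds no entries. A minimal sketch of this, together with a `Count`-based check as one possible alternative to testing for the CData key (the variable names are illustrative only):

```powershell
$empty = [ordered]@{}

[bool]$empty            # True: the dictionary object itself is not $null
[bool]($empty.Count)    # False: Count is 0

$filled = [ordered]@{ CData = 'some message' }

# Only convert when the dictionary actually holds entries.
if ($filled.Count -gt 0) {
    $converted = [PSCustomObject]$filled
    $converted
}
```

Checking `Count` guards against emitting an empty object, while checking `$hash.CData` additionally ensures that the message line for the event has already been read.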

Here is the updated code:

$hash = [ordered]@{}
$logpath = Join-Path $env:ProgramData 'Symantec\Symantec Agent\logs\Agent.log'
Get-Content $logpath | % {

    ## handle Event start
    if ($_ -match '^<event') {
        # CData is the last record, if hash has it, it's ready to convert to PSCustomObject
        if ($hash.CData) {
            ## Convert the hashtable to PSCustomObject before clearing it
            [PSCustomObject]$hash
            $hash.Clear()
        }

        ## sample: <event date='Jan 10 18:45:00' severity='4' hostName='ABC' source='MaintenanceWindowMgr' module='PatchMgmtAgents.dll' process='AeXNSAgent.exe' pid='11984' thread='8500' tickCount='713408140' >
        $line = $_ -replace '<event ' -replace ' >' -split "'\s" -replace "'"
        $line | % {
            $name,$value=$_ -split '='
            $hash.$name=$value
        }
    }

    ## handle CData
    ## Sample: <![CDATA[Schedule Software Update Application Task ({A1939DC8-DA4A-4E46-9629-0500C2383ECA}) triggered at 2014-01-10 18:50:00 -5:00]]>
    if ($_ -match '<!') {
        $hash.'CData' = ($_ -replace '<!\[CDATA\[' -replace '\]\]>$').ToString().Trim()
    }
}

## Emit the final event: no further <event> line follows it to trigger the conversion
if ($hash.CData) { [PSCustomObject]$hash }

Thanks to @mjolinor for the helpful comments!