Parsing a text log file whose values are enclosed in double quotes and separated by commas

Time: 2014-04-01 15:02:34

Tags: regex parsing powershell logfile

I have this log file that I am trying to parse. The problem is that the data lines have the format "value","value",... and sometimes "value\"value\"",...

#basepath  D:\XHostMachine\Results
#results   test.res
#fields    TestPlan Script TestCase TestData ErrorCount ErrorText DateTime Elapsed
#delimiter , 
#quote     " \

"D:\XHostMachine\plans\test.pln","D:\XHostMachine\testcases\test.t","rt1","1,\"a\"",1,"[#ERROR#][APPS-EUAUTO1] [error] rt1 t1 ( Screen shot : D:\XTestMachines\Error\[APPS-EUAUTO1] 03-28-14 11-29-22.png)","2014-03-28 11.29.04","0:00:18"
"D:\XHostMachine\plans\test.pln","D:\XHostMachine\testcases\test.t","rt2","1,\"a\"",0,"","2014-03-28 11.29.22","0:00:08"

However, I cannot split the lines using "," as the delimiter, because a comma can also appear inside a quoted value.
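To make the problem concrete, here is a minimal sketch. The shortened `$line` is a hypothetical sample built from the TestCase, TestData and ErrorCount values of the first data row; it is not a full log line.

```powershell
# Hypothetical, shortened line with three fields; the second field contains a comma.
$line = '"rt1","1,\"a\"",1'

# Naive split on every comma, ignoring the surrounding quotes.
$parts = $line -split ','

$parts.Count   # 4 pieces, although the line only has 3 fields
$parts[1]      # "1   (the quoted value has been cut in half)
```

Any approach that splits on the delimiter alone runs into this, so the parser has to be aware of the quoting.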

My code is:

Function Get-RexLog {
Param ($File)
# Reads the log file into memory.
    Try {
        Get-Content -path $File -ErrorAction Stop | select -skip 6 # skips the first 6 lines
    } Catch {
        Write-Error "The data file is not present" 
        BREAK
    }
} # End: Function Get-RexLog

# -----------------------------------------------------------------------

Function Get-Testplan {
Param ($RexLog)
    for ($i=0; $i -lt $RexLog.Count; $i++) {
        $Testcase = $RexLog[$i].Split("`"[,]`"") | ForEach-Object -Process {$_.TrimStart('"')}
        $Output = New-Object PSobject -Property @{
            TestPlan   = $Testcase[0]
            Script     = $Testcase[1]
            TestCase   = $Testcase[2]
            TestData   = $Testcase[3]
            ErrorCount = $Testcase[4]
            ErrorText  = $Testcase[5]
            DateTime   = $Testcase[6]
            Elapsed    = $Testcase[7]
        }
    }
} # End: Function Get-Testplan

# -----------------------------------------------------------------------

# Parse the files
$RexLog = Get-RexLog -file "D:\XHostMachine\Results\test.rex"
$Testplan = Get-Testplan -RexLog $RexLog
$Testplan

Final edit: using ConvertFrom-Csv

ConvertFrom-Csv -inputobject $RexLog -Header @("TestPlan","Script","TestCase","TestData","ErrorCount","ErrorText","DateTime","Elapsed")

1 Answer:

Answer 0: (score: 3)

PowerShell can easily handle comma-separated value (CSV) text files with the Import-Csv cmdlet.

See:

PS C:\temp> Import-Csv c:\temp\test.csv -Header @("TestPlan","Script","TestCase","TestData","ErrorCount","ErrorText","DateTime","Elapsed")


TestPlan   : D:\XHostMachine\plans\test.pln
Script     : D:\XHostMachine\testcases\test.t
TestCase   : rt1
TestData   : 1,\a\""
ErrorCount : 1
ErrorText  : [#ERROR#][APPS-EUAUTO1] [error] rt1 t1 ( Screen shot : D:\XTestMachines\Error\[APPS-EUAUTO1] 03-28-14
             11-29-22.png)
DateTime   : 2014-03-28 11.29.04
Elapsed    : 0:00:18

TestPlan   : D:\XHostMachine\plans\test.pln
Script     : D:\XHostMachine\testcases\test.t
TestCase   : rt2
TestData   : 1,\a\""
ErrorCount : 0
ErrorText  :
DateTime   : 2014-03-28 11.29.22
Elapsed    : 0:00:08
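
Note that TestData in the output above comes out as `1,\a\""` rather than `1,"a"`. The CSV cmdlets expect an embedded quote to be written as a doubled quote (`""`), not as `\"`. One possible workaround, shown here only as a sketch, is to rewrite `\"` to `""` before handing the line to ConvertFrom-Csv. It assumes that no field ends in a literal backslash directly before its closing quote; if that can happen in your logs, this replacement would misread the field. The `$line` below is the second data row from the question.

```powershell
# Second data row from the log in the question (single quotes keep it literal).
$line = '"D:\XHostMachine\plans\test.pln","D:\XHostMachine\testcases\test.t","rt2","1,\"a\"",0,"","2014-03-28 11.29.22","0:00:08"'

$header = 'TestPlan','Script','TestCase','TestData','ErrorCount','ErrorText','DateTime','Elapsed'

# Convert the log's \" escape into the CSV-style "" escape.
# Assumption: a backslash never appears as the last character of a field.
$normalized = $line -replace '\\"', '""'

# With -Header supplied, the line is treated as data rather than as a header row.
$row = $normalized | ConvertFrom-Csv -Header $header

$row.TestData   # 1,"a"
```

For a whole file, the same `-replace` could be applied to the output of `Get-Content ... | Select-Object -Skip 6` before piping it to ConvertFrom-Csv.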