Extracting data from log lines that contain a specific pattern

Time: 2018-12-14 05:26:22

Tags: powershell batch-file cmd

I have an Apache log file whose lines look like this:

192.168.100.1 - - [13/Dec/2018:15:11:52 -0600] "GET/onabc/soitc/BackChannel/?param=369%2FGetTableEntryList%2F7%2Fonabc-s31%2FHPD%3AIncident%20Management%20Console27%2FDefault%20User%20View%20(Manager)9%2F3020872007%2Resolved%22%20AND%20((%27Assignee%20Login%20ID%27%20%3D%20%22Allen%22)Token=FEIH-MTJQ-H9PR-LQDY-WIEA-ZULM-45FU-P1FK HTTP/1.1"    

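I need to extract some data from the Apache log file whenever a line contains the word "login", and then either list the IP, the date and the login ID (in this case "Allen" is the login ID) or save them to another file.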

Following your suggestions, I am now using PowerShell to make this work. This is what I have so far:

$Readlog = Get-content -path C:\Example_log.txt
$Results = foreach ($Is_login in $Readlog)
{
    if ($Is_login -match 'login')
    {
        [PSCustomObject]@{
            IP = $Is_login.Split(' ')[0] # No need to trim the start.
            # [char[]]'[]' splits on either bracket in both Windows PowerShell 5.1 and PowerShell 7
            Date = $Is_login.Split([char[]]'[]')[1].Split(':')[0]
            Hour = $Is_login.Split([char[]]'[]')[1].Split(' ')[0] -replace '\d\d\/\w\w\w\/\d\d\d\d:', ''
            LoginID = Select-String -InputObject $Is_login -Pattern '(?<=3D%20%22)\w+' -AllMatches | ForEach-Object { $_.Matches.Groups[0].Value }
            Status = Select-String -InputObject $Is_login -Pattern '(?<=%20%3C%20%22)\w+' -AllMatches | ForEach-Object { $_.Matches.Groups[0].Value }
        }
    }
}
$Results

Thanks for the tips. I now get the following results:

IP      : 192.168.100.1
Date    : 13/Dec/2018
Hour    : 15:11:52
LoginID : Allen
Status  : Resolved

IP      : 192.168.100.30
Date    : 13/Dec/2018
Hour    : 16:05:31
LoginID : Allen
Status  : Resolved

IP      : 192.168.100.40
Date    : 13/Dec/2018
Hour    : 15:11:52
LoginID : ThisisMyIDHank
Status  : Resolved

IP      : 192.168.100.1
Date    : 13/Dec/2018
Hour    : 15:11:52
LoginID : Hank
Status  : Resolved

Thank you all for your help.

1 answer:

Answer 0 (score: 0)

[Edit: replaced the code to account for the asterisks in the sample data that were not really there.]

[powershell v5.1]
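This matches any line that contains "login" and then extracts the requested info using basic string operators. I tried to use a regex, but got bogged down in the pattern matching. [blush] A regex would almost certainly be faster, but this is easier for me to understand.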

# fake reading in a text file
#    in real life, use Get-Content
$InStuff = @'
192.168.100.1 - - [13/Dec/2018:15:11:52 -0600] "GET/onabc/soitc/BackChannel/?param=369%2FGetTableEntryList%2F7%2Fonabc-s31%2FHPD%3AIncident%20Management%20Console27%2FDefault%20User%20View%20(Manager)9%2F3020872007%2Resolved%22%20AND%20((%27Assignee%20Login%20ID%27%20%3D%20%22Allen%22)Token=FEIH-MTJQ-H9PR-LQDY-WIEA-ZULM-45FU-P1FK HTTP/1.1"
100.100.100.100 - - [06/Nov/2018:10:10:10 -0666] "nothing that contains the trigger word"
'@ -split '\r?\n'   # split on LF or CRLF, whichever line ending the script file uses

$Results = foreach ($IS_Item in $InStuff)
    {
    if ($IS_Item -match 'login')
        {
        # build a custom object with the desired items
        #    the PSCO makes export to a CSV file very, very easy [*grin*] 
        # the split pattern is _very fragile_ and will break if the pattern is not consistent
        #    a regex pattern would likely be both faster and less fragile, but i can't figure one out
        [PSCustomObject]@{
            IP = $IS_Item.Split(' ')[0]
            Date = $IS_Item.Split([char[]]'[]')[1].Split(':')[0]
            # corrected for not-really-there asterisks
            #LoginName = $IS_Item.Split('*')[-3]
            LoginName = (($IS_Item.Split(')')[-2] -replace '%\w{2}') -csplit 'ID')[1]
            }
        }
    }

# show on screen
$Results

# save to a CSV file
$Results |
    Export-Csv -LiteralPath "$env:TEMP\Henry_Chinasky_-_LogExtract.CSV" -NoTypeInformation

Screen output...

IP            Date        LoginName
--            ----        ---------
192.168.100.1 13/Dec/2018 Allen   
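
The answer mentions that a regex would likely be less fragile than the split chain. Below is a minimal sketch of that idea, not a definitive pattern. It assumes every login line has the common Apache prefix (IP, two placeholder fields, bracketed timestamp) and that the login ID always follows the encoded text `Login%20ID%27%20%3D%20%22`, as in the sample line. The names `$LogLines`, `$Pattern`, `$RegexResults` and the capture group names are made up for illustration. The Status field is left out because the sample line does not show the `%20%3C%20%22` marker that the question's pattern looks for.

```powershell
# Sample data: the line from the question plus one line without a login ID.
$LogLines = @(
    '192.168.100.1 - - [13/Dec/2018:15:11:52 -0600] "GET/onabc/soitc/BackChannel/?param=369%2FGetTableEntryList%2F7%2Fonabc-s31%2FHPD%3AIncident%20Management%20Console27%2FDefault%20User%20View%20(Manager)9%2F3020872007%2Resolved%22%20AND%20((%27Assignee%20Login%20ID%27%20%3D%20%22Allen%22)Token=FEIH-MTJQ-H9PR-LQDY-WIEA-ZULM-45FU-P1FK HTTP/1.1"'
    '100.100.100.100 - - [06/Nov/2018:10:10:10 -0666] "nothing that contains the trigger word"'
)

# Assumed layout: IP, two fields, [date:time offset], then the encoded "Login ID = " marker.
# The group names (IP, Date, Hour, LoginID) are illustrative choices.
$Pattern = '^(?<IP>\S+) \S+ \S+ \[(?<Date>[^:]+):(?<Hour>[\d:]+) [^\]]+\].*?Login%20ID%27%20%3D%20%22(?<LoginID>\w+)'

$RegexResults = foreach ($Line in $LogLines)
    {
    # -match is case-insensitive and fills the automatic $Matches table on success
    if ($Line -match $Pattern)
        {
        [PSCustomObject]@{
            IP      = $Matches['IP']
            Date    = $Matches['Date']
            Hour    = $Matches['Hour']
            LoginID = $Matches['LoginID']
            }
        }
    }

# show on screen
$RegexResults
```

Because a single `-match` both filters the line and captures every field, lines that lack the login marker are skipped without a separate test for the word "login". The resulting objects can be piped to `Export-Csv` in the same way as `$Results`.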

CSV file content...

"IP","Date","LoginName"
"192.168.100.1","13/Dec/2018","Allen"