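I need help working out the logic to parse a text file that is currently badly formatted, which makes the log content very hard to read. The input text file looks like this: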
========== Test1 (1) ========== Id UTC Date/Time Message 4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Messagel Name='abz', Connection='Usb', Fleet Report Id='ca9d3457-1564-4066-8f5e-12345678', Fleet Proxy Id ='ghjfda7-c7e8-4bb2-9dd4-2f4c3b2498a3,4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message2 Name='abz', Connection='Usb', Fleet Report Id='ca9d3457-1564-4066-8f5e-12345678', Fleet Proxy Id ='ghjfda7-c7e8-4bb2-9dd4-2f4c3b2498a3.,4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message2 Name='abz', Connection='Usb', Fleet Report Id='ca9d3457-1564-4066-8f5e-12345678', Fleet Proxy Id ='ghjfda7-c7e8-4bb2-9dd4-2f4c3b2498a3. ========== Test2 (1) ========== Id UTC Date/Time Message 4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message2 Name='xyz', Connection='Usb', Fleet Report Id='ca9d09e7-1564-4066-8f5e-6a123456', Fleet Proxy Id ='0fsfsda7-c7e8-4bb2-9dd4-2f4c3b2498a3,4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message2 Name='abz', Connection='Usb', Fleet Report Id='ca9d3457-1564-4066-8f5e-12345678', Fleet Proxy Id ='ghjfda7-c7e8-4bb2-9dd4-2f4c3b2498a3.,4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message2 Name='abz', Connection='Usb', Fleet Report Id='ca9d3457-1564-4066-8f5e-12345678', Fleet Proxy Id ='ghjfda7-c7e8-4bb2-9dd4-2f4c3b2498a3.
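There are multiple sections {Test1, Test2 ... n}, and each section contains multiple Id, UTC date/time, and message entries. Every section is also wrapped in an opening and a closing HTML tag.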
How can I arrange them in a table? The output needs to be formatted as a table like this:
ID UTC Date/Time Message
========== Test1 (1) ==========
Id UTC Date/Time Message
4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Messagel Name='abz', Connection='Usb', Fleet Report Id='ca9d3457-1564-4066-8f5e-12345678', Fleet Proxy Id ='ghjfda7-c7e8-4bb2-9dd4-2f4c3b2498a3.
4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message2 Name='abz', Connection='Usb', Fleet Report Id='ca9d3457-1564-4066-8f5e-12345678', Fleet Proxy Id ='ghjfda7-c7e8-4bb2-9dd4-2f4c3b2498a3.
4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message3 Name='abz', Connection='Usb', Fleet Report Id='ca9d3457-1564-4066-8f5e-12345678', Fleet Proxy Id ='ghjfda7-c7e8-4bb2-9dd4-2f4c3b2498a3.
========== Test2 (1) ==========
Id UTC Date/Time Message
4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message1 Name='xyz', Connection='Usb', Fleet Report Id='ca9d09e7-1564-4066-8f5e-6a123456', Fleet Proxy Id ='0fsfsda7-c7e8-4bb2-9dd4-2f4c3b2498a3,
4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message2 Name='abz', Connection='Usb', Fleet Report Id='ca9d3457-1564-4066-8f5e-12345678', Fleet Proxy Id ='ghjfda7-c7e8-4bb2-9dd4-2f4c3b2498a3,
4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message3 Name='abz', Connection='Usb', Fleet Report Id='ca9d3457-1564-4066-8f5e-12345678', Fleet Proxy Id ='ghjfda7-c7e8-4bb2-9dd4-2f4c3b2498a3.
This is what I have tried, but it does not parse everything in the text file.
$file = Get-Content -path .\ViewSource.txt | Where-Object {
    $_ -ne ""
} | ForEach-Object {
    $_ -replace '<[^>]+>', ''
}
foreach ($line in $file) {
    $elements = $line.Split(" ", [StringSplitOptions]::RemoveEmptyEntries)
    [PSCustomObject]@{
        Id = $elements[8]
        UtcDateTime = $elements[9]
        Message = $elements[10..19] -join " "
    }
}
Answer 0 (score: 0)
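Since your ID and timestamp fields have a fixed width, and there do not appear to be multiple messages per line, the simplest approach is probably to replace the "inline" header with a properly formatted, wrapped header line: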
$inline = ' Id UTC Date/Time Message '
$wrapped = "`nId UTC Date/Time Message`n"
(Get-Content -Path 'C:\path\to\input.txt') -replace $inline, $wrapped |
Set-Content -Path 'C:\path\to\output.txt'
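To see what that replacement does without touching any files, here is a minimal sketch that applies it to an in-memory string. The `$sample` value is a shortened, hypothetical line modeled on the question's input, not the real file content.

```powershell
# Hypothetical, shortened sample in the shape of the question's input.
$sample = "========== Test1 (1) ========== Id UTC Date/Time Message 4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message1 Name='abz'"

$inline  = ' Id UTC Date/Time Message '
$wrapped = "`nId UTC Date/Time Message`n"

# -replace treats $inline as a regular expression; it contains no
# metacharacters, so it matches the header text literally.
$result = $sample -replace $inline, $wrapped

# Split on the inserted newlines to inspect the individual lines.
$lines = $result -split "`n"
$lines
```

The section banner, the column header, and the record each end up on their own line. Note that `-replace` is case-insensitive by default, which is harmless here but worth knowing if the log ever contains similar text in a different case.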
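Edit: If there are multiple messages per line, you also need to match the GUID and timestamp sequence that precedes each message and insert a newline before each of those matches: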
$inline = ' Id UTC Date/Time Message '
$wrapped = "`nId UTC Date/Time Message"
$guid = '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
$ts = '\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z'
(Get-Content 'C:\path\to\input.txt') -replace $inline, $wrapped -replace "($guid) +($ts) +", "`n`$1 `$2 " |
Set-Content -Path 'C:\path\to\output.txt'
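If you would rather end up with objects (so that `Format-Table` or `Export-Csv` can do the layout), the same GUID and timestamp patterns can drive a small parser. This is only a sketch under some assumptions: `$text` is a shortened, hypothetical sample, the `Section` property and the banner pattern `={5,}` are my own additions, and I assume the trailing `,` or `.` after each message is separator noise that can be trimmed.

```powershell
# Shortened, hypothetical sample in the same shape as the question's input.
$text = "========== Test1 (1) ========== Id UTC Date/Time Message " +
        "4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message1 Name='abz', Connection='Usb'," +
        "4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message2 Name='abz', Connection='Usb'. " +
        "========== Test2 (1) ========== Id UTC Date/Time Message " +
        "4d1eb19c-5420-4bb2-9e21-65880eb90429 08-30T01:26:24Z Message1 Name='xyz', Connection='Usb'."

$guid = '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
$ts   = '\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z'

# Put each section banner and each GUID/timestamp pair at the start of its own line.
$lines = ($text -replace ' *(={5,} .+? ={5,}) *', "`n`$1`n" -replace "($guid) +($ts) +", "`n`$1 `$2 ") -split "`n"

$section = $null
$records = foreach ($line in $lines) {
    if ($line -match '^={5,} *(.+?) *={5,}$') {
        # Remember which section the following records belong to.
        $section = $Matches[1]
    }
    elseif ($line -match "^($guid) +($ts) +(.*)$") {
        [PSCustomObject]@{
            Section     = $section
            Id          = $Matches[1]
            UtcDateTime = $Matches[2]
            # Assumption: trailing whitespace, commas, and periods are separators, not data.
            Message     = $Matches[3] -replace '[\s,.]+$', ''
        }
    }
}

$records | Format-Table -AutoSize
```

For a real file you could load the content with `Get-Content -Raw` into `$text` instead of the sample string. Lines that are neither a banner nor a record (such as the column header) are simply skipped.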