I have a text file, toto.txt, with the following content:
Time: 11/23/2018 17:03:46
User: NEON
Web Site: https://www.seznam.cz
Top
Time: 11/23/2018 17:05:10
User: NEON
Web Site: www.autojournal.cz%252Fstat-prodava-zabavena-auta-padouchu-budou-levnejsi-nez-jine-ojetiny-2%252F/keFrdPDIZzLJBC2fxX7EIQ?utm_source=www.seznam.cz&utm_medium=sekce-z-internetu
Top
Time: 11/23/2018 17:05:11
User: NEON
Web Site: www.autojournal.cz/stat-prodava-zabavena-auta-padouchu-budou-levnejsi-nez-jine-ojetiny-2/?utm_source=www.seznam.cz&utm_medium=sekce-z-internetu
Top
... etc. ...
This is the code I use to export the data:
((Get-Content C:\Users\user\Desktop\test\toto.txt -RAW) -split '\n(?=Time:)') | % {
    $x = $_ -split '\r'
    New-Object PSOBJECT -Property @{
        Time = [regex]::Match($x[0],'(?<=Time:\s*)\b.*\b')
        User = [regex]::Match($x[1],'(?<=User:\s*)\b.*\b')
        Web = [regex]::Match($x[2],'(?<=Site:\s*)\b.*\b')
    }
} | out-file C:\Users\user\Desktop\test\result.txt
The problem is that the long URLs (Web Site) do not make it into result.txt in full; they are cut off.
I need result.txt to have this structure:
datetime;$url, for example: 2019-01-15 15:06:03;$www.autojournal.cz/stat-prodava-zabavena-auta-padouchu-budou-levnejsi-nez-jine-ojetiny-2/?utm_source=www.seznam.cz&utm_medium=sekce-z-internetu
What I get in result.txt is: 11/23/2018 17:05:10 NEON www.autojournal.cz%252Fstat-prodava-zabavena-auta-padouchu-budou-levnejsi-nez-jine-ojetiny-2%25 ...
I am able to convert the date and time with:
(Get-Content C:\Users\user\Desktop\test\result.txt) |
Foreach-Object {$_ -replace "([0-9]+)/+([0-9]+)/+([0-9]+)", '$3-$1-$2'} |
Foreach-Object {$_ -replace "([0-9]+):+([0-9]+):+([0-9]+)", '$1-$2-$3;$'} |
Set-Content C:\Users\user\Desktop\test\result2.txt
Answer 0 (score: 0)
Out-File has a -Width parameter. You can use it to stop the long lines from being truncated:
((Get-Content C:\Users\user\Desktop\test\toto.txt -RAW) -split '\n(?=Time:)') | % {
    $x = $_ -split '\r'
    New-Object PSOBJECT -Property @{
        Time = [regex]::Match($x[0],'(?<=Time:\s*)\b.*\b')
        User = [regex]::Match($x[1],'(?<=User:\s*)\b.*\b')
        Web = [regex]::Match($x[2],'(?<=Site:\s*)\b.*\b')
    }
} | out-file C:\Users\user\Desktop\test\result.txt -Width 10000
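To see why the width matters, here is a minimal sketch of the truncation. It assumes the cut-off happens when the objects are rendered as a table, and it uses Out-String in place of Out-File so nothing is written to disk; the `$row` object and its 300-character URL are made up for illustration.

```powershell
# Hypothetical record with a URL far longer than a typical console line.
$row = [pscustomobject]@{
    Time = '11/23/2018 17:05:10'
    User = 'NEON'
    Web  = 'www.' + ('x' * 300)
}

# Render the same table at two widths. Out-String accepts -Width just like Out-File.
$narrow = $row | Format-Table | Out-String -Width 80
$wide   = $row | Format-Table | Out-String -Width 10000

# Check whether the complete URL survived the formatting step.
$narrowHasFullUrl = $narrow.Contains($row.Web)
$wideHasFullUrl   = $wide.Contains($row.Web)

"80 columns:    full URL present = $narrowHasFullUrl"
"10000 columns: full URL present = $wideHasFullUrl"
```

The narrow rendering has no room for the URL on one line, so it gets shortened; the wide one should keep it intact.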
You should also consider using Import-Csv, Export-Csv and [PSCustomObject] to work with CSV files. That is easier than splitting text files apart.
((Get-Content C:\Users\user\Desktop\test\toto.txt -RAW) -split '\n(?=Time:)') | % {
    $x = $_ -split '\r'
    New-Object PSOBJECT -Property @{
        Time = [regex]::Match($x[0],'(?<=Time:\s*)\b.*\b')
        User = [regex]::Match($x[1],'(?<=User:\s*)\b.*\b')
        Web = [regex]::Match($x[2],'(?<=Site:\s*)\b.*\b')
    }
} | Export-Csv C:\Users\user\Desktop\test\result.txt -Delimiter ";" -NoTypeInformation
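If the goal is exactly the `datetime;$url` layout from the question, another option is to skip the table and CSV formatting and build the lines as plain strings. The following is only a sketch under some assumptions: the `;$` separator is taken literally, the timestamps are always in MM/dd/yyyy HH:mm:ss form, and a small inline sample (`$raw`, with a shortened URL) stands in for `Get-Content ... -Raw` on toto.txt.

```powershell
# Inline sample in the same shape as toto.txt (replace with Get-Content -Raw on the real file).
$raw = @'
Time: 11/23/2018 17:03:46
User: NEON
Web Site: https://www.seznam.cz
Top
Time: 11/23/2018 17:05:11
User: NEON
Web Site: www.autojournal.cz/stat-prodava/?utm_source=www.seznam.cz&utm_medium=sekce-z-internetu
Top
'@

# Culture-independent parsing and formatting of the timestamp.
$culture = [System.Globalization.CultureInfo]::InvariantCulture

$lines = foreach ($block in ($raw -split '\r?\n(?=Time:)')) {
    # Pull the two fields out of each record; \s*$ also drops a trailing carriage return.
    $time = [regex]::Match($block, '(?m)^Time:\s*(\S.*?)\s*$').Groups[1].Value
    $url  = [regex]::Match($block, '(?m)^Web Site:\s*(\S.*?)\s*$').Groups[1].Value
    if (-not $time -or -not $url) { continue }   # skip incomplete records

    # Parse the US-style date and re-emit it as yyyy-MM-dd HH:mm:ss.
    $stamp = [datetime]::ParseExact($time, 'MM/dd/yyyy HH:mm:ss', $culture)
    $stamp.ToString('yyyy-MM-dd HH:mm:ss', $culture) + ';$' + $url
}

$lines
# To write the file instead of printing:
# $lines | Set-Content C:\Users\user\Desktop\test\result2.txt
```

Because each line is already a string, no formatter gets a chance to shorten the URL, and the separate regex pass over the dates is no longer needed.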