I am trying to parse a number of log files using bash. The log files look like this:
"",1/8/2016
"Timezone",-6
"Serial No.","000001"
"Location:","LS_trap_2c"
"High temperature limit (°C)",-20
"Low temperature limit (°C)",-40
"Date - Time","Temperature (°C)"
"8/11/2015 12:00",28.0
"8/11/2015 14:00",28.5
"8/11/2015 16:00",24.0
"",1/8/2016
"Timezone",-6
"Serial No.","000002"
"Location:","LS_trap_2D"
"High temperature limit (°C)",-20
"Low temperature limit (°C)",-40
"Date - Time","Temperature (°C)"
"8/11/2015 12:00",28.0
"8/11/2015 14:00",28.5
I want to prepend the serial number and location (and, later, other fields) to each data line until the next header is reached, and then write the result to a master.csv file. The file should end up looking like this:
"",1/8/2016
"Timezone",-6
"Serial No.","000001"
"Location:","LS_trap_2c"
"High temperature limit (°C)",-20
"Low temperature limit (°C)",-40
"Date - Time","Temperature (°C)"
LS_trap_2c,000001,"8/11/2015 12:00",28.0
LS_trap_2c,000001,"8/11/2015 14:00",28.5
LS_trap_2c,000001,"8/11/2015 16:00",24.0
"",1/8/2016
"Timezone",-6
"Serial No.","000002"
"Location:","LS_trap_2D"
"High temperature limit (°C)",-20
"Low temperature limit (°C)",-40
"Date - Time","Temperature (°C)"
LS_trap_2D,000002,"8/11/2015 12:00",28.0
LS_trap_2D,000002,"8/11/2015 14:00",28.5
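Here is a question that helped me process a similar file with bash and sed:
Bash Append header information to each line of a file until next header found
This one-liner works well for finding a header, storing it in the hold space, and prepending it to each line: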
sed -r '/^"/h;//!{G;s/(.*)\n.*"(.*)"/\2,\1/}' fil.csv >masfil.csv
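This approach does not work for prepending more than one string, because I am not sure how to use multiple hold spaces in sed. I am also not sure that sed is the best tool for this. I am not very familiar with sed, so any pointers would be greatly appreciated.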
Answer 0 (score: 1)
awk to the rescue!
Assuming your data is consistent:
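awk -F, '/"Serial No."/ {sn = $2}     # remember the serial number
         /"Location:"/  {loc = $2}    # remember the location
         # data rows contain a quoted date and time: prepend both saved values
         /"([0-9]{1,2}\/){2}[0-9]{4} [0-9]{2}:[0-9]{2}"/ {$0 = loc FS sn FS $0}
         # the bare 1 below prints every line, modified or not
         1' file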
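When assigning sn and loc you can strip the quotes with gsub(/"/,"",$2), but since the remaining fields are quoted, you may prefer to leave them in. Note that changing $2 in place makes awk rebuild that header line with the output field separator (a space by default), so copy the field into a variable first if the header lines must stay unchanged.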
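As a rough sketch of the quote-stripping variant mentioned above: the field is copied into a variable before gsub is applied, so the header lines themselves are printed unchanged. The file name sample.csv, the trimmed sample data, and the simplified data-row test /^"[0-9]/ are assumptions made for this illustration, not part of the original answer.

```shell
#!/usr/bin/env bash
# Sketch only: sample.csv is a hypothetical file holding a trimmed copy
# of the log shown in the question.
cat > sample.csv <<'EOF'
"",1/8/2016
"Timezone",-6
"Serial No.","000001"
"Location:","LS_trap_2c"
"Date - Time","Temperature (°C)"
"8/11/2015 12:00",28.0
"8/11/2015 14:00",28.5
EOF

awk -F, '
    # Copy the value first, then strip quotes from the copy only.
    $1 == "\"Serial No.\"" { sn = $2;  gsub(/"/, "", sn)  }
    $1 == "\"Location:\""  { loc = $2; gsub(/"/, "", loc) }
    # Assumption: data rows are the lines starting with a quote and a digit.
    /^"[0-9]/              { $0 = loc FS sn FS $0 }
    { print }
' sample.csv > master.csv

cat master.csv
```

With the trimmed sample above, the data rows in master.csv take the form LS_trap_2c,000001,"8/11/2015 12:00",28.0, which is the layout requested in the question. Several log files can be passed as arguments in one call; sn and loc simply carry over until the next header overwrites them.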
Answer 1 (score: 0)
sed has only one hold space, but in this case you do not need more than one:
/^"Serial No."/ {       # If we are on the "Serial No." line...
    N                   # Append next line (the location) to pattern space
    h                   # Copy pattern space to hold space
    # Reduce the pattern space to "location,serial number," (the requested order)
    s/"[^"]*","([^"]*)"\n"[^"]*","([^"]*)"/\2,\1,/
    x                   # Swap: header lines back in pattern space, prefix in hold space
}
/^"[[:digit:]]/ {       # We are on a line where we want to prepend our data
    G                   # Append hold space to pattern space
    s/(.*)\n(.*)/\2\1/  # Move hold space content to front of pattern space
}
If this is stored in a file named sedscr.sed, it can be called like this:
sed -E -f sedscr.sed infile
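This removes the double quotes from the prepended values, as shown in the sample input/output. It also assumes that the lines to be prefixed are the ones with dates, i.e. the lines that start with a double quote followed by a digit.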
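For a quick end-to-end check, the following sketch writes a trimmed copy of the sample data to infile, stores a comment-free copy of the script in sedscr.sed, and redirects the result to master.csv. It assumes GNU sed (for \n inside the -E regex), and it emits the location before the serial number so that the rows follow the order requested in the question.

```shell
#!/usr/bin/env bash
# Sketch only: infile holds a trimmed copy of the sample log.
cat > infile <<'EOF'
"",1/8/2016
"Timezone",-6
"Serial No.","000001"
"Location:","LS_trap_2c"
"Date - Time","Temperature (°C)"
"8/11/2015 12:00",28.0
"8/11/2015 14:00",28.5
EOF

# The quoted heredoc delimiter keeps backslashes in the script intact.
cat > sedscr.sed <<'EOF'
/^"Serial No\."/ {
    N
    h
    s/"[^"]*","([^"]*)"\n"[^"]*","([^"]*)"/\2,\1,/
    x
}
/^"[[:digit:]]/ {
    G
    s/(.*)\n(.*)/\2\1/
}
EOF

# Header lines pass through unchanged; data rows get the saved prefix.
sed -E -f sedscr.sed infile > master.csv

cat master.csv
```

Each additional header field would need another N and another capture group in the first substitution, so this approach relies on the header lines always appearing in the same order.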