I have a very large input set that looks like this:
Label: foo, Other text: text description...
<insert label> Item: item description...
<insert label> Item: item description...
Label: bar, Other text:...
<insert label> Item:...
Label: baz, Other text:...
<insert label> Item:...
<insert label> Item:...
<insert label> Item:...
...
I want to transform it by pulling out each label name (e.g. "foo") and replacing the "<insert label>" token on the lines that follow with the actual label:
Label: foo, Other text: text description...
foo Item: item description...
foo Item: item description...
Label: bar, Other text:...
bar Item:...
Label: baz, Other text:...
baz Item:...
baz Item:...
baz Item:...
...
Can this be done with sed, awk, or another Unix tool? If so, how?
Answer 0 (score: 5)
Here is my label.awk file:
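/^Label:/ {
    # Remember the label: the second field, minus its trailing comma.
    label = $2
    sub(/,$/, "", label)
}
/<insert label>/ {
    # Replace the placeholder with the most recent label.
    sub(/<insert label>/, label)
}
# Print every line, modified or not.
1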
To invoke it:
awk -f label.awk data.txt
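The script above takes the label from the second whitespace-separated field, so it assumes single-word labels, and sub() treats "&" in the replacement text as "the matched text". If labels might contain spaces or "&" (an assumption, since the question does not say), a variant along these lines may be safer. The function name relabel is hypothetical and exists only for illustration.

```shell
# Hypothetical helper: reads the files given as arguments (or stdin)
# and writes the relabelled text to stdout.
relabel() {
    awk '
    /^Label: / {
        label = $0
        sub(/^Label: /, "", label)   # drop the "Label: " prefix
        sub(/,.*/, "", label)        # drop everything from the first comma on
    }
    {
        # Splice by position instead of using sub(), so that "&" in a
        # label is copied literally.
        i = index($0, "<insert label>")
        if (i > 0)
            $0 = substr($0, 1, i - 1) label substr($0, i + length("<insert label>"))
        print
    }' "$@"
}
```

It could be used as: relabel data.txt > relabelled.txt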
Answer 1 (score: 2)
You can use awk like this:
awk '$1=="Label:" {label=$2; sub(/,$/, "", label)}
$1=="<insert" && $2=="label>" {sub(/<insert label>/, label)}
{print}' file
Answer 2 (score: 2)
One solution using sed:
Contents of script.sed:
## When the line begins with the 'Label' string.
/^Label/ {
## Save the line to the 'hold space'.
h
## Keep only the label name: the text between the first space
## and the first comma.
s/^[^ ]* \([^,]*\).*$/\1/
## Exchange contents: the label name goes to the 'hold space'
## and the original line comes back to the pattern space.
x
## Print and read the next line.
b
}
###--- Commented this wrong behaviour ---###
#--- G
#--- s/<[^>]*>\(.*\)\n\(.*\)$/\2\1/
###--- And fixed with this ---###
## When the line contains '<insert label>'
/<insert label>/ {
## Append the label name to the line.
G
## And substitute the '<insert label>' string with it.
s/<insert label>\(.*\)\n\(.*\)$/\2\1/
}
Contents of infile:
Label: foo, Other text: text description...
<insert label> Item: item description...
<insert label> Item: item description...
Label: bar, Other text:...
<insert label> Item:...
Label: baz, Other text:...
<insert label> Item:...
<insert label> Item:...
<insert label> Item:...
Run it like this:
sed -f script.sed infile
Result:
Label: foo, Other text: text description...
foo Item: item description...
foo Item: item description...
Label: bar, Other text:...
bar Item:...
Label: baz, Other text:...
baz Item:...
baz Item:...
baz Item:...
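For a quick one-off, the same hold-space idea can also be given directly on the command line instead of in a script file. This is only a sketch: it assumes each label is the text between "Label: " and the first comma, and the function name relabel_sed is hypothetical.

```shell
# Hypothetical wrapper around an inline version of the sed approach.
relabel_sed() {
    sed -e '/^Label: /{' \
        -e 'h' \
        -e 'x' \
        -e 's/^Label: \([^,]*\).*/\1/' \
        -e 'x' \
        -e '}' \
        -e '/<insert label>/{' \
        -e 'G' \
        -e 's/<insert label>\(.*\)\n\(.*\)/\2\1/' \
        -e '}' "$@"
}
```

On a Label line, h copies the line to the hold space, the first x swaps so the hold-space copy can be reduced to just the label name, and the second x swaps back so the original line is printed unchanged. Because the label reaches the placeholder line through G rather than through replacement text, characters such as "&" in a label need no escaping. Usage: relabel_sed infile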