I have a file:
@Book{gjn2011ske,
author = {Grzegorz J. Nalepa},
title = {Semantic Knowledge Engineering. A Rule-Based Approach},
publisher = {Wydawnictwa AGH},
year = 2011,
address = {Krak\'ow}
}
@article{gjn2010jucs,
Author = {Grzegorz J. Nalepa},
Journal = {Journal of Universal Computer Science},
Number = 7,
Pages = {1006-1023},
Title = {Collective Knowledge Engineering with Semantic Wikis},
Volume = 16,
Year = 2010
}
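I want to improve my regular expression so that it deletes only the first line of each record. Note: the record separator RS="}\n" cannot be changed.
I tried: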
awk 'BEGIN{ RS="}\n" } {gsub(/@.*,/,"") ; print }' file
I want to print this result:
author = {Grzegorz J. Nalepa},
title = {Semantic Knowledge Engineering. A Rule-Based Approach},
publisher = {Wydawnictwa AGH},
year = 2011,
address = {Krak\'ow}
Author = {Grzegorz J. Nalepa},
Journal = {Journal of Universal Computer Science},
Number = 7,
Pages = {1006-1023},
Title = {Collective Knowledge Engineering with Semantic Wikis},
Volume = 16,
Year = 2010
Thanks for your help.
Edit
My proposed solution:
awk 'BEGIN{ RS="}\n" }{sub(",","@"); sub(/@.*@/,""); print }' file
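Answer 0 (score: 2)
With the specified RS setting it is hard to do what you want, because the line address = {Krak\'ow} also ends in }\n and therefore produces an extra record terminator. I would rather go with: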
awk '$0 !~ "^@" && $0 !~ "^} *$" { print }' FILE
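To make that extra record boundary visible, here is a minimal sketch. The file name sample.bib is hypothetical and holds a shortened copy of the first entry; an awk that accepts a multi-character RS (such as gawk or mawk) is assumed.

```shell
# Hypothetical sample file: a shortened copy of the first entry.
cat > sample.bib <<'EOF'
@Book{gjn2011ske,
year = 2011,
address = {Krak\'ow}
}
EOF

# Wrap every record in [ ] so the record boundaries become visible.
awk 'BEGIN { RS = "}\n" } { printf "[%s]\n", $0 }' sample.bib

# Count the records that awk sees for this single entry.
records=$(awk 'BEGIN { RS = "}\n" } END { print NR }' sample.bib)
echo "records: $records"
```

The first record stops right after Krak\'ow, because the closing brace of that field is taken as the record terminator, and the entry's own closing brace then yields a second, empty record.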
Edit: I don't know why it has to be a regex solution, can you explain?
Anyway, here is another working solution that uses a regex, although it is not the one you were expecting:
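awk 'BEGIN{ RS="}\n" }
{
# split the record into lines; n is the number of lines
n = split($0,a,"\n")
for (e=1;e<=n;e++) {
# RS consumed the closing brace of a line such as address: put it back
if (a[e] ~ "{" && a[e] !~ "}") {
sub("$","}",a[e])
}
# print only key = value lines, which skips the @type{key, line
if (a[e] ~ "=") { print a[e] }
}
printf("\n")
}' INPUTFILE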
There is also a simpler regex, but it fails: the closing } of the address line is removed by RS, and a stray } is printed at the end...
awk 'BEGIN{ RS="}\n" }
{
sub(/@[^,]+,/,"")
print $0
}' INPUTFILE
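One possible way to keep RS="}\n" and still restore the brace that RS consumes is sketched below. It is not taken from the answers above: it assumes an awk with multi-character RS support (gawk or mawk), an input that ends with a newline, and a hypothetical file sample.bib containing shortened copies of the two entries.

```shell
# Hypothetical sample file: shortened copies of the two entries.
cat > sample.bib <<'EOF'
@Book{gjn2011ske,
author = {Grzegorz J. Nalepa},
year = 2011,
address = {Krak\'ow}
}
@article{gjn2010jucs,
Author = {Grzegorz J. Nalepa},
Year = 2010
}
EOF

result=$(awk 'BEGIN { RS = "}\n" }
$0 != "" {                      # skip the empty record that follows address
    sub(/^@[^,]*,\n/, "")       # drop only the leading @type{key, line
    printf "%s", $0
    if ($0 !~ /\n$/) print "}"  # RS consumed the closing brace of a field: restore it
}' sample.bib)
printf '%s\n' "$result"
```

The idea is that a record ending in a newline was terminated by the entry's own closing brace, while a record ending mid-line lost the brace of its last field, so only the latter gets a } appended.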
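Answer 1 (score: 2)
One way that does not use a regular expression. Set the field separator to a newline, so that each key of the record becomes a field. Then loop over the fields and print those that do not start with @: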
awk '
BEGIN {
RS="}\n";
FS=OFS="\n";
}
{
for (i=1; i<=NF; i++) {
if ( substr($i, 1, 1) != "@" ) {
printf "%s%s", $i, (i == NF) ? RS : OFS;
}
}
}
' file
Output:
author = {Grzegorz J. Nalepa},
title = {Semantic Knowledge Engineering. A Rule-Based Approach},
publisher = {Wydawnictwa AGH},
year = 2011,
address = {Krak\'ow}
Author = {Grzegorz J. Nalepa},
Journal = {Journal of Universal Computer Science},
Number = 7,
Pages = {1006-1023},
Title = {Collective Knowledge Engineering with Semantic Wikis},
Volume = 16,
Year = 2010
Answer 2 (score: 2)
I would use GNU sed for this:
sed '/^@/,/^}$/ { //d }' file.txt
Result:
author = {Grzegorz J. Nalepa},
title = {Semantic Knowledge Engineering. A Rule-Based Approach},
publisher = {Wydawnictwa AGH},
year = 2011,
address = {Krak\'ow}
Author = {Grzegorz J. Nalepa},
Journal = {Journal of Universal Computer Science},
Number = 7,
Pages = {1006-1023},
Title = {Collective Knowledge Engineering with Semantic Wikis},
Volume = 16,
Year = 2010
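Note that you can use the -i flag to make the changes in place (i.e. overwrite the file contents), and the -s flag to treat several input files as separate files rather than as one continuous stream. For example: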
sed -s -i '/^@/,/^}$/ { //d }' *.txt
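In the command above, the empty pattern // reuses the most recently applied regular expression, which is why only the first and last line of each range are deleted. A more explicit sketch that avoids that feature is shown below; it assumes, as in the sample data, that no field line starts with @ or consists of a lone }, and sample.bib is a hypothetical file name.

```shell
# Hypothetical sample file: shortened copies of the two entries.
cat > sample.bib <<'EOF'
@Book{gjn2011ske,
author = {Grzegorz J. Nalepa},
year = 2011,
address = {Krak\'ow}
}
@article{gjn2010jucs,
Author = {Grzegorz J. Nalepa},
Year = 2010
}
EOF

# Delete every line that starts with @ and every line that is exactly }.
result=$(sed -e '/^@/d' -e '/^}$/d' sample.bib)
printf '%s\n' "$result"
```

Unlike the range form, this variant does not check that a } line actually closes an entry opened by an @ line, so it is only equivalent for well-formed input.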
Answer 3 (score: 1)
awk '{if($0!~/@/&&$0!~/^}/)print}' temp
Tested as follows:
> awk '{if($0!~/@/&&$0!~/^}/)print}' temp
author = {Grzegorz J. Nalepa},
title = {Semantic Knowledge Engineering. A Rule-Based Approach},
publisher = {Wydawnictwa AGH},
year = 2011,
address = {Krak\'ow}
Author = {Grzegorz J. Nalepa},
Journal = {Journal of Universal Computer Science},
Number = 7,
Pages = {1006-1023},
Title = {Collective Knowledge Engineering with Semantic Wikis},
Volume = 16,
Year = 2010
>
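Because /@/ is not anchored, the command above would also drop any field line that merely contains an @. The sketch below compares it with an anchored variant; the note field and the address someone@example.org are made up for illustration, and sample.bib is a hypothetical file name.

```shell
# Hypothetical sample entry with an invented field that contains an @.
cat > sample.bib <<'EOF'
@Book{gjn2011ske,
author = {Grzegorz J. Nalepa},
note = {contact: someone@example.org},
address = {Krak\'ow}
}
EOF

# Unanchored test from the answer: the note line is lost as well.
loose=$(awk '{if($0!~/@/&&$0!~/^}/)print}' sample.bib)

# Anchored variant: skip only lines starting with @ and lines holding a lone }.
strict=$(awk '!/^@/ && !/^} *$/' sample.bib)

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

With this input the unanchored version keeps only the author and address lines, while the anchored one also keeps the note line.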