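I have a BibTeX file that is a merge of several other .bib files. During the merge, all but one copy of each duplicated entry was commented out, so every case with duplicates looks like the example below. Some entries have 20 to 30 commented-out copies, which makes a file of 100 references about 30k lines long.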
@Article{goodnight2005,
author = {Goodnight, N. and Wang, R. and Humphreys, G.},
journal = {{IEEE Computer Graphics and Applications}},
title = {{Computation on programmable graphics hardware}},
year = {2005},
volume = {25},
number = {5},
pages = {12-15}
}
###Article{goodnight2005,
author = {Goodnight, N. and Wang, R. and Humphreys, G.},
journal = {{IEEE Computer Graphics and Applications}},
title = {{Computation on programmable graphics hardware}},
year = {2005},
volume = {25},
number = {5},
pages = {12-15}
}
@INPROCEEDINGS{Llosa-pact96,
author = {Josep Llosa and Antonio González and Eduard Ayguadé and Mateo Valero},
title = {Swing Modulo Scheduling: A Lifetime-Sensitive Approach},
booktitle = {In IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques (PACT'96},
year = {1996},
pages = {80--86}
}
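How can I delete everything from a line starting with ### up to, but not including, the next line starting with @? Essentially, my resulting file should be: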
@Article{goodnight2005,
author = {Goodnight, N. and Wang, R. and Humphreys, G.},
journal = {{IEEE Computer Graphics and Applications}},
title = {{Computation on programmable graphics hardware}},
year = {2005},
volume = {25},
number = {5},
pages = {12-15}
}
@INPROCEEDINGS{Llosa-pact96,
author = {Josep Llosa and Antonio González and Eduard Ayguadé and Mateo Valero},
title = {Swing Modulo Scheduling: A Lifetime-Sensitive Approach},
booktitle = {In IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques (PACT'96},
year = {1996},
pages = {80--86}
}
For example, sed '/###/,/@/{//!d}' bibliography.bib keeps the lines starting with ###, while sed '/###/,/@/d' bibliography.bib also removes the line starting with @.
Thank you very much for your help.
Answer 0 (score: 2)
A simple solution using a $skip sentinel value:
use strict;
use warnings;
my $skip = 0;
while ( <> ) {
$skip = 1 if /^###/;
$skip = 0 if /^@/;
next if $skip;
print;
}
Output:
[hmcmillen]$ perl test.pl < test.txt
@Article{goodnight2005,
author = {Goodnight, N. and Wang, R. and Humphreys, G.},
journal = {{IEEE Computer Graphics and Applications}},
title = {{Computation on programmable graphics hardware}},
year = {2005},
volume = {25},
number = {5},
pages = {12-15}
}
@INPROCEEDINGS{Llosa-pact96,
author = {Josep Llosa and Antonio González and Eduard Ayguadé and Mateo Valero},
title = {Swing Modulo Scheduling: A Lifetime-Sensitive Approach},
booktitle = {In IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques (PACT'96},
year = {1996},
pages = {80--86}
}
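A quick sanity check on the result can be sketched with standard tools. Removing the commented copies should leave the number of live entries (lines starting with @) unchanged and leave no ### lines behind. The file names merged.bib and clean.bib are hypothetical, and the filter here is an awk transliteration of the $skip logic above, used only so the sketch does not depend on Perl.

```shell
#!/bin/sh
# Hypothetical scratch files for the check; not names from the question.
work=$(mktemp -d)
cat > "$work/merged.bib" <<'EOF'
@Article{a,
  year = {2005}
}
###Article{a,
  year = {2005}
}
@Book{b,
  year = {1996}
}
EOF

# Same idea as the $skip script: start skipping at ###, stop at the next @.
awk '/^###/{skip=1} /^@/{skip=0} !skip' "$work/merged.bib" > "$work/clean.bib"

live_before=$(grep -c '^@' "$work/merged.bib")
live_after=$(grep -c '^@' "$work/clean.bib")
# grep -c exits non-zero when it counts zero matches, so tolerate that status.
commented_after=$(grep -c '^###' "$work/clean.bib" || true)

echo "live entries: $live_before before, $live_after after"
echo "commented lines left: $commented_after"
rm -rf "$work"
```

If the two live-entry counts differ, or any ### lines remain, the filter did something other than what the question asks for.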
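If you really want it as a one-liner (note that this version starts with $SKIP set to 1, so it also drops anything that comes before the first @ line):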
perl -ne 'BEGIN { $SKIP = 1 } $SKIP = 1 if /^###/; $SKIP = 0 if /^@/; print unless $SKIP;' < test.txt
Answer 1 (score: 1)
Assuming your input files are all the *.bib files in the current directory or anywhere below it.
Be your own find/perl wizard for the day:
find . -name '*.bib' -exec \
perl -i -ne '$o=1if/^@/;$o=0if/^###/;print if$o' \{} \;
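If you can't read this, don't use it. For example, it deletes anything before the first @ line, and it does not account for indented @ or ### lines.
There is also a nice module called File::Find; read all about it with perldoc File::Find. Personally, I wouldn't do this as a one-liner.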
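The two caveats above (content before the first @ line is lost, indented markers are ignored) can be addressed in a slightly longer form. This is only a sketch under assumptions: the function name strip_commented is made up for illustration, indentation is assumed to be spaces or tabs, and awk is used instead of Perl.

```shell
#!/bin/sh
# Hypothetical helper: strip commented-out BibTeX entries from stdin.
# Assumes a commented entry starts with optional blanks plus "###" and
# runs until the next line starting with optional blanks plus "@".
strip_commented() {
    awk '
        BEGIN              { keep = 1 }   # keep any preamble before the first entry
        /^[ \t]*###/       { keep = 0 }   # commented entry begins: stop printing
        /^[ \t]*@/         { keep = 1 }   # live entry begins: resume printing
        keep                              # print only while keep is non-zero
    '
}

# Made-up sample with a preamble line and indented markers.
sample='% merged bibliography
  @Article{a,
  year = {2005}
}
  ###Article{a,
  year = {2005}
}
@Book{b,
  year = {1996}
}'

result=$(printf '%s\n' "$sample" | strip_commented)
printf '%s\n' "$result"
```

Starting with keep set to 1 is the design choice that preserves a preamble; starting at 0 would reproduce the behaviour of the one-liner above.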
Answer 2 (score: 0)
Using awk:
$ awk '/^###/{p=0} /^@/{p=1} p' bib.text
@Article{goodnight2005,
author = {Goodnight, N. and Wang, R. and Humphreys, G.},
journal = {{IEEE Computer Graphics and Applications}},
title = {{Computation on programmable graphics hardware}},
year = {2005},
volume = {25},
number = {5},
pages = {12-15}
}
@INPROCEEDINGS{Llosa-pact96,
author = {Josep Llosa and Antonio González and Eduard Ayguadé and Mateo Valero},
title = {Swing Modulo Scheduling: A Lifetime-Sensitive Approach},
booktitle = {In IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques (PACT'96},
year = {1996},
pages = {80--86}
}
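For completeness, the sed attempt from the question can probably be made to work by naming the end pattern explicitly instead of using the empty regex. As far as I understand sed, // reuses the last regex that was applied; on the first line of the range that is the ### pattern, which is why that line survived. A sketch, with the function name strip_with_sed and the sample data made up for illustration:

```shell
#!/bin/sh
# Delete from each line starting with "###" up to, but not including,
# the next line starting with "@". The inner /^@/!d spares the range's
# closing "@" line; the ";" before "}" keeps BSD sed happy as well.
strip_with_sed() {
    sed '/^###/,/^@/{/^@/!d;}' "$@"
}

# Made-up sample with two consecutive commented copies.
sample='@Article{a,
  year = {2005}
}
###Article{a,
  year = {2005}
}
###Article{a,
  year = {2005}
}
@Book{b,
  year = {1996}
}'

sed_result=$(printf '%s\n' "$sample" | strip_with_sed)
printf '%s\n' "$sed_result"
```

Called with a file name, for example strip_with_sed bibliography.bib, it writes the cleaned file to standard output and leaves the original untouched.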