As part of a larger script, I use a series of perl -pi commands to get rid of various artifacts and misspellings in LaTeX.
An excerpt follows:
perl -pi -e "s/”/\''/g" *.txt
perl -pi -e "s/“/\`\`/g" *.txt
perl -pi -e "s/,/, /g" *.txt
perl -pi -e "s/ ,/,/g" *.txt
perl -pi -e "s/ !/!/g" *.txt
perl -pi -e "s/\&/ and /g" *.txt
perl -pi -e "s/\n/\n\n/g" *.txt
perl -pi -e "s/\\\\em/\\\\em /g" *.txt
perl -pi -e "s/’/'/g" *.txt
perl -pi -e "s/\*\*\*/\\\\split/g" *.txt
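There are about 50-80 *.txt files, and this snippet takes quite a long time to run. I suspect that putting the set into a proper Perl script would be more efficient. My question is: which approach in Perl gives the fastest execution time for a set of simple substitutions?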
Answer 0 (score: 3)
Start by running all the substitutions in a single perl invocation, so that each file is read and rewritten only once:
perl -i -pe'
s/”/\x27\x27/g;
s/“/``/g;
s/,/, /g;
...
' *.txt
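However, that still scans every line once per substitution. The following avoids that by matching all the literals with a single pattern: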
perl -i -pe'
BEGIN {
%tr = (
"”" => "\x27\x27",
"“" => "``",
"," => ", ",
...
);
$pat = join "|", map quotemeta, keys(%tr);
}
s/($pat)/$tr{$1}/g;
' *.txt
Answer 1 (score: 2)
You may want to perform the substitutions in one pass instead of ten. Put them in a script file:
script.pl
s/”/\''/g;
s/“/\`\`/g;
s/,/, /g;
s/ ,/,/g;
s/ !/!/g;
s/\&/ and /g;
s/\n/\n\n/g;
# No shell quoting layer in a script file, so a literal backslash is written \\ rather than \\\\.
s/\\em/\\em /g;
s/’/'/g;
s/\*\*\*/\\split/g;
Then run the script:
perl -pi script.pl *.txt
Answer 2 (score: 1)
Simply put all the substitutions on one line:
perl -pi -e "s/”/\''/g; s/“/\`\`/g; s/,/, /g; s/ ,/,/g; s/ !/!/g; s/\&/ and /g; s/\n/\n\n/g; s/\\\\em/\\\\em /g; s/’/'/g; s/\*\*\*/\\\\split/g" *.txt
Reading, writing, and parsing each file only once will certainly be much faster than doing it multiple times.