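I want to use sed to delete all comments from a text file. Assume a comment starts with the "%" character and ends with a newline character. I want to delete everything from the "%" to the end of the line, including the newline. However, I do not want to delete comments that start with "%%".
Sample input: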
%% comment to do not delete
% comment to delete
% another comment to delte
%% comment to do not delete
Some text % comment to delete
and some more text %% comment to do not delete
Desired output:
%% comment to do not delete
%% comment to do not delete
Some text and some more text %% comment to do not delete
Answer 0 (score: 2)
Try this:
$ perl -pe '/^[^%]*%%/ && next; s/%.*\n//g' file.txt
%% comment to do not delete
%% comment to do not delete
Some text and some more text %% comment to do not delete
If you need to change the file in place, add the -i switch (after testing), like so:
$ perl -i -pe '/^[^%]*%%/ && next; s/%.*\n//g' file.txt
Thanks to scrutinizer for the contribution.
Answer 1 (score: 2)
A perfect application for perl's negative look-behind (and look-ahead) assertions:
perl -pe 's/(?<!%)%(?!%).*$//s' << END
%% comment to do not delete
% comment to delete
% another comment to delte
%% comment to do not delete
Some text % comment to delete
and some more text %% comment to do not delete
END
Output:
%% comment to do not delete
%% comment to do not delete
Some text and some more text %% comment to do not delete
The s flag ensures that the dot also matches the newline, which gives the "line joining" that was asked for.
Be aware that this kind of regex matching can cause problems. For example, a line like
The date is `date +%Y%m%d` % this is a comment
ends up as
The date is `date +
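If your actual comments always have whitespace around the comment character, you can use this regex instead:
(^| )%( .*|)$
That is: the start of the line or a space, then "%", then either a space followed by the rest of the line, or nothing at all.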
Answer 2 (score: 1)
Maybe this:
Second update:
$ sed -e '/^%[^%]/d' -e 's/ %[^%]*$/@/' -e :a -e '/@/N; s/\n//; ta' input | sed 's/@/ /g'
%% comment to do not delete
%% comment to do not delete
Some text and some more text %% comment to do not delete
Answer 3 (score: 0)
Edited to add a change so that it works correctly on the last line of the file... Try:
sed -e :a -e '/^[^%]*%%/n; /%/{s/%.*//; N; s/\n//;};ta' file
Tested with this input:
%% comment to do not delete
% comment to delete
% another comment to delte
%
%% comment to do not delete
Some text % comment to delete
Some more text % more comment to delete
and some more text %% comment to do not delete
fdgdfgdgdgd %
gfdgd
some text followed by %% comment to not delete that contains a % somewhere
some text followed by % comment to delete that contains %% somewhere
hello there
Output:
%% comment to do not delete
%% comment to do not delete
Some text Some more text and some more text %% comment to do not delete
fdgdfgdgdgd gfdgd
some text followed by %% comment to not delete that contains a % somewhere
some text followed by hello there
Answer 4 (score: 0)
With sed, the order of the instructions can matter. For example:
$ sed -ne '/^% /d; /[^%]%.*/ {s/%.*//; n}; p' /tmp/corpus
%% comment to do not delete
%% comment to do not delete
and some more text %% comment to do not delete
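In this example, the sed script performs its tasks in this order:
1. Delete any line that starts with a single "%" followed by a space.
2. On a line where a "%" follows some other character, strip everything from the first "%" to the end of the line, then read the next input line into the pattern space.
3. Print the pattern space.
This script works with the corpus you provided in your question. It is not guaranteed to work with any other corpus without modification, and it obviously will not work if the line you append to the pattern space contains a comment character.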
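For comparison, here is a sketch of a variant that appends the following line with N and removes the embedded newline, so the stripped line is joined to the next one as in the question's desired output. This is an assumption about what is wanted, not the script above, and the corpus is simply the question's sample input fed through a here-document.

```shell
# Variant sketch: strip the comment first, then pull in the next line (N)
# and delete the newline between the two.
out=$(sed -e '/^% /d' -e '/[^%]%/{' -e 's/%.*//' -e 'N' -e 's/\n//' -e '}' <<'EOF'
%% comment to do not delete
% comment to delete
% another comment to delte
%% comment to do not delete
Some text % comment to delete
and some more text %% comment to do not delete
EOF
)

printf '%s\n' "$out"
```

The same caveat applies: the appended line is never re-checked, so a comment to delete on that line would survive, and a lone "%%" comment after text on a line of its own would be stripped by the second rule.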