Question

I have a file with lines such as:

I want a lot <*tag 1> more <*tag 2>*cheese *cakes.

I want to delete the * characters inside <> but not those outside. The tags may be more complex than the ones above, for example <*better *tag 1>.

I tried /\bregex\b/s/\*//g, which works for tag 1 but not for tag 2. How can I make it work for tag 2 as well?

Thanks a lot.

Answer 1

A simple solution, if there is only one asterisk in each tag:

sed 's/<\([^>]*\)\*\([^>]*\)>/<\1\2>/g'
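To see where the single-pass command stops, here is a small sketch. The helper name strip_one_star and the sample input are made up for illustration. Because the first [^>]* is greedy, each tag loses only its last asterisk in one pass, so a tag with two asterisks still keeps one.

```shell
# Hypothetical helper wrapping the single-pass substitution shown above.
strip_one_star() {
  sed 's/<\([^>]*\)\*\([^>]*\)>/<\1\2>/g'
}

# The first tag is cleaned completely; the second tag keeps its first asterisk.
printf '%s\n' 'a <*tag 1> b <*better *tag 1> *c' | strip_one_star
# prints: a <tag 1> b <*better tag 1> *c
```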

If a tag can contain more than one asterisk, you can use sed's branch-to-label mechanism to repeat the substitution:

sed ':doagain s/<\([^>]*\)\*\([^>]*\)>/<\1\2>/g; t doagain'

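Here doagain is the loop label, and t doagain is a conditional jump back to that label, so the substitution is repeated until it no longer matches. See the sed manual: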

t label

 Branch to label only if there has been a successful substitution since the last 
 input line was read or conditional branch was taken. The label may be omitted, in 
 which case the next cycle is started.
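The loop can also be written with one -e expression per command. This is a sketch under the assumption that you may need to run it on a sed that reads everything after ":" up to the end of the line as the label name (BSD sed is commonly reported to behave this way); the helper name strip_tag_stars is made up for illustration.

```shell
# Hypothetical helper: the same loop, with the label, the substitution and the
# conditional branch passed as separate -e expressions.
strip_tag_stars() {
  sed -e ':doagain' \
      -e 's/<\([^>]*\)\*\([^>]*\)>/<\1\2>/g' \
      -e 't doagain'
}

# Each pass removes one asterisk per tag; "t" jumps back while a pass still substitutes.
printf '%s\n' 'I want a lot <*tag 1> more <*better *tag X*>*cheese *cakes.' | strip_tag_stars
# prints: I want a lot <tag 1> more <better tag X>*cheese *cakes.
```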

Answer 2

The obligatory Perl solution:

perl -pe 'my $i = 0;  # reset the piece counter for every input line
        $_ = join "",
        map +($i++ % 2 == 0 ? $_ : s/\*//gr),
        split /(<[^>]+>)/, $_;' FILE

Addition:

perl -pe 's/(<[^>]+>)/$1 =~ s(\*)()gr/ge' FILE

Answer 3

awk can solve your problem (this uses the four-argument split, which is a GNU awk extension):

awk '{r="";x=split($0,a,/<[^>]*>/,s);for(i in s)gsub(/\*/,"",s[i]);for(j=1;j<=x;j++)r=r a[j] s[j]; print r}' file

A more readable version:

 awk '{r=""                             # start each line with an empty result
       x=split($0,a,/<[^>]*>/,s)        # a: text outside tags, s: the tags
       for(i in s)gsub(/\*/,"",s[i])    # strip asterisks inside tags only
       for(j=1;j<=x;j++)r=r a[j] s[j]
       print r}' file

Test with your data:

kent$  cat file
I want a lot <*tag 1> more <*tag 2>*cheese *cakes. <*better *tag X*>

kent$  awk '{r="";x=split($0,a,/<[^>]*>/,s);for(i in s)gsub(/\*/,"",s[i]);for(j=1;j<=x;j++)r=r a[j] s[j]; print r}' file
I want a lot <tag 1> more <tag 2>*cheese *cakes. <better tag X>
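
The one-liner above relies on the four-argument split, which is a GNU awk extension. If only a POSIX awk is available, the same idea can be sketched with match(), RSTART and RLENGTH. This is an alternative technique rather than the method of the answer above, and the helper name strip_tag_stars_awk is made up for illustration.

```shell
# Hypothetical helper: walk through each line tag by tag using POSIX awk only.
strip_tag_stars_awk() {
  awk '{
    out = ""
    rest = $0
    # Repeatedly locate the next <...> tag in the unprocessed remainder.
    while (match(rest, /<[^>]*>/)) {
      tag = substr(rest, RSTART, RLENGTH)
      gsub(/\*/, "", tag)                        # strip asterisks inside the tag only
      out = out substr(rest, 1, RSTART - 1) tag  # text before the tag is copied untouched
      rest = substr(rest, RSTART + RLENGTH)
    }
    print out rest
  }'
}

printf '%s\n' 'I want a lot <*tag 1> more <*tag 2>*cheese *cakes.' \
              '<*better *tag X*> *outside' | strip_tag_stars_awk
# prints:
# I want a lot <tag 1> more <tag 2>*cheese *cakes.
# <better tag X> *outside
```

Because out and rest are reassigned at the start of every record, nothing carries over from one input line to the next.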

Replace/remove special characters in a matched string in sed
