Question

I'm trying to do something like this, but for quoted emails, so that this

On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:                                                                                                                                                                                                                                                       
>Hi Everyone,                                                                                                                                                                                                                                                                                                                 
>                                                                                                                                                                                                                                                                                                                             
>                                                                                                                                                                                                                                                                                                                              
>                                                    
>I love spaces.
>                                                                                                                                                                                                                                                                                                                             
>                                                                                                                                                                                                                                                                                                                          
>                                                                                                                                                                                                                                                                                                                          
>That's all.

would become this

On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:                                                                                                                                                                                                                                                       
>Hi Everyone,                                                                                                                                                                                                                                                                                                                 
>                                                                                                                                                                                                                                                                                                                             
>I love spaces.
>                                                                                                                                                                                                                                                                     
>That's all.

Thanks.

Answer 1

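Assuming each visual line is a proper logical line (a string terminated by \n), you can skip the other tools and simply run uniq(1) on the input.

An example follows.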

% cat tst
>Hi Everyone,
>
>
>
>I love spaces.
>
>
>
>That's all.

% uniq tst
>Hi Everyone,
>
>I love spaces.
>
>That's all.
%
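
One caveat worth making concrete: uniq only merges adjacent lines that are identical, so ">" lines carrying different amounts of trailing whitespace (as in the question's sample) would not collapse. A minimal sketch, assuming it is acceptable to strip trailing whitespace first; the function name squeeze_quoted and the sample text are made up for illustration:

```shell
# Hypothetical helper: strip trailing whitespace from every line,
# then let uniq merge adjacent identical lines.
squeeze_quoted() {
    sed 's/[[:space:]]*$//' | uniq
}

# Sample input: three ">" lines that differ only in trailing spaces.
printf '%s\n' '>Hi Everyone,' '>   ' '>' '> ' '>I love spaces.' | squeeze_quoted
```

Keep in mind that uniq also merges adjacent identical text lines, not only the ">" ones.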

Answer 2

Try this:

sed -r '/^>\s*$/{N;/^>\s*\n>\s*$/D}'

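Here is the explanation:

Commands used:

N appends the next line of input to the pattern space.
D deletes up to the first embedded newline in the pattern space, then starts the next cycle, but skips reading from the input if the pattern space still contains data.

Patterns used:

/^>\s*$/ matches a line consisting of '>' followed by zero or more whitespace characters
/^>\s*\n>\s*$/ matches two consecutive lines, each consisting of '>' followed by zero or more whitespace characters (this is what the pattern space holds after N)

So the workflow of the sed command above is:

1. Read a line into the pattern space (exit at end of file).
2. If the pattern space contains only '>' (plus optional whitespace), go to step 4; otherwise go to step 3.
3. Print the contents of the pattern space and go to step 1.
4. Append '\n' and the next line to the pattern space. If the pattern space now contains only '>\n>' (meaning we have met two consecutive '>' lines), go to step 5; otherwise go to step 3.
5. Delete everything up to and including the '\n', then go to step 2.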
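
To see the workflow in action, here is a small sketch that wraps the same N/D logic in a function and feeds it a sample. It swaps \s for the POSIX class [[:space:]] as a portability assumption; the function name squeeze_sed and the sample text are illustrative only.

```shell
# Hypothetical wrapper around the N/D approach described above.
squeeze_sed() {
    sed '/^>[[:space:]]*$/{
        N
        /^>[[:space:]]*\n>[[:space:]]*$/D
    }'
}

# Two runs of "blank" quoted lines, some with trailing spaces.
printf '%s\n' '>Hi Everyone,' '>   ' '> ' '>' \
              '>I love spaces.' '>  ' '>' ">That's all." | squeeze_sed
```

Because D discards the earlier line of each pair, the line that survives from a run is the last one, so any trailing spaces on that particular line are kept.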

Answer 3

sed '/^>\s\s*$/d;$b;/^[^>]/b;a>'  input

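This means:

/^>\s\s*$/d: delete every line consisting of > followed by one or more whitespace characters.

$b;/^[^>]/b: branch to the end of the script (print the line and skip the remaining commands) for the last line and for lines that do not start with >.

a>: append a > line after every other line.

This gives: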

On 2014-07-11 at 03:36 PM, <ilovespaces@email.com> wrote:
>Hi Everyone,
>
>I love spaces.
>
>That's all.

Answer 4

Another awk-based solution:

awk '{ /^>\s*$/?b++:b=0; if (b<=1) print }' file

Breakdown:

/^>\s*$/?b++:b=0
    - ? :       the ternary operator
    - /^>\s*$/  matches a "blank" quoted line: ">" followed only by whitespace
    - b         variable that counts consecutive blank lines (b++).
                however, if the current line is non-blank, b is reset to 0.


if (b<=1) print
    print if the current line is non-blank (b==0)
          or if there is only one blank line (b==1).
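
The same counting idea can be spelled out with an explicit if/else, which some readers may find easier to follow than the ternary. This is a sketch rather than a replacement: it uses [ \t] instead of \s as a portability assumption, and the names squeeze_awk and blank are illustrative.

```shell
# Hypothetical helper: keep at most one line from each run of "blank" quoted lines.
squeeze_awk() {
    awk '{
        if ($0 ~ /^>[ \t]*$/) {
            blank++        # one more consecutive blank quoted line
        } else {
            blank = 0      # any other line ends the run
        }
        if (blank <= 1) {
            print          # non-blank line, or the first blank of a run
        }
    }'
}

printf '%s\n' '>Hi Everyone,' '>' '>  ' '>   ' '>I love spaces.' | squeeze_awk
```

Unlike the sed approach, this keeps the first line of each run, so trailing spaces on that first line would be preserved.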

Answer 5

awk way

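This actually takes the spaces into account, unlike the other answers (except perreal's :)). It also doesn't simply insert a > after every line that starts with > (meaning that if there are several lines of text in a row, it won't insert blank lines between them).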

awk 'a=/^>[ ]*$/{x=$1}!a&&x{print x;x=0}!a' file

Explanation

a=/^>[ ]*$/                    Sets a to the result of the match (1 or 0). The
                               pattern begins with > and then has only spaces
                               till the end of the line

{x=$1}                        On a match, sets x to $1 (the ">").

!a&&x                         If the line does not match the pattern and x is set

{print x;x=0}                 Print x (>) and set x to zero

!a                            If the line does not match the pattern, print it

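The way this works is that it sets x to > when it finds a line containing only > and spaces.
It then carries on until it finds a line that doesn't match, prints the > and then prints that line. x is set again each time the pattern is found.

Hope this helps :)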
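
For readers who find the terse one-liner hard to parse, the same "remember a pending >" idea can be written out with a named flag. This is only a sketch of the logic described above; the names squeeze_pending and pending are made up, and it prints a literal ">" where the original prints $1.

```shell
# Hypothetical helper: emit a single ">" in place of each run of
# "> plus spaces" lines, deferred until the next non-matching line.
squeeze_pending() {
    awk '
        /^>[ ]*$/ { pending = 1; next }        # remember the run, print nothing yet
        pending   { print ">"; pending = 0 }   # first other line after a run: one ">"
                  { print }                    # then the line itself
    '
}

printf '%s\n' '>Hi Everyone,' '>  ' '>' '>I love spaces.' '>still here' '>   ' | squeeze_pending
```

Since the ">" is only printed when a later non-matching line shows up, a run of such lines at the very end of the input is dropped.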

Email replies: how do I squeeze multiple "blank" lines (not really blank; lines containing only ">") into one?
