Question

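I am reformatting a file and want to perform the following steps:

Replace each double CRLF with a temporary character sequence ($CRLF$ or similar)
Remove all remaining CRLFs in the whole file
Go back and turn the temporary sequence into a double CRLF again.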

So input like this:

This is a paragraph
of text that has
been manually fitted
into a certain colum
width.

This is another
paragraph of text
that is the same.

would become:

This is a paragraph of text that has been manually fitted into a certain colum width.

This is another paragraph of text that is the same.

It seems I should be able to run the input through a few simple sed programs, but I'm not sure how to refer to CRLF in sed. Or maybe there is a better way to do this?

Answer 1

You can use sed to decorate every line with a <CRLF> marker at the end:

sed 's/$/<CRLF>/'

Then delete every \r and \n with tr:

| tr -d "\r\n"

Then replace each double <CRLF> marker with \n:

| sed 's/<CRLF><CRLF>/\n/g'

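and remove the remaining <CRLF> markers (or replace each one with a space, so that words on adjacent lines do not run together).

There is a sed one-liner that does all of this in a loop, but I can't seem to find it right now.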
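
The stages above can be chained into a single pipeline. What follows is a minimal sketch, not the answerer's exact command: it assumes GNU sed (for `\r` and `\n` in the replacement), assumes the text never contains the literal marker `<CRLF>`, restores each paragraph break as a double CRLF as the question asks, and turns each leftover marker into a space instead of deleting it. The function name `reflow_crlf` is made up for illustration.

```shell
# reflow_crlf (hypothetical name): reads CRLF or LF text on stdin and
# writes one line per paragraph, separated by a CRLF blank line.
# Assumes GNU sed and that "<CRLF>" never occurs in the input itself.
reflow_crlf() {
  sed 's/$/<CRLF>/' |      # tag the end of every line with a marker
    tr -d '\r\n' |         # delete the real CR and LF characters
    sed 's/<CRLF><CRLF>/\r\n\r\n/g; s/<CRLF>/ /g; s/ *$//'
  # last stage: double marker -> paragraph break, single marker -> space,
  # then trim the trailing space left by the final marker
}

# Usage: reflow_crlf < input.txt > output.txt
```

Because `tr` removes the final line break as well, the output does not end with a newline; append one afterwards if the consuming tool needs it.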

Answer 2

Try the following:

cat file.txt | sed 's/$/ /;s/^ *$/CRLF/' | tr -d '\r\n' | sed 's/CRLF/\r\n/g'

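This is not the method you described; what it does is the following:

Adds a space to the end of each line.
Replaces any line containing only spaces (i.e., a blank line) with "CRLF".
Deletes all line-break characters (CR and LF).
Replaces every occurrence of the string "CRLF" with a Windows-style line break.

This works in Cygwin bash.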
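
When sed sees the carriage returns itself (as GNU sed on Linux does), a blank line of a CRLF file arrives as a lone \r, which `^ *$` does not match. The sketch below is one way to adapt the same four steps to that case. It assumes GNU sed, assumes the word "CRLF" never appears in the text, and uses the invented function name `join_paragraphs`; like the command above, it separates paragraphs with a single line break.

```shell
# join_paragraphs (hypothetical name): same idea as the pipeline above,
# but the CR is turned into the trailing space before blank lines are
# detected. Assumes GNU sed and that "CRLF" never occurs in the input.
join_paragraphs() {
  sed 's/\r*$/ /; s/^ *$/CRLF/' |   # CR (if any) -> space; blank line -> marker
    tr -d '\n' |                    # join everything into one long line
    sed 's/ *CRLF/\r\n/g'           # marker -> Windows-style line break
}

# Usage: join_paragraphs < file.txt > joined.txt
```

The final paragraph keeps the trailing space added in the first step, and the output has no closing line break.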

Answer 3

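Redefining the problem

It looks like what you are really trying to do is reflow your paragraphs and single-space the lines. There are many ways to do this.

Non-sed solution

If you don't mind using some packages outside of coreutils, a few extra shell utilities will do the job: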

dos2unix /tmp/foo
fmt -w0 /tmp/foo | cat --squeeze-blank | sponge /tmp/foo
unix2dos /tmp/foo

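sponge comes from the moreutils package and lets you write to the same file you are reading from. The dos2unix (or tofrodos) package lets you convert line endings back and forth, which makes it easier to work with tools that expect Unix-style line endings.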
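
As one more of the "many ways", here is a sketch that needs only tr and awk. It is not part of the original answer: it relies on awk's paragraph mode (an empty `RS` makes blank-line-separated blocks the records), writes each paragraph as one line followed by a CRLF blank line, and uses the invented name `reflow_awk`.

```shell
# reflow_awk (hypothetical name): reflow paragraphs read from stdin.
reflow_awk() {
  tr -d '\r' |   # strip CRs so blank lines are truly empty for awk
    awk 'BEGIN { RS = ""; ORS = "\r\n\r\n" }   # paragraph in, CRLF blank line out
         { gsub(/\n/, " "); print }            # join the lines of one paragraph'
}

# Usage: reflow_awk < /tmp/foo > /tmp/foo.new && mv /tmp/foo.new /tmp/foo
```

Writing to a temporary file and renaming it stands in for sponge here; the output ends with a blank line after the last paragraph.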

Answer 4

This might work for you (GNU sed):

sed ':a;$!{N;/\n\r\?$/{p;d};s/\r\?\n/ /;ba}' file
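
The one-liner is easier to follow when spread over several lines. The version below is a commented sketch of the same loop, assuming GNU sed; the blank-line test also accepts a lone CR, so that blank lines in CRLF files are recognised, and the wrapper name `reflow_sed` is invented for illustration.

```shell
# reflow_sed (hypothetical name): join the lines of each paragraph,
# keeping the blank line between paragraphs. Assumes GNU sed.
reflow_sed() {
  sed '
    :a
    # On every line except the last one:
    $!{
      # append the next input line to the pattern space
      N
      # if that line is empty (or a lone CR), the paragraph is complete:
      # print it together with the blank line and start a new cycle
      /\n\r\?$/{
        p
        d
      }
      # otherwise turn the line break (CRLF or LF) into a space and loop
      s/\r\?\n/ /
      ba
    }
  ' "$@"
}

# Usage: reflow_sed file    (or: reflow_sed < file)
```

The last line of each paragraph keeps its original ending, so CRLF input yields CRLF-terminated output lines and LF input yields LF-terminated ones.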

Answer 5

I wonder why it isn't this easy?

Add CRLF (GNU sed):

sed -e 's/\s*$/\r/' < index.html > index_CRLF.html

Remove CRLF ... go Unix:

sed -e 's/\s\+$//' < index_CRLF.html > index.html

Replacing strings that contain CRLF?
