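Matching newlines inside quotes

Date: 2018-08-27 19:14:23

Tags: regex parsing vim

I need to remove every newline that falls inside quotes (replacing \n with a space), so that this: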

 <tag>
     abc: "TEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXT"
     abcd: "TEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXT"

     abcde: "TEXTTEXTTEXTTEXT
     TEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXT
     TEXT"

     abcdef:TEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXT"
   </tag>

becomes this:

<tag>
     abc: "TEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXT"
     abcd: "TEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXT"

     abcde: "TEXTTEXTTEXTTEXT TEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXT TEXT"

     abcdef:TEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXT"
   </tag>

Note that a field may span several lines, and I do not want any line breaks left inside it.

I am able to replace every newline in the file:

%s/\n/ /

And I am able to replace every quoted string, quotes included:

%s/".*"/ /

But I cannot match the \n inside the quotes:

%s/".*\n"/ /

How can I accomplish this? Thanks!

4 answers:

Answer 0: (score: 1)

:%s/\v(\u)\n\s+(\u)/\1\2

\v .............. very magic (avoid a lot of backslashes)
\u .............. uppercase
\n .............. new line
\s+ ............. one or more whitespace characters (space or tab)
( .............. start of regex group
) .............. end of regex group 

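We search for an uppercase letter, followed by a newline, followed by one or more whitespace characters, and finally another uppercase letter. The replacement simply puts group 1 and group 2 back together.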
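For readers who want to try the pattern outside Vim, here is a minimal Python sketch of the same substitution. It is an illustration, not part of the answer: the `sample` string is made up, `[A-Z]` stands in for Vim's `\u`, and `[ \t]+` stands in for `\s+`. It also shows that `\1\2` joins the two parts with nothing in between, so a replacement of `\1 \2` would be needed to get the space the question asks for.

```python
import re

# Hypothetical sample: one quoted value that wraps onto an indented line.
sample = 'abcde: "TEXTTEXT\n     TEXT"\n'

# ([A-Z]) plays the role of Vim's \u, \n is the line break, and [ \t]+
# stands in for \s+ (Vim's \s matches only space and tab).
pattern = re.compile(r'([A-Z])\n[ \t]+([A-Z])')

# Same replacement as the Vim command: the two letters end up adjacent.
joined_without_space = pattern.sub(r'\1\2', sample)

# Variant that keeps a single space where the line break used to be.
joined_with_space = pattern.sub(r'\1 \2', sample)

print(joined_without_space, end='')
print(joined_with_space, end='')
```

Note that this pattern never looks at the quotes, so it would also join wrapped uppercase text outside a quoted value.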

Answer 1: (score: 0)

:g and its relatives work line by line, so they are awkward for multi-line operations. You can use a plain :s instead:

:%s/.*field\s*[^4]: "\_[^"]*"\n

This works as long as there are no quotes inside your quoted strings.
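The piece of that pattern that crosses line boundaries is `\_[^"]*`, which matches any run of non-quote characters, line breaks included. The Python sketch below applies the same idea to the goal stated in the question (turning line breaks inside quotes into a space). It is only an illustration under the same assumption as above, namely balanced quotes with no quotes nested inside them; `join_quoted_lines` and the sample text are hypothetical.

```python
import re

# A quoted span: an opening quote, any non-quote characters (newlines
# included, as with Vim's \_[^"]*), and the closing quote.
QUOTED = re.compile(r'"[^"]*"')

# A line break together with the spaces or tabs around it.
LINE_BREAK = re.compile(r'[ \t]*\n[ \t]*')


def join_quoted_lines(text: str) -> str:
    """Replace each line break inside a quoted span with one space."""
    return QUOTED.sub(lambda m: LINE_BREAK.sub(' ', m.group(0)), text)


sample = (
    '<tag>\n'
    '     abc: "TEXT"\n'
    '     abcde: "TEXTTEXT\n'
    '     TEXTTEXTTEXT\n'
    '     TEXT"\n'
    '</tag>\n'
)
result = join_quoted_lines(sample)
print(result, end='')
```

Line breaks outside the quoted spans are left alone, because the inner substitution only ever sees the text of one quoted match.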

Answer 2: (score: 0)

If by "agroup" you mean "delete", this might work for you:

:%g/field/norm f"d/"/e^Mdd

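where ^M is typed as Ctrl-V Enter.

"Find every line containing the text "field", then find a quote on that line, delete through the next quote, and then delete the whole line."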

Answer 3: (score: 0)

sed can do this for you with a loop:

sed -E -e ':a' -e $'/^[^"]*"[^"]+$/{N;s/[[:blank:]]*\\n[[:blank:]]*/ /;}' -e 'ta' file

 <tag>
     abc: "TEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXT"
     abcd: "TEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXT"

     abcde: "TEXTTEXTTEXTTEXT TEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXT TEXT"

     abcdef:TEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXTTEXT"
</tag>
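
The loop in the sed command can be read as: while the current line contains an opening quote with no closing quote after it, append the next line and replace the line break (plus surrounding blanks) with a single space. Below is a rough Python sketch of that logic, written only to make the control flow easier to follow; `join_open_quotes` is a hypothetical name, and the input is assumed to be a list of lines without trailing newlines.

```python
import re

# Same test as the sed address: a single quote on the (joined) line,
# followed by at least one more character and no closing quote.
OPEN_QUOTE = re.compile(r'^[^"]*"[^"]+$')


def join_open_quotes(lines):
    """Yield lines, merging continuation lines into an unclosed quote."""
    it = iter(lines)
    for line in it:
        # Corresponds to the ':a' ... 'ta' loop: keep appending while the
        # quote opened on this line is still unclosed.
        while OPEN_QUOTE.match(line):
            nxt = next(it, None)
            if nxt is None:  # ran out of input, like N at end of file
                break
            # Corresponds to s/[[:blank:]]*\n[[:blank:]]*/ /
            line = line.rstrip(' \t') + ' ' + nxt.lstrip(' \t')
        yield line


sample_lines = [
    '<tag>',
    '     abc: "TEXT"',
    '     abcde: "TEXTTEXT',
    '     TEXTTEXT',
    '     TEXT"',
    '',
    '     abcdef:TEXT"',
    '</tag>',
]
joined = list(join_open_quotes(sample_lines))
print('\n'.join(joined))
```

As with the sed version, a line whose only quote is its last character (the abcdef line) is left untouched, because the pattern requires at least one character after the quote.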