I am trying to extract several multi-line entries from a large file containing text in this format:
CL blahblahblah
SP blahblahblah blahblahblah blahblahblah
DE blahblahblahblahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah
AB blahblahblah blahblahblah blahblahblah
blahblahblahblahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah
C1 blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
lahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
RP blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah
EM blahblahblah blahblahblah blahblahblah blahblahblah
NR blahblahblah blahblahblah blahblahblah blahblahblah
TC blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah
Z9 blahblahblah blahblahblah blahblahblah blahblahblah
PU blahblahblah blahblahblah blahblahblah blahblahblah
PI blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
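I am only interested in the entries that start with C1, AB or TI, but these entries sometimes span several lines, and the XX tag line that follows them is not always the same. Is there a simple way to keep only these entries? The text I am left with should look like this: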
TI blahblahblah
AB blahblahblah b lah blahblah blah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
C1 blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah
TI blah blah blah blah blah blah
AB blahblahblah blahblahblah blahblahblah blahblahblahblahblahblah blahblahblah blahblahblah blahblahblah blahblahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah
C1 blahblahblah blahblahblah blahblahblah blahblahblahblahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
and so on...
Thanks a lot!
Answer 0 (score: 3)
This should work:
:let @a="" | g/^\v<(C1|AB|TI)>/norm! "Ay/^\S^M
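Edit (Windows-specific): the command has to end with a literal "return" character, shown above as ^M. Enter it by typing Ctrl-Q Enter (or Ctrl-V Enter if you are not on Windows, or if your vimrc does not set behave mswin).
This collects the lines in register "a. To replace the buffer with those lines: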
:%d | put a
Or, to put them into a new buffer:
:new | put a
Answer 1 (score: 3)
I would do:
:$put='X' | 1,$-1g/^\(\s\|C1\|AB\|TI\)\@!/ ,/^\S/-d
:$d
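This does the following: for every line except the last one (1,$-1), if the line starts with a non-blank character and does not start with C1, AB or TI (g/pattern/), delete (d) from that line up to, but not including, the next line that starts with a non-blank character (,/pattern/-, where - is short for -1). The X line appended by :$put='X' acts as a sentinel, so that the search also finds a line after the last entry; :$d removes it again afterwards.
To try this in gVim, copy the commands above to the clipboard and run: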
:@+
(this executes the Ex commands stored in the + register, which is linked to the clipboard). What I got:
AB blahblahblah blahblahblah blahblahblah
blahblahblahblahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah
C1 blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
lahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
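To see which blocks this rule removes without touching the buffer, here is a small awk sketch that prints them as first,last line numbers of the original file. It assumes that continuation lines begin with a space or tab and that the file has no blank lines; the function name deleted_ranges and the sample records are made up for illustration.

```shell
# Hypothetical helper: for each unwanted entry, print "first,last"
# (original line numbers) of the block the :g command would delete.
deleted_ranges() {
  awk '
    # A line starting with a non-blank character opens a new entry.
    /^[^ \t]/ {
      if (start) print start "," (NR - 1)   # close the previous unwanted entry
      # Remember where an unwanted entry begins; 0 means "wanted, keep it".
      start = ($0 ~ /^(C1|AB|TI)/) ? 0 : NR
    }
    # An unwanted entry that runs to the end of the file.
    END { if (start) print start "," NR }
  ' "$@"
}

# Made-up sample: continuation lines are indented.
printf '%s\n' 'CL x' 'SP y' 'DE z' '   z2' 'AB a' '   a2' \
              'C1 c' 'RP r' '   r2' 'TI t' 'EM e' |
  deleted_ranges
# prints one range per line: 1,1  2,2  3,4  8,9  11,11
```

Inside Vim the line numbers shift as each block is deleted, so the ranges above only describe the original file.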
Answer 2 (score: 3)
An awk solution:
awk '
BEGIN{
tags["C1"]
tags["AB"]
tags["TI"]
}
{
match($0, /^\w+/)
if(RSTART)
t=substr($0, RSTART, RLENGTH)
}
t in tags' input.txt
The same logic as a Vim command:
:g/^/let t=matchstr(getline('.'), '^\w\+') | if !empty(t) | let tag=t | endif | if index(['C1', 'AB', 'TI'], tag)==-1 | d | endif
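Note that \w in an awk regex is a GNU awk extension rather than POSIX, so here is a sketch of the same idea that avoids it. It assumes that continuation lines begin with a space or tab, and it takes the first field of every other line as the tag; the wrapper name keep_tags and the sample records are made up for illustration.

```shell
# Hypothetical wrapper: print only the entries whose tag is C1, AB or TI.
keep_tags() {
  awk '
    BEGIN { tags["C1"]; tags["AB"]; tags["TI"] }
    # A line starting with a non-blank character carries a new tag;
    # continuation lines leave t unchanged.
    /^[^ \t]/ { t = $1 }
    # Print the line while the current tag is one of the wanted ones.
    (t in tags)
  ' "$@"
}

# Made-up sample: continuation lines are indented.
printf '%s\n' 'CL x' 'SP y' 'DE z' '   z2' 'AB a' '   a2' \
              'C1 c' 'RP r' '   r2' 'TI t' 'EM e' |
  keep_tags
# prints the AB entry (two lines), then the C1 and TI lines
```

The function reads standard input when called without arguments, or the named files otherwise, for example keep_tags input.txt.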
Answer 3 (score: 2)
This seems to work, but it leaves a blank line at the end of the file.
:%s/\v^(C1|AB|TI|\s)@!\_.{-}\n(C1|AB|TI|$)@=//
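This regex uses a few tricky features, which I will try to explain.
\v makes the pattern "very magic", which simply lets us skip the backslashes in several places.
^(C1|AB|TI|\s)@! matches at the start of any line that does not begin with one of the target codes or with whitespace.
\_. matches any character, including a newline.
{-} matches as few of the preceding atom as possible (non-greedy).
\n matches the end of a line.
(C1|AB|TI|$)@= matches a target code or the end of a line (for the final case), with zero width.
The result on the test input: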
AB blahblahblah blahblahblah blahblahblah
blahblahblahblahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah
C1 blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah blahblahblah
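Outside Vim, the same pattern can be approximated with Perl-style lookaheads. This is only a rough analogue, not a drop-in equivalent: (?!...) stands in for @!, (?=...) for @=, .*? with the s flag for \_.{-}, and \z (end of input) is an assumed substitute for the final-case $. Perl is not used elsewhere in this thread; the wrapper name strip_entries and the sample records are made up, and continuation lines are assumed to start with whitespace.

```shell
# Hypothetical wrapper: delete every entry that is not C1, AB or TI.
strip_entries() {
  # -0777 reads the whole input as one string.
  # m: ^ matches at every line start; s: . also matches newlines; g: replace all.
  perl -0777 -pe 's/^(?!C1|AB|TI|\s).*?\n(?=C1|AB|TI|\z)//msg' "$@"
}

# Made-up sample: continuation lines are indented.
printf '%s\n' 'CL x' 'SP y' 'DE z' '   z2' 'AB a' '   a2' \
              'C1 c' 'RP r' '   r2' 'TI t' 'EM e' |
  strip_entries
# prints the AB entry (two lines), then the C1 and TI lines
```

Each match starts at an unwanted tag line and grows, one character at a time, until the text that follows is a wanted tag at the start of a line or the end of the input.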
Answer 4 (score: 0)
Another awk one-liner:
awk -F' |\t' '{if($1)f=$1~/^(TI|AB|C1)$/?1:0}f' yourFile