Question

I have been trying to filter a file that contains several repeated blocks of lines, like this:

('hello
My name is
jamie
blabla
xyz>>)
('hello
My name is
kat
blabla
blablabla
x2>>)
('hello
My name is
oliver
xv>>)

I am trying to merge all the lines between (' and >>) into a single line and then grep for a pattern.

awk '/('hello/{if (NR!=1)print "";next}{print $0}END{print "";}'

This seems to produce some odd results: it adds an extra blank line between the blocks, and I am not sure whether I can combine the lines this way.

After merging the lines, I expect output like this:

('hello My name is jamie blabla xyz>>)
('hello My name is kat blabla blablabla x2>>)
('hello My name is oliver xv>>)

From that I could then grep for any value.

Thanks.

Answer 1

You don't need to merge the lines and then grep; just use awk and do both in one concise script. With GNU awk for the multi-character RS:

$ awk -F'\n' 'BEGIN{RS=ORS=")\n"} /hello/{$1=$1;print}' file
('hello My name is jamie blabla xyz>>)
('hello My name is kat blabla blablabla x2>>)
('hello My name is oliver xv>>)

$ awk -F'\n' 'BEGIN{RS=ORS=")\n"} /jamie/{$1=$1;print}' file
('hello My name is jamie blabla xyz>>)
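
If GNU awk is not available, the same idea can be sketched without a multi-character RS by collecting lines in a buffer until one ends in >>). This is only a minimal sketch: `join_blocks`, `buf`, `pat` and the temporary sample file are names made up for the illustration, and it assumes every block ends with a line that finishes in >>).

```shell
#!/bin/sh
# Sample input from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
('hello
My name is
jamie
blabla
xyz>>)
('hello
My name is
kat
blabla
blablabla
x2>>)
('hello
My name is
oliver
xv>>)
EOF

# join_blocks PATTERN FILE  (hypothetical helper, not part of any tool)
# Append each line to a buffer; when a line ends with ">>)" the block is
# complete, so print it if it matches PATTERN and then reset the buffer.
join_blocks() {
    awk -v pat="$1" '
        { buf = (buf == "" ? $0 : buf " " $0) }
        />>\)$/ { if (buf ~ pat) print buf; buf = "" }
    ' "$2"
}

result=$(join_blocks kat "$sample")
count=$(join_blocks hello "$sample" | wc -l)
printf '%s\n' "$result"
rm -f "$sample"
```

Passing hello as the pattern prints all three joined blocks, as in the first command above.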

Answer 2

Using sed

Sedtest.sed

/('/{:1;N;/>>)/!b1;/hello/{s/\n/ /gp}};d

or, expanded:

/('/{
#Search for the start string
   :1
#Label to loop back to
   N
#Append the next line to the pattern space
   />>)/!b1
#Branch back to the label until the end pattern is matched
   /hello/{
#Once the loop is done, search for hello in the block
   s/\n/ /gp
#Change newlines to spaces and print
   }
}
d
#Delete the pattern space so nothing else is printed

File

('hello
My name is
jamie
blabla
xyz>>)
('hello
My name is
kat
blabla
blablabla
x2>>)
('hello
My name is
oliver
xv>>)

Running

sed -f Sedtest.sed file

produces

('hello My name is jamie blabla xyz>>)
('hello My name is kat blabla blablabla x2>>)
('hello My name is oliver xv>>)
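
The one-line script puts commands after a label with `;`, which GNU sed accepts but other sed implementations may not. As a rough sketch of the same loop for the command line, each command can go in its own `-e`, and `-n` can replace the final `d`. The search word jamie and the temporary sample file are assumptions for the illustration.

```shell
#!/bin/sh
# Sample input from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
('hello
My name is
jamie
blabla
xyz>>)
('hello
My name is
kat
blabla
blablabla
x2>>)
('hello
My name is
oliver
xv>>)
EOF

# -n turns off automatic printing, so only the explicit "p" writes output.
# Each -e is treated as a separate script line:
#   start at a line containing ('  ->  loop with N until >>) is seen
#   ->  if the block contains jamie, join its lines with spaces and print.
result=$(sed -n \
    -e "/('/{" \
    -e ':a' \
    -e 'N' \
    -e '/>>)/!ba' \
    -e '/jamie/{' \
    -e 's/\n/ /g' \
    -e 'p' \
    -e '}' \
    -e '}' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```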

Answer 3

$ tr $'\n' ' ' < infile | grep -o "('hello[^(]*)"
('hello My name is jamie blabla xyz>>)
('hello My name is kat blabla blablabla x2>>)
('hello My name is oliver xv>>)

tr replaces every newline with a space, and grep -o extracts each parenthesized expression that starts with 'hello.
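
To search inside the joined blocks, a second grep can be chained onto the pipeline. The following is only a sketch: the search word kat and the temporary sample file are assumptions, and because of `[^(]*` it relies on no block containing a ( after the opening one.

```shell
#!/bin/sh
# Sample input from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
('hello
My name is
jamie
blabla
xyz>>)
('hello
My name is
kat
blabla
blablabla
x2>>)
('hello
My name is
oliver
xv>>)
EOF

# 1. tr flattens the whole file into one long line.
# 2. grep -o prints each ('hello ... ) match on its own line
#    (-o is available in GNU and BSD grep, but is not required by POSIX).
# 3. The last grep keeps only the joined blocks that mention kat.
result=$(tr '\n' ' ' < "$sample" | grep -o "('hello[^(]*)" | grep 'kat')

printf '%s\n' "$result"
rm -f "$sample"
```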

Answer 4

Using perl, I would do it like this:

#!/usr/bin/env perl

use strict;
use warnings;

local $/ = ")\n";

while  ( <DATA> ) { 
    s/\n(?!$)/ /g;
    print if /hello/;
}

__DATA__
('hello
My name is
jamie
blabla
xyz>>)
('hello
My name is
kat
blabla
blablabla
x2>>)
('hello
My name is
oliver
xv>>)

This explicitly removes the newlines to match your desired output. But you don't actually need to:

while  ( <DATA> ) { 
    print if /jamie/;
}

works fine, and gives:

('hello
My name is
jamie
blabla
xyz>>)

For clarity this was written out long-hand; you can reduce it to a one-liner:

perl -ne 'BEGIN{$/=")\n"} print if m/jamie/' filename

(This also accepts piped input.)

Grep for a string, but within boundaries
