Question

I'm trying to do some text processing and can't figure it out. The problem is this:

My file looks like this:

ORANGE{  
a
b
c 
CI 123  
d
e
f
g
} 

APPLE{  
1
2
3
4
5  
CI 123  
6
7  
}  

ORANGE{  
A
B
C  
CI 321  
D
E  
} 
ORANGE{
hell
CI 123
ABCD 1234
hmmm
}

awk '/ORANGE {/ {sho=1} ;/^CI 123$/ {sho=1} ;/^}$/ {sho=0} sho ' file

I tried the code above, but it doesn't give me what I want; instead, it gives me all the ORANGE sections. I searched extensively but found nothing relevant. Thanks.

Answer 1

$ cat temp 
ORANGE{  
...  
CI 123  
...
} 

APPLE{  
...  
CI 123  
...  
}  

ORANGE{  
...  
CI 321  
...  
} 
ORANGE{
...
CI 123
ABCD 1234
...
}
$ awk '/ORANGE/ {o=1;p=0} {if(o)arr[i++]=$0} /CI 123/ {if(o){for(k=0;k<i;k++) print arr[k];p=1}else{p=0} delete arr;i=0;next;} /}/ {if(p)print;p=0;delete arr;i=0;o=0;} o && p' temp 
ORANGE{  
...  
CI 123  
...
} 
ORANGE{
...
CI 123
ABCD 1234
...
}

Here is the same awk logic as a script file, with indentation:

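# Start of an ORANGE block: begin buffering, nothing printed yet.
/ORANGE/ {
    o=1
    p=0
}
# While inside an ORANGE block, buffer every line.
{
    if(o)
        arr[i++]=$0
}
/CI 123/ {
    if(o)
    {
        # Flush the buffer in insertion order; for(key in arr) has no guaranteed order.
        for(k=0;k<i;k++)
            print arr[k]
        p=1
    }
    else
        p=0
    delete arr
    i=0
    next
}
# End of block: print the closing brace if the block matched, then reset.
/}/ {
    if(p)
        print
    p=0
    delete arr
    i=0
    o=0
}
# After the match, print the remaining lines of the block as they arrive.
o && p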

We can run the script file with awk like this:

$ awk -f script.awk temp
ORANGE{  
...  
CI 123  
...
} 
ORANGE{
...
CI 123
ABCD 1234
...
}

Edit 1: with custom data

$ cat temp 
ORANGE{  
a
b
c 
CI 123  
d
e
f
g
} 

APPLE{  
1
2
3
4
5  
CI 123  
6
7  
}  

ORANGE{  
A
B
C  
CI 321  
D
E  
} 
ORANGE{
hell
CI 123
ABCD 1234
hmmm
}

$ awk '/ORANGE/ {o=1;p=0} {if(o)arr[i++]=$0} /CI 123/ {if(o){for(k=0;k<i;k++) print arr[k];p=1}else{p=0} delete arr;i=0;next;} /}/ {if(p)print;p=0;delete arr;i=0;o=0;} o && p' temp 
ORANGE{  
a
b
c 
CI 123  
d
e
f
g
} 
ORANGE{
hell
CI 123
ABCD 1234
hmmm
}
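
A variant of the same buffering idea, as a minimal sketch rather than a drop-in replacement: collect each ORANGE block in a string and decide at the closing brace whether to print it. It assumes blocks are not nested, that the closing } sits on its own line, and that "exact" means the line is CI 123 apart from trailing blanks. The names buf, inblock and hit and the file name sample.txt are made up for illustration, and the sample data is adapted from the question (the last block uses CI 1234 to show that it is not matched).

```shell
# Hypothetical sample file, adapted from the data in the question.
cat > sample.txt <<'EOF'
ORANGE{
a
CI 123
b
}

APPLE{
1
CI 123
2
}

ORANGE{
A
CI 321
B
}
ORANGE{
hell
CI 1234
hmmm
}
EOF

result=$(awk '
# A new ORANGE block starts: reset the buffer and the match flag.
/^ORANGE[ \t]*[{]/          { inblock = 1; hit = 0; buf = "" }
# Collect every line of the current block, including its first and last line.
inblock                     { buf = buf $0 "\n" }
# The whole line must be CI 123 (trailing blanks allowed), so CI 1234 is rejected.
inblock && /^CI 123[ \t]*$/ { hit = 1 }
# Closing brace: print the buffered block only if the exact line was seen.
inblock && /^[}][ \t]*$/    { if (hit) printf "%s", buf; inblock = 0 }
' sample.txt)

printf '%s\n' "$result"
```

Because the decision is made only at the closing brace, no line is printed before the block is known to match, and the order of the lines is simply the order in which they were appended.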

Answer 2

$ awk -v RS="" '/ORANGE/&&/CI 123/' file
ORANGE{  
...  
CI 123  
...
}

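Setting awk's record separator RS to the empty string makes each blank-line-separated block a record. Then match the two strings you want within each record. Note that this relies on a blank line between blocks: in the question's file the last two ORANGE blocks are adjacent, so they are read as a single record.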
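
To illustrate paragraph mode, here is a small sketch under the assumption that every block is separated from the next by a blank line. The file name blocks.txt and its contents are made up for the example, and the patterns are tightened slightly (ORANGE at the start of the record, CI 123 on a line of its own) compared with the one-liner above.

```shell
# Hypothetical sample file: every block is followed by a blank line.
cat > blocks.txt <<'EOF'
ORANGE{
a
CI 123
}

APPLE{
1
CI 123
}

ORANGE{
A
CI 321
}
EOF

# RS="" turns on paragraph mode: records are separated by blank lines.
# ORS="\n\n" puts a blank line back after each printed record.
paragraphs=$(awk -v RS="" -v ORS="\n\n" '/^ORANGE/ && /\nCI 123[ \t]*\n/' blocks.txt)

printf '%s\n' "$paragraphs"
```

Only the first block is printed here: the APPLE record does not start with ORANGE, and the last record contains CI 321 rather than CI 123.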

Answer 3

awk is your friend:

awk 'BEGIN{RS="}\n*";ORS="}\n";}/ORANGE.*CI 123\n/{print}' file

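Here, } followed by any number of newlines is used as the input record separator and }\n as the output record separator, and each record is searched for the pattern: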

ORANGE(anything)CI 123(newline)

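If the pattern is found in a record, the record is printed. Note that a multi-character RS is treated as a regular expression by implementations such as gawk and mawk; POSIX leaves its behavior unspecified.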
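
As a rough sketch of the same record-per-block idea, the variant below uses the single character } as RS, which avoids relying on a regular-expression record separator. This is a swapped-in technique, not the command above. It assumes that } never appears inside a block, and the file name fruit.txt and its contents are invented for the example.

```shell
# Hypothetical sample file, adapted from the data in the question.
cat > fruit.txt <<'EOF'
ORANGE{
a
CI 123
b
}

APPLE{
1
CI 123
2
}

ORANGE{
A
CI 321
B
}
ORANGE{
hell
CI 123
hmmm
}
EOF

matched=$(awk 'BEGIN { RS = "}" }
# Each record is the text of one block up to, but not including, its closing brace.
/ORANGE[ \t]*[{]/ && /\nCI 123[ \t]*\n/ {
    sub(/^[ \t\n]+/, "")   # drop whitespace left over from the end of the previous block
    print $0 "}"           # put back the brace that RS consumed
}' fruit.txt)

printf '%s\n' "$matched"
```

With this input the first and last ORANGE blocks are printed; the APPLE block and the CI 321 block are skipped.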

How to get only the sections containing an exact element using awk
