Splitting a file on a delimiter with awk

Time: 2012-07-27 21:50:50

Tags: linux shell awk

Given this input file:

SectionMarker and some random text here
this is the text in section 1 
this is the text in section 1
this is the text in section 1
etc
SectionMarker and some random text here also
this is the text in section 2 
this is the text in section 2
this is the text in section 2
etc
SectionMarker and some random text too
this is the text in section 3 
this is the text in section 3
this is the text in section 3
etc

How can I split this file into its sections using awk, sed, or some other tool?

This is what I tried, but it did not work:

awk -vRS='\SectionMarker[:print:]\n' 'NR==1 {print}' ./data.log 

2 answers:

Answer 0 (score: 0)

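This one-liner should do the job. It increments the counter x on every line that starts with SectionMarker and prints lines only while x is below 2, that is, the first section: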

 awk '{x+=($0~/^SectionMarker/)?1:0}x<2' data.log

Test:

kent$  cat data.log
SectionMarker and some random text here
this is the text in section 1 
this is the text in section 1
this is the text in section 1
etc
SectionMarker and some random text here also
this is the text in section 2 
this is the text in section 2
this is the text in section 2
etc
SectionMarker and some random text too
this is the text in section 3 
this is the text in section 3
this is the text in section 3
etc

kent$  awk '{x+=($0~/^SectionMarker/)?1:0}x<2' data.log
SectionMarker and some random text here
this is the text in section 1 
this is the text in section 1
this is the text in section 1
etc
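
If the goal is to write every section to its own file rather than print only the first one, the same counter idea can be extended. The following is a minimal sketch, not a definitive solution: the output names section_1.txt, section_2.txt, ... and the scratch directory are assumptions made for illustration, and the sample input is shortened.

```shell
# Scratch directory (hypothetical) so nothing existing is overwritten.
workdir=$(mktemp -d)

# Shortened copy of the sample input from the question.
cat > "$workdir/data.log" <<'EOF'
SectionMarker and some random text here
this is the text in section 1
etc
SectionMarker and some random text here also
this is the text in section 2
etc
EOF

# On each marker line, close the previous output file and start the next one.
# Every line (the marker line included) goes to the current section's file.
# Lines before the first marker, if any, are dropped.
awk -v prefix="$workdir/section_" '
    /^SectionMarker/ { if (out != "") close(out); out = prefix (++n) ".txt" }
    out != ""        { print > out }
' "$workdir/data.log"

ls "$workdir"
```

Calling close() on each finished file keeps the number of open files small when the input has many sections.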

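I don't know how many sections you want to grab in your real problem. You can add an exit; to the awk one-liner so that awk stops at the second marker instead of reading the whole file.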
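
One possible way to add the early exit is sketched below. It assumes, as in the question, that the first line of the file is a marker line; the scratch directory and the shortened sample input are illustrative only.

```shell
# Scratch directory (hypothetical) holding a shortened copy of the sample input.
workdir=$(mktemp -d)
cat > "$workdir/data.log" <<'EOF'
SectionMarker and some random text here
this is the text in section 1
etc
SectionMarker and some random text here also
this is the text in section 2
etc
EOF

# Count marker lines; quit as soon as the second marker is seen.
# The trailing "1" is an always-true pattern that prints every other line.
first_section=$(awk '/^SectionMarker/ && ++x == 2 { exit } 1' "$workdir/data.log")
printf '%s\n' "$first_section"
```

Because awk exits at the second marker, the rest of a large log file is never read.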

Answer 1 (score: 0)

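Here is one way using awk with a regular expression as the record separator (RS). Note that a multi-character or regex RS works in GNU awk and mawk, but POSIX leaves its behavior unspecified. I added an extra variable named var, which makes it easy to select the desired "section". For example, with var=2, section 2 is printed. That is: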

awk -v var=2 -v RS='SectionMarker[^\n]*' -v FS='\n' 'NR == var + 1 { for (i=2; i<NF; i++) print $i }' file.txt

Prints:

this is the text in section 2 
this is the text in section 2
this is the text in section 2
etc
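
For an awk without regex RS support, roughly the same selection can be done line by line. This is a sketch under the same assumptions (marker lines start with SectionMarker, var is the 1-based section number); the scratch directory and the shortened sample input are illustrative only.

```shell
# Scratch directory (hypothetical) holding a shortened copy of the sample input.
workdir=$(mktemp -d)
cat > "$workdir/file.txt" <<'EOF'
SectionMarker and some random text here
this is the text in section 1
etc
SectionMarker and some random text here also
this is the text in section 2
etc
SectionMarker and some random text too
this is the text in section 3
etc
EOF

# Marker lines bump the section counter n and are skipped;
# every other line is printed only while n equals var.
section=$(awk -v var=2 '/^SectionMarker/ { n++; next } n == var' "$workdir/file.txt")
printf '%s\n' "$section"
```

Like the RS version, this drops the marker line itself and prints only the body of the chosen section.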

HTH