Question

I have a bunch of files in the following format.

A.txt:

some text1      
more text2    
XXX
more text  
....  
XXX
.  
.  
XXX 
still more text  
text again

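Each file has at least 3 lines that start with XXX. Now, for each file A.txt, I want to write all lines up to the 3rd occurrence of XXX (in the example above, everything up to the line before still more text) to a file A_modified.txt.

I want to do this in bash, and came up with grep -n -m 3 -w "^XXX$" * | cut -d: -f2 to get the corresponding line number in each file.

Is it possible to use head together with these line numbers to produce the desired output?

PS: I know a simple Python script would do the job, but I am trying to do this in bash for no particular reason.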

Answer 1

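A simpler approach is to use awk. Assuming your current working directory contains only the files of interest, try the following, which prints every line up to and including the third XXX:

for i in *; do awk 'c < 3; /^XXX$/ { c++ }' "$i" > "$i.modified"; done

Or, if your files are very large, stop reading as soon as the third XXX has been printed:

for i in *; do awk '1; /^XXX$/ && ++c == 3 { exit }' "$i" > "$i.modified"; done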
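The loops above write to A.txt.modified rather than the A_modified.txt name the question asks for. A minimal sketch of one way to get that name follows, assuming the marker is a line that is exactly XXX. The function names upto_nth_marker and modify_all are hypothetical and only serve this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of file "$1" up to and including the
# n-th line that is exactly XXX (n is "$2", default 3).
upto_nth_marker() {
  awk -v n="${2:-3}" '
    { print }                      # print first, so the marker line itself is kept
    /^XXX$/ && ++c == n { exit }   # stop reading right after the n-th marker
  ' "$1"
}

# Hypothetical helper: process every .txt file in directory "$1" (default: .).
# Assumed naming scheme: A.txt -> A_modified.txt
modify_all() {
  dir=${1:-.}
  for f in "$dir"/*.txt; do
    [ -e "$f" ] || continue                       # glob matched nothing
    case $f in *_modified.txt) continue ;; esac   # skip outputs of earlier runs
    upto_nth_marker "$f" > "${f%.txt}_modified.txt"
  done
}
```

Calling modify_all . in the directory that holds the files would then create one _modified.txt file next to each input. Printing before testing for the marker is what keeps the third XXX line in the output.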

Answer 2

head -n N prints the first N lines of a file.

#!/bin/sh

for f in *.txt; do
  echo "searching $f"

  # line number of the 3rd line that is exactly XXX
  line_number=$(grep -n -m 3 '^XXX$' "$f" | cut -d: -f1 | tail -n 1)

  # now dump out the first 'line_number' lines of this file
  head -n "$line_number" "$f"
done
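The question's own idea, a single grep over all files followed by head, can also be made to work. The following is only a sketch under a few assumptions: file names contain no ':' or newline, grep supports the -m option used in the question (GNU and BSD grep do), and the output name A_modified.txt is derived from A.txt. The function name head_to_third_marker is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: for every .txt file in directory "$1" (default: .),
# copy the lines up to and including the 3rd XXX line to <name>_modified.txt.
# Assumption: paths contain no ':' or newline characters.
head_to_third_marker() {
  dir=${1:-.}
  # The extra /dev/null operand makes grep prefix each match with the file
  # name even when the glob expands to a single file. The awk stage keeps
  # only the 3rd match per file and emits "file:line_number".
  grep -n -m 3 '^XXX$' "$dir"/*.txt /dev/null |
    awk -F: '++seen[$1] == 3 { print $1 ":" $2 }' |
    while IFS=: read -r file line; do
      case $file in *_modified.txt) continue ;; esac   # skip earlier outputs
      head -n "$line" "$file" > "${file%.txt}_modified.txt"
    done
}
```

Files with fewer than three XXX lines never reach the third match and are therefore left alone, instead of being cut at the last match found.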

head: output up to a specific line