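I have an output file containing thousands of lines of information. Every so often, the output file contains information in the following form: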
Input orientation:
...
content
...
Distance matrix (angstroms):
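I now want to print content and save it to filename. However, this pattern occurs in several places in the output file, and I only want the last occurrence. This is what I have tried so far: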
tac output | sed -n -e '/Distance matrix/,/Input orientation/p' > filename
However, this prints every instance of the matching pattern to filename.
Then I read that with GNU sed (I have version 4.2.1 installed) the following should work:
tac output | sed -n -e '0,/Distance matrix/,/Input orientation/p' > filename
But this gives me an error:
sed: -e expression #1, char 20: unknown command: `,'
Then I tried asking sed to quit after matching the pattern Input orientation:
tac output | sed -n -e '/Distance matrix/,/Input orientation/{p;q}' > filename
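But now only Distance matrix (angstroms): ends up in filename.
I am sure this is possible, I just cannot figure it out! I have no experience with awk, so I would prefer an answer that uses sed.
Sample output file for testing: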
Item Value Threshold Converged?
Maximum Force 0.005032 0.000450 NO
RMS Force 0.001066 0.000300 NO
Maximum Displacement 0.027438 0.001800 NO
RMS Displacement 0.007282 0.001200 NO
Predicted change in Energy=-8.909077D-05
GradGradGradGradGradGradGradGradGradGradGradGradGradGradGradGradGradGrad
Input orientation:
---------------------------------------------------------------------
Center Atomic Atomic Coordinates (Angstroms)
Number Number Type X Y Z
---------------------------------------------------------------------
1 6 0 Incorrect Incorrect Incorrect
2 1 0 Incorrect Incorrect Incorrect
3 1 0 Incorrect Incorrect Incorrect
4 1 0 Incorrect Incorrect Incorrect
5 17 0 Incorrect Incorrect Incorrect
6 9 0 Incorrect Incorrect Incorrect
---------------------------------------------------------------------
Distance matrix (angstroms):
1 2 3 4 5
1 C 0.000000
2 H 1.080163 0.000000
3 H 1.080326 1.809416 0.000000
4 H 1.080621 1.810236 1.810685 0.000000
5 Cl 1.962171 2.470702 2.468769 2.465270 0.000000
6 F 2.390537 2.343910 2.357275 2.380515 4.352568
6
6 F 0.000000
Input orientation:
---------------------------------------------------------------------
Center Atomic Atomic Coordinates (Angstroms)
Number Number Type X Y Z
---------------------------------------------------------------------
1 6 0 Correct Correct Correct
2 1 0 Correct Correct Correct
3 1 0 Correct Correct Correct
4 1 0 Correct Correct Correct
5 17 0 Correct Correct Correct
6 9 0 Correct Correct Correct
---------------------------------------------------------------------
Distance matrix (angstroms):
1 2 3 4 5
1 C 0.000000
2 H 1.080516 0.000000
3 H 1.080587 1.801890 0.000000
4 H 1.080473 1.801427 1.801478 0.000000
5 Cl 1.936014 2.458132 2.459437 2.460630 0.000000
6 F 2.414588 2.368281 2.365651 2.355690 4.350586
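Answer 0 (score: 1)
That happens because sed quits as soon as it reaches q, which inside the braces runs on the very first line of the range. You need to qualify q with the end pattern again: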
$ tac ip.txt | sed -n '/Distance matrix/,/Input orientation/{p;/Input orientation/q}' | tac
Input orientation:
---------------------------------------------------------------------
Center Atomic Atomic Coordinates (Angstroms)
Number Number Type X Y Z
---------------------------------------------------------------------
1 6 0 Correct Correct Correct
2 1 0 Correct Correct Correct
3 1 0 Correct Correct Correct
4 1 0 Correct Correct Correct
5 17 0 Correct Correct Correct
6 9 0 Correct Correct Correct
---------------------------------------------------------------------
Distance matrix (angstroms):
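To make the effect of the qualified q easier to check, here is a minimal sketch that runs the same tac/sed/tac pipeline on a reduced sample and saves the result to a file. The function name last_block, the temporary directory and the shortened sample lines are made up for illustration, and tac is assumed to be available (GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helper: print only the last block that runs from
# "Input orientation" through the "Distance matrix" line.
last_block() {
    # Reverse the file so the last block comes first, print from the
    # end marker, quit at the first start marker, then reverse again.
    tac "$1" |
        sed -n '/Distance matrix/,/Input orientation/{p;/Input orientation/q}' |
        tac
}

# Reduced stand-in for the real output file: two blocks plus extra lines.
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
noise
Input orientation:
1 6 0 Incorrect
Distance matrix (angstroms):
1 C 0.000000
Input orientation:
1 6 0 Correct
Distance matrix (angstroms):
1 C 0.000000
EOF

# Save the last block, as the question asks, then show it.
last_block "$workdir/sample.txt" > "$workdir/filename"
cat "$workdir/filename"
```

Only the three lines of the second block (the one marked Correct) should end up in the saved file.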
Using awk:
tac ip.txt | awk '/Distance matrix/{f=1} f; /Input orientation/{exit}' | tac
Answer 1 (score: 0)
Another solution, using sed without tac:
sed ':B;$x;/Input/!d;x;s/.*//;;x;:A;/Distance/!{N;bA};h;N;s/.*\n//;bB' infile
It keeps the current block in the hold space and discards it when a new block is found.
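The one-liner is dense, so here is the same script spread over several lines with comments, as a sketch of how it appears to work. It assumes GNU sed: the final print relies on GNU sed printing the pattern space when N finds no further input. The function name last_block_notac and the reduced sample are made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper around the tac-free sed script.
last_block_notac() {
    sed '
:B
# On the last input line, swap in the saved block so it can be printed.
$x
# Discard any line that does not contain the start marker.
/Input/!d
# A new block starts here: empty the hold space, dropping the older block.
x
s/.*//
x
:A
# Keep appending input lines until the end marker shows up.
/Distance/!{
    N
    bA
}
# Save the finished block, fetch the next line and keep only that line.
# With GNU sed, N prints the pattern space and exits at end of input.
h
N
s/.*\n//
bB
' "$1"
}

# Reduced stand-in for the real output file.
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
noise
Input orientation:
1 6 0 Incorrect
Distance matrix (angstroms):
1 C 0.000000
Input orientation:
1 6 0 Correct
Distance matrix (angstroms):
1 C 0.000000
EOF

last_block_notac "$workdir/sample.txt"
```

Because the patterns are the short forms Input and Distance, any other line containing those words would also be treated as a marker; spelling out the full marker text is a safer choice on real output.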
Answer 2 (score: 0)
An alternative with awk, without tac:
$ awk '/Input orientation/ {f=1}
f {a=a sep $0; sep=ORS}
/Distance matrix/ {f=0; b=a; a=sep=""}
END {print b}' file
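After each end marker, the buffered block is copied to b and the buffer is reset; the END rule then prints the last block that was saved.
Answer 3 (score: 0)
This might work for you (GNU sed):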
sed '/Input orientation/h;//!H;$!d;x;s/^\(Input orientation.*Distance matrix[^\n]*\).*/\1/p;d' file
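Each time Input orientation occurs, the hold space (HS) is overwritten with the current line; the lines that follow are appended to it, and every line is deleted without being printed. At the end of the file, swap to the HS, remove the lines after the Distance matrix line, and print.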
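The steps just described can be laid out one command per line. This is a sketch only, assuming GNU sed (for \n inside the bracket expression); the function name last_block_hold and the reduced sample are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the hold-space script.
last_block_hold() {
    sed '
    # A start marker overwrites the hold space, dropping any earlier block.
    /Input orientation/h
    # Every other line is appended to the hold space
    # (the empty regex reuses the previous one).
    //!H
    # Nothing is printed until the last line has been read.
    $!d
    # At end of input, fetch the collected text ...
    x
    # ... cut everything after the end-marker line and print the result.
    s/^\(Input orientation.*Distance matrix[^\n]*\).*/\1/p
    d
    ' "$1"
}

# Reduced stand-in for the real output file.
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
noise
Input orientation:
1 6 0 Incorrect
Distance matrix (angstroms):
1 C 0.000000
Input orientation:
1 6 0 Correct
Distance matrix (angstroms):
1 C 0.000000
EOF

last_block_hold "$workdir/sample.txt"
```

Note that the ^ anchor expects the marker at the very start of its line; if the marker lines in the real file are indented, the pattern would need adjusting.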
An alternative along the same lines, but possibly less memory-intensive:
sed '/Input orientation/h;//!{x;/./G;x};$!d;x;s/\(Distance matrix[^\n]*\).*/\1/p;d' file
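As far as can be told from the script, the saving comes from the x;/./G;x sequence: a line is appended to the hold space only when the hold space is already non-empty, so nothing before the first start marker is stored. The sketch below spells this out, assuming GNU sed; the function name last_block_lean and the reduced sample are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the second hold-space script.
last_block_lean() {
    sed '
    # A start marker overwrites the hold space, dropping any earlier block.
    /Input orientation/h
    # For every other line: swap, append the line only if a block has
    # already been started (hold space non-empty), then swap back.
    //!{
        x
        /./G
        x
    }
    # Nothing is printed until the last line has been read.
    $!d
    # At end of input, fetch the block, cut everything after the
    # end-marker line and print the result.
    x
    s/\(Distance matrix[^\n]*\).*/\1/p
    d
    ' "$1"
}

# Reduced stand-in for the real output file.
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
noise
Input orientation:
1 6 0 Incorrect
Distance matrix (angstroms):
1 C 0.000000
Input orientation:
1 6 0 Correct
Distance matrix (angstroms):
1 C 0.000000
EOF

last_block_lean "$workdir/sample.txt"
```

Both variants still hold everything from the last start marker to the end of the file in memory, so the difference only matters when a lot of text precedes the first marker.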