All I want to do is join 2 files, as in the following example:
file 1    file 2
C1        O1
C3        O3
..        O5
          O7
          O9
          O11
          O13
          O15
          O17
          O19
          ..
The desired output file is:
file 3
C1
O1
O9
O17
C3
O3
O11
O19
..
..
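So the pattern is: first C1 with O1, then skip 3 lines in file 2 (so, print O9); then skip another 3 lines in file 2 (so, print O17). Then print C3 with O3, skip 3 lines in file 2 (O11), skip another 3 lines (O19); then C5 ... and so on.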
I tried to do something with cat | paste - - - ..., but it doesn't work :(
Any suggestions?
Thanks a lot in advance.
EDIT
I forgot to tell you that they are big files. :)
Here are my input files:
cat file 1
C 18 -2.182951850 -0.000000000 -6.517815410
C 20 -4.127401075 0.000000000 -0.446529291
C 22 -3.314258919 -2.494999886 -15.624910016
C 24 -6.071850300 0.000000000 5.624757806
C 26 -2.023950100 0.000000000 5.624757806
C 28 -4.286402584 -0.000000000 -12.589102506
C 30 -6.230851809 -0.000000000 -6.517815410
C 32 -0.079500634 0.000000000 -0.446529291
cat file 2
O 34 -1.393125174 -0.640765928 -5.738276269
O 36 -3.337574640 -0.640765928 0.333010828
O 38 -2.524270589 1.854234106 -14.845370570
O 40 -5.282024106 -0.640765928 6.404297925
O 42 -2.182951850 1.281531856 -6.517815410
O 44 -4.127401075 1.281531856 -0.446529291
O 46 -3.314258919 -1.213468178 -15.624910016
O 48 -6.071850300 1.281531856 5.624757806
O 50 -2.972778044 -0.640765928 -7.297355528
O 52 -4.917227269 -0.640765928 -1.226068432
O 54 -4.104085113 1.854234106 -16.404449463
O 56 -6.861676614 -0.640765928 4.845217687
O 58 -2.813776294 0.640765779 4.845217687
O 60 -5.076228778 0.640765779 -13.368642136
O 62 -7.020678123 0.640765779 -7.297355528
O 64 -0.869326828 0.640765779 -1.226068432
O 66 -2.023950100 -1.281531708 5.624757806
O 68 -4.286402584 -1.281531708 -12.589102506
O 70 -6.230851809 -1.281531708 -6.517815410
O 72 -0.079500634 -1.281531708 -0.446529291
O 74 -1.234123906 0.640765779 6.404297925
O 76 -3.496576390 0.640765779 -11.809563365
O 78 -5.441025615 0.640765779 -5.738276269
O 80 0.710325077 0.640765779 0.333010828
C18 must be followed by O34, O42 and O50. Then C20 is followed by O36, O44 and O52, and so on:
cat file 3
C 18 -2.182951850 -0.000000000 -6.517815410
O 34 -1.393125174 -0.640765928 -5.738276269
O 42 -2.182951850 1.281531856 -6.517815410
O 50 -2.972778044 -0.640765928 -7.297355528
C 20 -4.127401075 0.000000000 -0.446529291
O 36 -3.337574640 -0.640765928 0.333010828
O 44 -4.127401075 1.281531856 -0.446529291
O 52 -4.917227269 -0.640765928 -1.226068432
.. .. ............ ............. .........
The output produced by Tom's code is:
Tom output
C 18 -2.182951850 -0.000000000 -6.517815410
O 34 -1.393125174 -0.640765928 -5.738276269
O 42 -2.182951850 1.281531856 -6.517815410
O 50 -2.972778044 -0.640765928 -7.297355528
O 58 -2.813776294 0.640765779 4.845217687
O 66 -2.023950100 -1.281531708 5.624757806
O 74 -1.234123906 0.640765779 6.404297925
C 20 -4.127401075 0.000000000 -0.446529291
O 36 -3.337574640 -0.640765928 0.333010828
O 44 -4.127401075 1.281531856 -0.446529291
O 52 -4.917227269 -0.640765928 -1.226068432
O 60 -5.076228778 0.640765779 -13.368642136
O 68 -4.286402584 -1.281531708 -12.589102506
O 76 -3.496576390 0.640765779 -11.809563365
C 22 -3.314258919 -2.494999886 -15.624910016
O 38 -2.524270589 1.854234106 -14.845370570
O 46 -3.314258919 -1.213468178 -15.624910016
O 54 -4.104085113 1.854234106 -16.404449463
O 62 -7.020678123 0.640765779 -7.297355528
O 70 -6.230851809 -1.281531708 -6.517815410
O 78 -5.441025615 0.640765779 -5.738276269
and so on
Any suggestions?
Thanks
Answer 0: (score: 2)
I would suggest using awk to do this:
# first file
NR == FNR {
    a[NR] = $0    # save each line into array
    ++len
    next          # skip further blocks
}

{ b[FNR] = $0 }   # save each line from 2nd file into array

END {
    # loop through and print
    for (i = 1; i <= len; ++i) {
        print a[i]
        for (j = i; j <= FNR; j += 4) print b[j]
    }
}
The script can be run like awk -f script.awk file1 file2.
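Since the output shown in the question's edit has more O lines per C line than wanted, here is a minimal sketch of how the inner loop could be capped at exactly three. It rests on an assumption that the question does not confirm: C lines are handled in groups of four, and each group owns the next twelve O lines. The stand-in files and the names tmp, base, k and j are made up for illustration.

```shell
# Stand-in inputs (hypothetical data): 8 "C" lines and 24 "O" lines.
# The number in each name is simply that line's position in its file.
tmp=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 8;  ++i) print "C" i }' > "$tmp/file1"
awk 'BEGIN { for (i = 1; i <= 24; ++i) print "O" i }' > "$tmp/file2"

awk '
NR == FNR { a[++len] = $0; next }   # first file: save the C lines
{ b[FNR] = $0 }                     # second file: save the O lines
END {
    for (i = 1; i <= len; ++i) {
        print a[i]
        # Assumption: groups of 4 C lines, each group owning 12 O lines.
        base = int((i - 1) / 4) * 12 + (i - 1) % 4 + 1
        for (k = 0; k < 3; ++k) {   # exactly three O lines, 4 apart
            j = base + 4 * k
            if (j in b) print b[j]  # skip silently if file 2 is too short
        }
    }
}
' "$tmp/file1" "$tmp/file2" > "$tmp/file3"

cat "$tmp/file3"
```

For the first four C lines this picks the O lines at positions i, i+4 and i+8, which matches the file 3 excerpt in the question. What should happen from the fifth C line on is a guess, so the base expression is the part to adjust if the real pattern differs.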
Answer 1: (score: 1)
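What you've described (as confirmed in the comments) is a pattern.
To handle this, I would use awk with a 9-line "sliding window" as a buffer.
Rather than using Tom's solution, which feeds the two files to awk one after the other and reads them into arrays, I would suggest reading from both files at the same time, so that you don't take up as much memory holding the arrays.
Here's what I mean, as a one-liner: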
awk '{a[NR]=$0;delete a[NR-10];} NR>9{getline Cline < "fileC";print Cline;print a[NR-9]; print a[NR-5]; print a[NR-1];}' fileO
Broken out for easier reading (and commenting), this looks like:
awk '
  {
    a[NR]=$0;                  # Store our current "O" line in an array
    delete a[NR-10];           # Clean the array as we step through the file
  }
  NR>9 {
    getline Cline < "fileC";   # Get the next "C" line...
    print Cline;               # ... and print it
    print a[NR-9];             # \
    print a[NR-5];             #  > Print the three "O" lines for this
    print a[NR-1];             # /
  }
' fileO
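Note that you need to have the right number of "O" lines, because the last set of "O" lines will not be printed if it is incomplete.
The output with my sample data looks like this: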
C 18 -2.182951850 -0.000000000 -6.517815410
O 34 -1.393125174 -0.640765928 -5.738276269
O 42 -2.182951850 1.281531856 -6.517815410
O 50 -2.972778044 -0.640765928 -7.297355528
C 20 -4.127401075 0.000000000 -0.446529291
O 36 -3.337574640 -0.640765928 0.333010828
O 44 -4.127401075 1.281531856 -0.446529291
O 52 -4.917227269 -0.640765928 -1.226068432
C 22 -3.314258919 -2.494999886 -15.624910016
O 38 -2.524270589 1.854234106 -14.845370570
O 46 -3.314258919 -1.213468178 -15.624910016
O 54 -4.104085113 1.854234106 -16.404449463
C 24 -6.071850300 0.000000000 5.624757806
O 40 -5.282024106 -0.640765928 6.404297925
O 48 -6.071850300 1.281531856 5.624757806
O 56 -6.861676614 -0.640765928 4.845217687
C 26 -2.023950100 0.000000000 5.624757806
O 42 -2.182951850 1.281531856 -6.517815410
O 50 -2.972778044 -0.640765928 -7.297355528
O 58 -2.813776294 0.640765779 4.845217687
C 28 -4.286402584 -0.000000000 -12.589102506
O 44 -4.127401075 1.281531856 -0.446529291
O 52 -4.917227269 -0.640765928 -1.226068432
O 60 -5.076228778 0.640765779 -13.368642136
C 30 -6.230851809 -0.000000000 -6.517815410
O 46 -3.314258919 -1.213468178 -15.624910016
O 54 -4.104085113 1.854234106 -16.404449463
O 62 -7.020678123 0.640765779 -7.297355528
C 32 -0.079500634 0.000000000 -0.446529291
O 48 -6.071850300 1.281531856 5.624757806
O 56 -6.861676614 -0.640765928 4.845217687
O 64 -0.869326828 0.640765779 -1.226068432
C 32 -0.079500634 0.000000000 -0.446529291
O 50 -2.972778044 -0.640765928 -7.297355528
O 58 -2.813776294 0.640765779 4.845217687
O 66 -2.023950100 -1.281531708 5.624757806
C 32 -0.079500634 0.000000000 -0.446529291
O 52 -4.917227269 -0.640765928 -1.226068432
O 60 -5.076228778 0.640765779 -13.368642136
O 68 -4.286402584 -1.281531708 -12.589102506
C 32 -0.079500634 0.000000000 -0.446529291
O 54 -4.104085113 1.854234106 -16.404449463
O 62 -7.020678123 0.640765779 -7.297355528
O 70 -6.230851809 -1.281531708 -6.517815410
C 32 -0.079500634 0.000000000 -0.446529291
O 56 -6.861676614 -0.640765928 4.845217687
O 64 -0.869326828 0.640765779 -1.226068432
O 72 -0.079500634 -1.281531708 -0.446529291
C 32 -0.079500634 0.000000000 -0.446529291
O 58 -2.813776294 0.640765779 4.845217687
O 66 -2.023950100 -1.281531708 5.624757806
O 74 -1.234123906 0.640765779 6.404297925
C 32 -0.079500634 0.000000000 -0.446529291
O 60 -5.076228778 0.640765779 -13.368642136
O 68 -4.286402584 -1.281531708 -12.589102506
O 76 -3.496576390 0.640765779 -11.809563365
C 32 -0.079500634 0.000000000 -0.446529291
O 62 -7.020678123 0.640765779 -7.297355528
O 70 -6.230851809 -1.281531708 -6.517815410
O 78 -5.441025615 0.640765779 -5.738276269
Is this what you meant?
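The repeated C 32 lines at the end of the output above are consistent with getline running off the end of fileC: at end of file it returns 0 and leaves Cline unchanged, so the last value gets printed again. Below is a minimal sketch of the same sliding window that checks getline's return value and stops once the "C" lines run out. The stand-in files and the variable names tmp and cfile are made up for illustration, and the window logic is otherwise unchanged.

```shell
# Stand-in inputs (hypothetical data): 2 "C" lines and 12 "O" lines,
# so there are deliberately more windows than "C" lines.
tmp=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 2;  ++i) print "C" i }' > "$tmp/fileC"
awk 'BEGIN { for (i = 1; i <= 12; ++i) print "O" i }' > "$tmp/fileO"

awk -v cfile="$tmp/fileC" '
{
    a[NR] = $0           # store the current "O" line
    delete a[NR - 10]    # drop lines that have left the window
}
NR > 9 {
    # getline returns 1 on success, 0 at end of file, -1 on error
    if ((getline Cline < cfile) <= 0) exit
    print Cline
    print a[NR - 9]
    print a[NR - 5]
    print a[NR - 1]
}
' "$tmp/fileO" > "$tmp/out"

cat "$tmp/out"
```

With these inputs the third window finds no "C" line left and the script exits instead of printing C2 a second time. Passing the file name in with -v rather than hard-coding "fileC" is only a convenience for the sketch.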