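I have 3 files as shown below. All 3 files have the same number of columns and rows (more than several hundred). What I want is this: find the col/row positions where the numbers in File1 and File2 fall within specific ranges, then keep the number in File3 at the same index and set all other numbers to "0". For example: in File1 and File2, only the numbers at col2/row2 satisfy the criteria (0 < 88 < 100, 0 < 6 < 10), so keep the number 8 from File3 and assign "0" to all other numbers. Is it possible to do this with awk? Or Python? Thanks.
File1: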
-10 -10 9
-20 88 106
-30 300 120
File2:
-6 0 -7
-5 6 1
-2 18 32
File3:
4 3 5
6 8 8
10 23 14
Output:
0 0 0
0 8 0
0 0 0
Answer 0 (score: 1)
The following awk should help.
awk '
FNR==1 { count++ }                       ##FNR resets to 1 at the start of each file, so count is the index of the current file.
count==1 {                               ##File1: mark every cell whose value lies strictly between 0 and 100.
  for(i=1;i<=NF;i++) {
    if($i>0 && $i<100) { a[FNR,i]++ }    ##a[row,column] counts how many range checks this cell has passed.
  }
}
count==2 {                               ##File2: mark every cell whose value lies strictly between 0 and 10.
  for(i=1;i<=NF;i++) {
    if($i>0 && $i<10) { a[FNR,i]++ }
  }
}
count==3 {                               ##File3: keep a value only where both checks passed (counter is 2), else set it to 0.
  for(j=1;j<=NF;j++) { $j=a[FNR,j]==2?$j:0 }
  print                                  ##Print the edited line.
}
' File1 File2 File3                      ##The three input files, in this order.
The output is as follows.
0 0 0
0 8 0
0 0 0
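For readers more comfortable with Python, the same counting idea can be sketched without any libraries. This is only an illustration and not part of the answer above: the function name `mask_by_ranges`, its `range1`/`range2` parameters, and the inline sample rows (copied from the question) are assumptions made for the example.

```python
def mask_by_ranges(rows1, rows2, rows3, range1=(0, 100), range2=(0, 10)):
    """Keep rows3[r][c] only where rows1 and rows2 both lie strictly inside their ranges."""
    result = []
    for r1, r2, r3 in zip(rows1, rows2, rows3):
        out_row = []
        for v1, v2, v3 in zip(r1, r2, r3):
            hits = 0  # plays the role of a[FNR,i] in the awk script
            if range1[0] < v1 < range1[1]:
                hits += 1
            if range2[0] < v2 < range2[1]:
                hits += 1
            # Keep the File3 value only if both checks passed.
            out_row.append(v3 if hits == 2 else 0)
        result.append(out_row)
    return result


# Sample data from the question (hypothetical in-memory stand-ins for the files).
file1 = [[-10, -10, 9], [-20, 88, 106], [-30, 300, 120]]
file2 = [[-6, 0, -7], [-5, 6, 1], [-2, 18, 32]]
file3 = [[4, 3, 5], [6, 8, 8], [10, 23, 14]]

for row in mask_by_ranges(file1, file2, file3):
    print(*row)
```

With the sample data, only the cell at row 2, column 2 passes both checks, so only the 8 survives.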
Answer 1 (score: 1)
You already have a great awk answer.
Here is how to do this in Python using numpy.
First, read the files:
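import numpy as np

arrays = []
# Adjust the names if your files are called File1, File2, File3.
for fn in ('file1', 'file2', 'file3'):
    with open(fn) as f:
        # Split each line on whitespace and convert the values to floats.
        arrays.append(np.array([line.split() for line in f], dtype=float))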
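As a possible alternative to the manual loop, `np.loadtxt` can parse whitespace-separated numeric text directly. The sketch below is an assumption-laden illustration, not the answer's method: it uses `io.StringIO` objects (the hypothetical `sample_files` list) filled with the question's data so that it runs without any files on disk.

```python
import io

import numpy as np

# Stand-ins for the three files on disk (assumption: whitespace-separated
# numbers, same number of columns on every line). With real files, pass the
# path instead, e.g. np.loadtxt("File1").
sample_files = [
    io.StringIO("-10 -10 9\n-20 88 106\n-30 300 120\n"),
    io.StringIO("-6 0 -7\n-5 6 1\n-2 18 32\n"),
    io.StringIO("4 3 5\n6 8 8\n10 23 14\n"),
]

# ndmin=2 keeps the result two-dimensional even if a file has a single row.
arrays = [np.loadtxt(f, dtype=float, ndmin=2) for f in sample_files]

print(arrays[0].shape)
```

Either way of reading gives a list of three float arrays with the same shape, which is what the masking step relies on.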
Then create a mask matrix that selects the cells meeting the required conditions:
mask=(arrays[0]>0) & (arrays[0]<100) & (arrays[1]>0) & (arrays[1]<10)
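To see how the mask is built, it can help to look at the two range checks separately. The sketch below hard-codes the question's File1 and File2 data; the names `file1`, `file2`, `in_range1`, and `in_range2` are introduced here purely for illustration. The parentheses around each comparison matter because `&` binds more tightly than `>` and `<` in Python.

```python
import numpy as np

# Sample data from the question, hard-coded for illustration.
file1 = np.array([[-10, -10, 9], [-20, 88, 106], [-30, 300, 120]], dtype=float)
file2 = np.array([[-6, 0, -7], [-5, 6, 1], [-2, 18, 32]], dtype=float)

# Element-wise comparisons produce boolean arrays of the same shape.
in_range1 = (file1 > 0) & (file1 < 100)  # strictly between 0 and 100
in_range2 = (file2 > 0) & (file2 < 10)   # strictly between 0 and 10

# A cell is selected only when both checks are True.
mask = in_range1 & in_range2

print(in_range1.astype(int))
print(in_range2.astype(int))
print(mask.astype(int))
```

The 9 in File1 passes its own check, but the matching -7 in File2 does not, so that cell is excluded. Only row 2, column 2 passes both.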
Then multiply the third array (arrays[2] is the third file) by the mask:
>>> print(arrays[2] * mask.astype(float))
[[0. 0. 0.]
[0. 8. 0.]
[0. 0. 0.]]
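A variation worth considering is `np.where`, which selects values instead of multiplying them. Multiplying by 0.0 turns negative numbers into -0.0 and turns NaN or infinite values into NaN, whereas `np.where` simply substitutes 0. The sketch below is a hedged illustration with hard-coded sample data; writing the result with `np.savetxt` to an in-memory buffer is an assumption about how the output might be saved, not something the original answer does.

```python
import io

import numpy as np

# File3 data from the question, plus the mask that the sample data produces.
file3 = np.array([[4, 3, 5], [6, 8, 8], [10, 23, 14]], dtype=float)
mask = np.zeros((3, 3), dtype=bool)
mask[1, 1] = True  # the only cell that passed both range checks

# np.where takes values from file3 where mask is True and uses 0 elsewhere.
result = np.where(mask, file3, 0)

# Write the result as whitespace-separated integers. An in-memory buffer is
# used so the sketch runs without touching the disk; pass a filename for real use.
buffer = io.StringIO()
np.savetxt(buffer, result, fmt="%d")
print(buffer.getvalue(), end="")
```

The `fmt="%d"` argument assumes File3 holds whole numbers, as in the example; for fractional data a format such as `"%g"` would be a more suitable choice.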