I have the following data frame:
genome1 genome2 percent1 percent2 ANI
0 GCA_000969925 GCF_000969885 0.10 0.35 99.095220
1 GCA_000969925 GCF_000969885 0.10 0.40 98.495660
2 GCA_000969925 GCF_000969885 0.10 0.45 98.669918
3 GCA_000969925 GCF_000969885 0.10 0.50 98.500838
4 GCA_000969925 GCF_000969885 0.10 0.55 98.601606
5 GCA_000969925 GCF_000969885 0.10 0.60 98.722015
6 GCA_000969925 GCF_000969885 0.10 0.65 98.730725
7 GCA_000969925 GCF_000969885 0.10 0.70 98.682221
8 GCA_000969925 GCF_000969885 0.10 0.75 99.094596
9 GCA_000969925 GCF_000969885 0.10 0.80 98.828468
10 GCA_000969925 GCF_000969885 0.10 0.85 99.090124
11 GCA_000969925 GCF_000969885 0.10 0.90 99.112024
12 GCA_000969925 GCF_000969885 0.10 0.95 99.192380
13 GCA_000969925 GCF_000969885 0.10 1.00 99.376812
14 GCA_000969925 GCF_000969885 0.15 0.20 97.009700
15 GCA_000969925 GCF_000969885 0.15 0.25 97.244750
16 GCA_000969925 GCF_000969885 0.15 0.30 97.904578
17 GCA_000969925 GCF_000969885 0.15 0.35 98.077026
18 GCA_000969925 GCF_000969885 0.15 0.40 98.348245
19 GCA_000969925 GCF_000969885 0.15 0.45 98.280009
20 GCA_000969925 GCF_000969885 0.15 0.50 98.434687
21 GCA_000969925 GCF_000969885 0.15 0.55 98.491112
22 GCA_000969925 GCF_000969885 0.15 0.60 98.547068
23 GCA_000969925 GCF_000969885 0.15 0.65 98.650248
24 GCA_000969925 GCF_000969885 0.15 0.70 98.878492
25 GCA_000969925 GCF_000969885 0.15 0.75 98.897532
26 GCA_000969925 GCF_000969885 0.15 0.80 98.985684
27 GCA_000969925 GCF_000969885 0.15 0.85 99.038040
28 GCA_000969925 GCF_000969885 0.15 0.90 99.233776
29 GCA_000969925 GCF_000969885 0.15 0.95 99.245952
30 GCA_000969925 GCF_000969885 0.15 1.00 99.397208
31 GCA_000969925 GCF_000969885 0.20 0.15 97.806960
32 GCA_000969925 GCF_000969885 0.20 0.20 98.707025
33 GCA_000969925 GCF_000969885 0.20 0.25 98.452431
34 GCA_000969925 GCF_000969885 0.20 0.30 97.922750
35 GCA_000969925 GCF_000969885 0.20 0.35 98.095552
36 GCA_000969925 GCF_000969885 0.20 0.40 97.911546
37 GCA_000969925 GCF_000969885 0.20 0.45 98.189728
38 GCA_000969925 GCF_000969885 0.20 0.50 98.340308
39 GCA_000969925 GCF_000969885 0.20 0.55 98.437920
40 GCA_000969925 GCF_000969885 0.20 0.60 98.542968
41 GCA_000969925 GCF_000969885 0.20 0.65 98.847280
42 GCA_000969925 GCF_000969885 0.20 0.70 98.859392
43 GCA_000969925 GCF_000969885 0.20 0.75 98.952128
44 GCA_000969925 GCF_000969885 0.20 0.80 98.908232
45 GCA_000969925 GCF_000969885 0.20 0.85 99.173704
46 GCA_000969925 GCF_000969885 0.20 0.90 99.265132
47 GCA_000969925 GCF_000969885 0.20 0.95 99.287808
48 GCA_000969925 GCF_000969885 0.20 1.00 99.408728
49 GCA_000969925 GCF_000969885 0.25 0.10 98.555500
50 GCA_000969925 GCF_000969885 0.25 0.15 98.502583
51 GCA_000969925 GCF_000969885 0.25 0.20 98.082729
52 GCA_000969925 GCF_000969885 0.25 0.25 98.075705
53 GCA_000969925 GCF_000969885 0.25 0.30 98.105842
54 GCA_000969925 GCF_000969885 0.25 0.35 98.145436
55 GCA_000969925 GCF_000969885 0.25 0.40 98.605504
56 GCA_000969925 GCF_000969885 0.25 0.45 98.638808
57 GCA_000969925 GCF_000969885 0.25 0.50 98.548068
58 GCA_000969925 GCF_000969885 0.25 0.55 98.717488
59 GCA_000969925 GCF_000969885 0.25 0.60 98.949116
... ... ... ... ...
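I want to create a matrix whose rows and columns are the percentages from 0.05 to 1.00 in steps of 0.05, and fill each cell with the percentage of genomes whose ANI value is > 95.
I tried iterating with two nested loops, as follows: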
import pandas as pd
import numpy as np

results = pd.read_csv(...)
matrix = np.zeros((20, 20))
m_i=0
m_j=0
for i in np.arange(0.05, 1.01, 0.05):
    for j in np.arange(0.05, 1.01, 0.05):
        #print(i, j)
        df=results[(results.percent1 == i) & (results.percent2 == j)]
        if not df.empty:
            above=len(df.genome1[df.ANI > 95].unique())
            total=len(df.genome1.unique())
            per_above_95=above/total*100
            matrix[i][j]=per_above_95
            m_j += 1
        else:
            continue
    m_i += 1
print(matrix)
Unfortunately, the matrix I get has 100 in its first 2x2 values and 0 everywhere else.
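One thing that may contribute to this, offered as a guess rather than a confirmed diagnosis: `np.arange` with a float step produces values such as 0.15000000000000002, so an exact `==` against the 0.15 read from the file can fail. A minimal check that does not depend on the data above (the name `grid` is only for illustration):

```python
import numpy as np

# Same grid as in the loops above.
grid = np.arange(0.05, 1.01, 0.05)

# The third value is computed as 0.05 + 2 * 0.05 in floating point,
# which is not exactly the double closest to 0.15.
print(repr(grid[2]))

# Exact comparison versus tolerance-based comparison.
exact = bool(grid[2] == 0.15)
close = bool(np.isclose(grid[2], 0.15))
print(exact, close)

# Rounding the grid to two decimals gives values that compare equal
# to literals such as 0.15.
rounded = np.round(grid, 2)
print(bool(rounded[2] == 0.15))
```

If that is the cause, comparing with `np.isclose` or rounding both sides before comparing would be one way around it.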
I would like a 20x20 matrix like this:
      0.05  0.10  0.15  ...
0.05     0    10    25
0.10    10    33    45
0.15    25    45    66
...
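For reference, here is a rough sketch of one way such a matrix might be built without explicit loops, using `groupby` and `unstack`. It assumes the column names shown in the data frame above, assumes that (percent1, percent2) pairs missing from the data should be 0, and converts the percentages to integer keys (5, 10, ..., 100) to avoid exact float comparisons. The function name `ani_matrix` and the toy `demo` data are made up for illustration and are not real results.

```python
import pandas as pd


def ani_matrix(results, threshold=95.0):
    """Percentage of distinct genome1 values with ANI > threshold
    for each (percent1, percent2) pair, as a 20x20 DataFrame."""
    # Integer keys sidestep exact comparisons between floats.
    keys = pd.DataFrame({
        "p1": (results["percent1"] * 100).round().astype(int),
        "p2": (results["percent2"] * 100).round().astype(int),
        "genome1": results["genome1"],
        "above": results["ANI"] > threshold,
    })
    # Distinct genomes per cell, in total and above the threshold.
    total = keys.groupby(["p1", "p2"])["genome1"].nunique()
    above = keys[keys["above"]].groupby(["p1", "p2"])["genome1"].nunique()
    pct = 100.0 * above.reindex(total.index, fill_value=0) / total

    # Spread p2 across the columns and force the full 20x20 grid.
    steps = list(range(5, 101, 5))
    matrix = pct.unstack("p2").reindex(index=steps, columns=steps).fillna(0.0)
    matrix.index = [s / 100 for s in steps]
    matrix.columns = [s / 100 for s in steps]
    return matrix


# Toy data with made-up genome names, only to show the call.
demo = pd.DataFrame({
    "genome1": ["A", "B", "A", "B"],
    "genome2": ["X", "X", "X", "X"],
    "percent1": [0.10, 0.10, 0.15, 0.15],
    "percent2": [0.35, 0.35, 0.20, 0.20],
    "ANI": [99.1, 94.0, 97.0, 96.5],
})
matrix = ani_matrix(demo)
print(matrix.shape)
# One of the two toy genomes exceeds 95 in this cell.
print(matrix.loc[0.10, 0.35])
```

With the real data the call would be `ani_matrix(results)`. Whether counting distinct `genome1` values is the right denominator depends on the data, so treat that part as an assumption carried over from the loop version.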