Creating a presence/absence matrix from two files using Python

Time: 2015-08-14 17:31:37

Tags: python

I have the following two files:

(A):

A01
A02
A03
A10
A11
C03
C10
C11
E01
E10
E11
H01
H02
H10
H11
Y09
Y10
Y11

(B):

E01  Y09  A02
A01  A03
C03  H01  H02
E10
Y10
A10
C10
H10
E11  A11  C11  H11  Y11

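I am trying to create a presence/absence matrix from these data, to see whether each value in (A) is present in each line of (B). A value that is present should be represented by "1" and one that is absent by "0", with the "0" and "1" indicators following the order of the values in (A).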

My expected output is:

0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0
1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1

I tried the following:

text_file = open("Table", "w")
a = file("list", "r")
b = file("cluster", "r")

for line in a:
    words = line.split("\n")
for line in b:
    words = line.split("\t")


for line in a:
    if words in a == words in b:
        print("1")
    elif words in a != words in b:
        print("0")
text_file.close

However, this does not print anything.

Can anyone help?

1 answer:

Answer 0: (score: 2)

I think I understand what you mean.

final_matrix = []

# Make a list of all strings in the first file.
with open("list", "r") as a:
    a_list = [line.strip() for line in a if line.strip()]

with open("cluster", "r") as b:
    for line in b:
        # split() with no argument splits on any whitespace and drops the
        # trailing newline, so the last ID on each line still matches.
        ids = set(line.split())
        # Make a presence/absence row for each line in the second file.
        this_row = [1 if i in ids else 0 for i in a_list]
        final_matrix.append(this_row)

for row in final_matrix:
    # Print each row as space-separated 0/1 values.
    print(" ".join(str(v) for v in row))
In this case, the final matrix is stored as a list of lists.
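To try the idea without the files on disk, the same logic can be wrapped in a small function and fed in-memory data. This is only a sketch: the function name `presence_absence` and the shortened sample lists are made up for illustration, and it assumes the IDs in each line of (B) are separated by whitespace (tabs or spaces).

```python
def presence_absence(ids, cluster_lines):
    """Return one 0/1 row per cluster line; columns follow the order of ids."""
    matrix = []
    for line in cluster_lines:
        # Splitting on any whitespace also discards the trailing newline.
        members = set(line.split())
        matrix.append([1 if i in members else 0 for i in ids])
    return matrix


# Hypothetical, shortened sample data standing in for files (A) and (B).
ids = ["A01", "A02", "A03", "E01", "Y09"]
cluster_lines = ["E01\tY09\tA02\n", "A01  A03\n"]

matrix = presence_absence(ids, cluster_lines)
for row in matrix:
    print(" ".join(map(str, row)))
# 0 1 0 1 1
# 1 0 1 0 0
```

Turning each line into a set keeps the membership test cheap when (A) is long. The rows come back as lists of integers, so they can be joined into strings for printing or written to an output file as needed.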