I have the following two files:
(A):
A01
A02
A03
A10
A11
C03
C10
C11
E01
E10
E11
H01
H02
H10
H11
Y09
Y10
Y11
and
(B):
E01 Y09 A02
A01 A03
C03 H01 H02
E10
Y10
A10
C10
H10
E11 A11 C11 H11 Y11
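I am trying to build a presence/absence matrix from this data, to see which values in (A) appear in each row of (B). A value that is present should be represented by "1" and a value that is absent by "0", with the "0" and "1" indicators following the order of the values in (A).
My expected output is: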
0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0
1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1
I tried the following:
text_file = open("Table", "w")
a = file("list", "r")
b = file("cluster", "r")
for line in a:
    words = line.split("\n")
for line in b:
    words = line.split("\t")
for line in a:
    if words in a == words in b:
        print("1")
    elif words in a != words in b:
        print("0")
text_file.close
However, this does not print anything.
Can anyone help?
Answer 0 (score: 2)
I think I understand what you mean.
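One likely reason the attempt prints nothing, assuming its loops run one after another rather than nested: a file object is an iterator, so once the first `for line in a:` has read to the end of the file, a later `for line in a:` has nothing left to yield and its body never runs. The sketch below illustrates this with `io.StringIO` standing in for the real file (the stand-in and the variable names are mine, not from the question).

```python
import io

# Stand-in for open("list"); io.StringIO iterates line by line like a text file.
a = io.StringIO("A01\nA02\nA03\n")

first_pass = [line.strip() for line in a]
second_pass = [line.strip() for line in a]  # the iterator is already at end-of-file

print(first_pass)   # ['A01', 'A02', 'A03']
print(second_pass)  # []

a.seek(0)  # rewind so the lines can be read again
third_pass = [line.strip() for line in a]
print(third_pass)   # ['A01', 'A02', 'A03']
```

Reading the lines into a list once and reusing that list avoids the problem.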
# Python 3. Make a list of all strings in the first file, in order.
with open("list") as a:
    a_list = [line.strip() for line in a if line.strip()]

final_matrix = []
with open("cluster") as b:
    for line in b:
        # split() with no argument splits on any whitespace (tabs or spaces)
        # and drops the trailing newline, so the last value on a line still matches.
        L1 = set(line.split())
        # Make a presence/absence row for each line in the second file.
        this_row = [1 if i in L1 else 0 for i in a_list]
        final_matrix.append(this_row)

for row in final_matrix:
    # Print each row as space-separated 0/1 values rather than as a list.
    print(" ".join(str(v) for v in row))
In this case, the final matrix is stored as a list of lists.
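If you want to reuse this or test it without the files on disk, the same idea can be wrapped in functions. This is a minimal sketch: `build_matrix` and `format_matrix` are hypothetical helper names of my own, and the short `labels` and `cluster_lines` samples are a cut-down version of the data in the question.

```python
def build_matrix(labels, cluster_lines):
    """Return one 0/1 row per cluster line, with columns ordered like `labels`."""
    matrix = []
    for line in cluster_lines:
        present = set(line.split())  # handles tabs, spaces and the trailing newline
        matrix.append([1 if label in present else 0 for label in labels])
    return matrix


def format_matrix(matrix):
    """Render the matrix as lines of space-separated 0/1 values."""
    return "\n".join(" ".join(str(v) for v in row) for row in matrix)


# Cut-down sample data (assumption: values are separated by tabs or spaces).
labels = ["A01", "A02", "A03", "E01", "Y09"]
cluster_lines = ["E01 Y09 A02\n", "A01\tA03\n"]

matrix = build_matrix(labels, cluster_lines)
text = format_matrix(matrix)
print(text)
# 0 1 0 1 1
# 1 0 1 0 0
```

To save the result instead of printing it, writing `text` to a file opened with `open("Table", "w")` inside a `with` block should be enough; the `with` block also closes the file for you.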