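I want to compare the contents of two files and produce a matrix in which a match scores "1" and a non-match scores "0". For example, aer23 from file1.txt is searched against every element of file2.txt, and each column records match or no match. In the output, the entries of file1.txt become the rows and the entries of file2.txt become the columns.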
FILE1.TXT:
aer23
aub1
fer4
qty1
sap89
xty32
FILE2.TXT:
fer4
xty32
aer23
aub1
sap89
qty1
Output:
        fer4  xty32  aer23  aub1  sap89  qty1
aer23   0     0      1      0     0      0
aub1    0     0      0      1     0      0
fer4    1     0      0      0     0      0
qty1    0     0      0      0     0      1
sap89   0     0      0      0     1      0
xty32   0     1      0      0     0      0
My code:
outfile=open("out.txt","w")
record=[]
for line in open("file2.txt","r"):
    record.append(line)
for line in open("file2.txt","r"):
    if line==iter(record):
        outfile.write("1","\t")
    else:
        outfile.write("0","\t")
    next
How can I make this code work? Thanks.
Answer 0 (score: 1)
I think what you are trying to do is:
outfile=open("out.txt","w")
# First you need to write the header row
outfile.write("\t")
for line2 in open("file2.txt","r"):
    outfile.write(line2.strip() + "\t")
outfile.write("\n")
# You never do anything useful with record, so don't build it
#record=[]
# Open file1 and file2, not file2 and file2, and don't reuse the name line
for line1 in open("file1.txt","r"):
    # You also need to write the header column
    outfile.write(line1.strip() + "\t")
    #record.append(line)
    for line2 in open("file2.txt","r"):
        # Don't try to compare the string to a list iterator, compare it
        # to the string from the other file. Strip both sides so a last
        # line without a trailing newline still matches.
        if line1.strip()==line2.strip():
            # You can't pass write multiple arguments like print, just
            # put the two strings together
            outfile.write("1\t")
        else:
            # Indentation matters in Python
            outfile.write("0\t")
        # next is a function that gets the next value from an iterator;
        # just referring to that function by name doesn't do anything
        #next
    # Don't forget to end each line
    outfile.write("\n")
# You should always close files, but _especially_ writable files
outfile.close()
This could be improved a lot, but it should be the smallest set of changes that gets you close to where you want to be.
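To see why the comparison in the original code could never succeed, here is a minimal sketch. The sample values are made up for illustration. It shows that a string is never equal to a list iterator, and that lines read from a file keep their trailing newline, so a final line without one will not match unless both sides are stripped.

```python
# Made-up lines as they would come from a file; the last one has no newline.
record = ["fer4\n", "xty32\n", "aer23"]

# A str is never equal to a list iterator, so this is always False.
always_false = ("fer4\n" == iter(record))

# Raw lines only match if both carry the same line ending.
raw_match = ("aer23\n" == record[2])

# Stripping both sides compares just the content.
stripped_match = ("aer23\n".strip() == record[2].strip())

print(always_false, raw_match, stripped_match)  # prints: False False True
```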
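Rather than walk you through every further change one at a time, let me show you how I would write it; you can look up all of the functions in the help: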
import csv
# Read the column labels once instead of reopening file2 for every row
with open('file2.txt') as file2:
    columns = [line.strip() for line in file2]
# newline='' lets the csv module control line endings (Python 3)
with open('file1.txt') as file1, open('out.txt', 'w', newline='') as outfile:
    writer = csv.writer(outfile, delimiter='\t')
    writer.writerow([''] + columns)
    for line in file1:
        row = line.strip()
        writer.writerow([row] + [1 if row==column else 0 for column in columns])
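If you want to try the same idea without touching the disk, one possible variation is to wrap it in a function that takes open file objects. This is only a sketch: the name `write_match_matrix`, the blank-line skipping, and the explicit `lineterminator='\n'` are my own assumptions, not part of the code above.

```python
import csv
import io


def write_match_matrix(rows_file, columns_file, out_file):
    """Write a tab-separated 0/1 matrix: one row per line of rows_file,
    one column per line of columns_file. (Hypothetical helper.)"""
    # Read the column labels once, ignoring blank lines.
    columns = [line.strip() for line in columns_file if line.strip()]
    writer = csv.writer(out_file, delimiter='\t', lineterminator='\n')
    # Header row: an empty corner cell followed by the column labels.
    writer.writerow([''] + columns)
    for line in rows_file:
        row = line.strip()
        if not row:
            continue  # skip blank lines
        writer.writerow([row] + [1 if row == column else 0 for column in columns])


# In-memory files standing in for file1.txt and file2.txt
file1 = io.StringIO("aer23\naub1\nfer4\n")
file2 = io.StringIO("fer4\nxty32\naer23\n")
out = io.StringIO()
write_match_matrix(file1, file2, out)
print(out.getvalue())
```

Because the function only needs objects that can be iterated line by line and written to, the same call works with real files opened via `open('file1.txt')`, `open('file2.txt')`, and `open('out.txt', 'w', newline='')`.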