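I want to compare two text files, each with three columns. One file has 999 rows and the other has 757. I want to store the 242 rows that differ in a separate file. I created the first file (999 rows) with a random network generator: each row is an edge, with the source node in the first column, the target node in the second, and the weight between them in the third.
File format (both files):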
1 3 1
16 36 1
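I have tried
Compare two files line by line and generate the difference in another file, find difference between two text files with one item per line, and http://www.daniweb.com/software-development/python/threads/124932/610058#post610058
None of them worked for me.
I think this is a string comparison problem. I want to compare the numbers in the first and second columns. If both are different, I want to write the row to a third file.
Any help is greatly appreciated!
Update
Below is the code I tried after @MK's comment.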
rwsncount = 0
f = open("results.txt", "w")
for line in open("100rwsnMore.txt"):
    rwsncount += 1
    line = line.split()
    src = line[0]
    dest = line[1]
    for row in open("100rwsnDeleted.txt"):
        row = row.split()
        s = row[0]
        d = row[1]
        if s != src and d != dest:
            f.write(str(s))
            f.write(' ')
            f.write(str(d))
            f.write('\n')
f.close()
Answer 0 (score: 7)
If you are on a *nix system, the best general-purpose option is simply:
sort filea fileb | uniq -u
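But if you need to use Python:
Your code reopens the inner file on every iteration over the outer file. Open it outside the loop.
Nested loops are less efficient than using a first loop to store the values it finds and then checking the values from the second file against that stored collection.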
def build_set(filename):
    # A set stores a collection of unique items. Both adding items and searching for them
    # are quick, so it's perfect for this application.
    found = set()
    with open(filename) as f:
        for line in f:
            # [:2] gives us the first two elements of the list.
            # Tuples, unlike lists, cannot be changed, which is a requirement for anything
            # being stored in a set.
            found.add(tuple(sorted(line.split()[:2])))
    return found

set_more = build_set('100rwsnMore.txt')
set_del = build_set('100rwsnDeleted.txt')

with open('results.txt', 'w') as out_file:
    # Using with to open files ensures that they are properly closed, even if the code
    # raises an exception.
    for res in (set_more - set_del):
        # The - computes the elements in set_more not in set_del.
        out_file.write(" ".join(res) + "\n")
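Note that the code above writes only the two node IDs, compares them as strings, and (because of sorted) treats an edge a b as the same as b a. If you would rather keep the weight column, compare the IDs as numbers, and treat edges as directed, a variant along the following lines might work. This is only a sketch under those assumptions; the helper names read_edges and write_missing_edges are made up for illustration.

```python
def read_edges(filename):
    """Map (source, target) -> original line for every edge in the file."""
    edges = {}
    with open(filename) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            # Compare node IDs as integers rather than strings (assumption:
            # the first two columns are always whole numbers).
            key = (int(parts[0]), int(parts[1]))
            edges[key] = line.rstrip("\n")
    return edges


def write_missing_edges(more_file, deleted_file, out_file):
    """Write the lines of more_file whose edge does not occur in deleted_file."""
    more = read_edges(more_file)
    deleted = read_edges(deleted_file)
    # Dicts keep insertion order, so the output follows the order of more_file.
    missing = [line for key, line in more.items() if key not in deleted]
    with open(out_file, "w") as out:
        for line in missing:
            out.write(line + "\n")
    return len(missing)
```

Calling write_missing_edges('100rwsnMore.txt', '100rwsnDeleted.txt', 'results.txt') would then write the full three-column rows and return how many were written. The dictionary lookup keeps the single-pass idea of the set version; the only real change is that each key remembers the line it came from.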