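So I have two files, file1 and file2, of unequal size, each with at least a million newline-separated lines. I want to match the contents of file1 against file2 and, wherever a line matches, delete that line from file1. For example: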
+------------+-----------+--------------------------+
| file1 | file2 | after processing - file1 |
+------------+-----------+--------------------------+
| google.com | in.com | google.com |
+------------+-----------+--------------------------+
| apple.com | quora.com | apple.com |
+------------+-----------+--------------------------+
| me.com | apple.com | |
+------------+-----------+--------------------------+
My code looks like this:
import fileinput

with open(file2) as fin:
    exclude = set(line.rstrip() for line in fin)
for line in fileinput.input(file1, inplace=True):
    if line.rstrip() not in exclude:
        print
        line,
But this just deletes everything in file1. How can I fix this?
Thanks.
Answer 0: (score: 2)
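Your print statement and its argument are on different lines, so print runs with no argument and the line itself is never written back to file1. Put them on a single line instead: print line, (in Python 2 the trailing comma suppresses the extra newline).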
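For readers on Python 3, where print is a function, the same fix might look like the sketch below. The function name remove_matching_lines and its parameters are illustrative and do not come from the question; the logic is the asker's, with the print call on one line.

```python
import fileinput


def remove_matching_lines(target_path, exclude_path):
    """Rewrite target_path in place, dropping lines that also appear in exclude_path."""
    # Load the lines to exclude into a set for fast membership tests.
    with open(exclude_path) as fin:
        exclude = {line.rstrip() for line in fin}

    # With inplace=True, standard output is redirected into target_path,
    # so whatever is printed becomes the new content of the file.
    with fileinput.input(target_path, inplace=True) as f:
        for line in f:
            if line.rstrip() not in exclude:
                # line already ends with a newline, so suppress print's own.
                print(line, end="")
```

Calling remove_matching_lines("file1", "file2") would then keep only the lines of file1 that do not occur in file2.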
Answer 1: (score: 0)
If working memory is not an issue, I'd suggest a crude solution: load file2 into memory, then iterate over file1 and write the lines that have no match in file2 to a temporary file, which then replaces file1:
import os
import shutil

FILE1 = "file1"  # path to file1
FILE2 = "file2"  # path to file2

# first load up FILE2 in the memory
with open(FILE2, "r") as f:  # open FILE2 for reading
    file2_lines = {line.rstrip() for line in f}  # use a set for FILE2 for fast matching

# open FILE1 for reading and a FILE1.tmp file for writing
with open(FILE1, "r") as f_in, open(FILE1 + ".tmp", "w") as f_out:
    for line in f_in:  # loop through the FILE1 lines
        if line.rstrip() not in file2_lines:  # no match in FILE2, keep the line
            f_out.write(line)

# finally, overwrite the FILE1 with temporary FILE1.tmp
os.remove(FILE1)
shutil.move(FILE1 + ".tmp", FILE1)
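Edit: Apparently fileinput.input() does pretty much the same thing, so your problem really was just a typo. Oh well, I'll leave the answer up for posterity, since it gives you more control over the whole process.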
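As one illustration of the extra control a temporary file gives, here is a minimal sketch, assuming Python 3.3 or later, that swaps the file in with os.replace and reports how many lines were dropped. The name filter_file and the returned count are hypothetical additions, not part of the answer above.

```python
import os
import tempfile


def filter_file(target_path, exclude_path):
    """Drop lines of target_path that appear in exclude_path; return how many were dropped."""
    with open(exclude_path) as f:
        exclude = {line.rstrip() for line in f}

    # Create the temporary file next to the target so the final rename
    # stays on the same filesystem.
    dir_name = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, text=True)
    removed = 0
    try:
        with os.fdopen(fd, "w") as f_out, open(target_path) as f_in:
            for line in f_in:
                if line.rstrip() in exclude:
                    removed += 1  # matched a line in exclude_path, skip it
                else:
                    f_out.write(line)
        # Replace the original in a single step; the original stays intact
        # if anything above raised.
        os.replace(tmp_path, target_path)
    except BaseException:
        os.unlink(tmp_path)  # clean up the partial temporary file
        raise
    return removed
```

Note that a file created by tempfile.mkstemp has restrictive permissions, so the original file's mode is not carried over; copy it explicitly if that matters.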