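I am trying to write code that copies all the duplicates in a file to a new file. The program I wrote checks the first 3 characters of each line and compares them with the next line.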
f=open(r'C:\Users\xamer\Desktop\file.txt','r')
data=f.readlines()
f.close()
lines=data.copy()
dup=open(r'C:\Users\xamer\Desktop\duplicate.txt','a')
for x in data:
    for y in data:
        if (y[0]==x[0]) and (y[1]==x[1]) and (y[2]==x[2]):
            lines.append(y)
        else:
            lines.remove(y)
dup.write(lines)
dup.close()
I get the following error:
Traceback (most recent call last):
  File "C:\Users\xamer\Desktop\file.py", line 80, in <module>
    lines.remove(y)
ValueError: list.remove(x): x not in list
Any suggestions?
Answer 0 (score: 0)
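These snippets should do what you asked for. At first I wanted to build a duplicated_lines list and write everything out at the end, but then I realized I could write each repeated item as soon as it is found, which avoids an extra final loop.
As another user pointed out, it is not clear whether you want to check only for adjacent duplicates or for duplicates regardless of position.
For the first case (a line repeated immediately after itself), this is the code: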
# open the source file
with open('hello.txt', 'r') as src:
    # readlines() returns a list of the original lines, newlines included
    data = src.readlines()
# create (or append to) the file that will hold the repeated lines
with open('duplicated.txt', 'a') as dst:
    for i in range(len(data) - 1):
        # strip the newline so a final line without one still compares equal
        current = data[i].rstrip('\n')
        if current == data[i + 1].rstrip('\n'):
            print("Line {}: {}".format(i, current))
            print("Line {}: {}".format(i + 1, current))
            # write the stripped line so the output contains no blank lines
            dst.write("%s\n" % current)
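The question compares only the first three characters of each line, while the snippet above compares whole lines. A minimal sketch of the adjacent check on a three-character prefix could look like the following. The function names, the `prefix_len` parameter, and the choice to open the output in write mode are illustrative assumptions, not part of the original code.

```python
def adjacent_prefix_duplicates(lines, prefix_len=3):
    """Return each line whose first `prefix_len` characters match the next line's.

    Hypothetical helper: the name and the `prefix_len` parameter are assumptions.
    """
    # Drop trailing newlines so a final line without one is treated the same way.
    stripped = [line.rstrip("\n") for line in lines]
    matches = []
    # zip pairs every line with the line that follows it.
    for current, following in zip(stripped, stripped[1:]):
        # Slicing never raises IndexError, even for lines shorter than the prefix.
        # Note that two consecutive empty lines also count as a match.
        if current[:prefix_len] == following[:prefix_len]:
            matches.append(current)
    return matches


def copy_adjacent_prefix_duplicates(src_path, dst_path, prefix_len=3):
    """Write the matching lines from src_path to dst_path (paths are placeholders)."""
    with open(src_path, "r") as src:
        data = src.readlines()
    with open(dst_path, "w") as dst:
        for line in adjacent_prefix_duplicates(data, prefix_len):
            dst.write(line + "\n")
```

Calling `copy_adjacent_prefix_duplicates('hello.txt', 'duplicated.txt')` would apply it to the same files used in this answer. Using slices instead of `y[0]`, `y[1]`, `y[2]` means short or empty lines do not raise an exception.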
If instead you want to check for duplicate lines anywhere in the file, this is the code:
# open the source file
with open('hello.txt', 'r') as src:
    # readlines() returns a list of the original lines, newlines included
    data = src.readlines()
# create (or append to) the file that will hold the repeated lines
with open('duplicated.txt', 'a') as dst:
    for i in range(len(data) - 1):
        current = data[i].rstrip('\n')
        for j in range(i + 1, len(data)):
            # strip the newline so a final line without one still compares equal
            if current == data[j].rstrip('\n'):
                print("Line {}: {}".format(i, current))
                print("Line {}: {}".format(j, current))
                # written once per matching pair, so a line that occurs
                # three times ends up in the output three times
                dst.write("%s\n" % current)
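The nested loop above compares every pair of lines, so its running time grows quadratically with the file size and a line that occurs more than twice is written several times. If each repeated line should appear only once in the output, one alternative is to count occurrences first. This is a sketch that swaps in `collections.Counter` from the standard library; the function names and the write mode are assumptions for illustration.

```python
from collections import Counter


def repeated_lines(lines):
    """Return every line that occurs more than once, in order of first appearance.

    Hypothetical helper: the name is an assumption, not part of the original answer.
    """
    # Drop trailing newlines so a final line without one still compares equal.
    stripped = [line.rstrip("\n") for line in lines]
    counts = Counter(stripped)
    seen = set()
    result = []
    for line in stripped:
        # Report a repeated line only the first time it is encountered.
        if counts[line] > 1 and line not in seen:
            seen.add(line)
            result.append(line)
    return result


def copy_repeated_lines(src_path, dst_path):
    """Write each repeated line of src_path once to dst_path (paths are placeholders)."""
    with open(src_path, "r") as src:
        data = src.readlines()
    with open(dst_path, "w") as dst:
        for line in repeated_lines(data):
            dst.write(line + "\n")
```

For example, `copy_repeated_lines('hello.txt', 'duplicated.txt')` would process the files used in this answer in a single pass over the data plus one counting pass.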