I have this code:
    import sys
    import argparse
    import operator

    def main(argv):
        parser = argparse.ArgumentParser()
        parser.add_argument('infile', help='file to process')
        parser.add_argument('outfile', help='file to produce')
        args = parser.parse_args()
        with open(args.infile, "r") as f:
            with open(args.outfile, "w+") as of:
                seen = set()
                for line in f:
                    line_lower = line.lower()
                    if line_lower not in seen:
                        of.write(line_lower)
                    else:
                        pass

    if __name__ == "__main__":
        main(sys.argv)
Example of a line in the file:
M03972:51:000000000-BJVL8:1:1103:20083:5527 cat
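Sometimes there are duplicate sequences. I want to remove them, but my code doesn't seem to work. It basically just copies the file, and it doesn't raise any errors. Does anyone know why?
Thanks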
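Answer 0 (score: 1)
You forgot to add the unique lines to seen. Since the set stays empty, the `not in seen` check is always true and every line gets written. Here is the fixed part of the code: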
    seen = set()
    for line in f:
        line_lower = line.lower()
        if line_lower not in seen:
            of.write(line_lower)
            # Remember the line so later duplicates are skipped.
            seen.add(line_lower)
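If you want to check the logic without touching any files, here is a minimal sketch of the same idea as a standalone generator. The name `dedupe_lines` and the sample data are made up for illustration and are not part of your script. It assumes, as your code does, that two lines count as duplicates when they match after lower-casing.

```python
def dedupe_lines(lines):
    """Yield each line once, lower-cased, in first-seen order.

    `dedupe_lines` is a hypothetical helper name, not from the original script.
    """
    seen = set()
    for line in lines:
        line_lower = line.lower()
        if line_lower not in seen:
            # Record the line before yielding so later duplicates are skipped.
            seen.add(line_lower)
            yield line_lower


# Made-up sample input: the 2nd and 4th entries duplicate the 1st after lower-casing.
sample = ["ACGT\n", "acgt\n", "TTGA\n", "ACGT\n"]
result = list(dedupe_lines(sample))
print(result)
```

In your script you could then write `of.writelines(dedupe_lines(f))` inside the two `with` blocks. Note that the comparison includes the trailing newline, so a final line without one would not match an earlier copy that has it.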