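I need to read a .txt file (using Python) that has 2 columns and many rows with repeated names. I only want to copy one of the columns, without repeating any names, and write it to a new .txt file. I tried: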
g = open(file,'r')
linesg = g.readlines()
h = open(file,'w+')
linesh = h.readlines()
for line in range(len(linesg)):
    if linesg[line] in linesh:
        line += 1
    else:
        h.write(linesg[line].split('\t')[1])
But I keep getting repeated names in the .txt file. Can anyone help me? (Yes, I'm new to Python programming.) Thanks a lot!
Answer 0 (score: 0)
names = {}
with open(file, 'r') as g:
    for line in g:
        # The name is in the second tab-separated column; drop the trailing newline first
        name = line.rstrip('\n').split('\t')[1]
        names[name] = 1  # dictionary keys are unique, so repeated names collapse
# names.keys() now holds every distinct name
# Change the output path here if needed, or the original file will be overwritten.
with open(file, 'w') as h:
    for name in names.keys():
        h.write("%s\n" % name)
Answer 1 (score: 0)
sep = '\t'
with open('in_file.txt') as g:
    lines = g.readlines()
lines_out = []
for line in lines:
    line = line.strip()
    parts = line.split(sep)
    line_out = "%s\n" % (parts[0],)  # copies only the first column
    if line_out not in lines_out:  # keep each value once, in first-seen order
        lines_out.append(line_out)
with open('out_file.txt', 'w') as h:
    h.writelines(lines_out)
Change it to parts[1] to copy the second column.
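For larger files, the same idea can be sketched with a set for the membership test, which avoids rescanning a list for every line while still keeping the names in first-seen order. This is only a minimal sketch: the function name `unique_column`, its parameters, and the UTF-8 encoding are assumptions for illustration, not part of either answer.

```python
def unique_column(in_path, out_path, column=1, sep="\t"):
    """Write each distinct value of one column to out_path, in first-seen order."""
    seen = set()   # fast membership test
    ordered = []   # preserves the order in which names first appear
    with open(in_path, encoding="utf-8") as src:
        for line in src:
            parts = line.rstrip("\n").split(sep)
            if len(parts) <= column:
                continue  # skip blank lines or lines missing the column
            name = parts[column]
            if name not in seen:
                seen.add(name)
                ordered.append(name)
    # Write to a separate file so the input is never truncated
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.writelines(name + "\n" for name in ordered)
    return ordered
```

Calling `unique_column('in_file.txt', 'out_file.txt')` would copy the second column; pass `column=0` for the first. Keeping the input and output paths different sidesteps the problem in the question, where opening the same file with 'w+' empties it before anything can be compared.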