Copying a column without duplicates

Time: 2013-05-13 21:12:14

Tags: python

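I need to read a .txt file (using Python) that has 2 columns and many rows with repeated names. I only want to copy one of the columns, without repeating the names in it, and write it to a new .txt file. I tried: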

g = open(file,'r')
linesg = g.readlines()
h = open(file,'w+')
linesh = h.readlines()
for line in range(len(linesg)):
     if linesg[line] in linesh:
        line += 1
     else:
        h.write(linesg[line].split('\t')[1])

But I keep getting repeated names in the .txt file. Can anyone help me? (Yes, I'm new to Python programming.) Thanks a lot!

2 answers:

Answer 0 (score: 0)

g = open(file, 'r')
names = {}
for line in g:
    # The name is in the second tab-separated column; strip the trailing
    # newline so the same name always produces the same key.
    name = line.rstrip('\n').split('\t')[1]
    names[name] = 1  # dictionary keys are unique, so duplicates collapse
g.close()

# names.keys() holds every distinct name.
# Use a different output path here if needed; otherwise the original file is overwritten.
h = open(file, 'w+')
for name in names.keys():
    h.write("%s\n" % name)
h.close()

Answer 1 (score: 0)

sep = '\t'
lines = open('in_file.txt').readlines()
lines_out = []
for line in lines:
    line = line.strip()
    parts = line.split(sep)
    line_out = "%s\n" %(parts[0],) # if only the first column is copied
    if line_out not in lines_out:
        lines_out.append(line_out)

h = open('out_file.txt','w')
h.writelines(lines_out)
h.close()

Change it to parts[1] to copy the second column.
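
The same idea can also be written with a set, which keeps the "already seen?" check fast on large files while still preserving first-seen order. This is only a minimal sketch, not code from either answer: `unique_column` and `copy_unique_column` are hypothetical helper names, and it assumes tab-separated input with the name in the second column (index 1).

```python
def unique_column(lines, column=1, sep='\t'):
    """Return the distinct values of one column, in first-seen order."""
    seen = set()      # values already emitted (fast membership test)
    result = []       # distinct values in the order they first appear
    for line in lines:
        parts = line.rstrip('\n').split(sep)
        if column >= len(parts):
            continue  # skip blank or short lines that lack the column
        value = parts[column]
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


def copy_unique_column(in_path, out_path, column=1, sep='\t'):
    """Read in_path and write the distinct column values to out_path."""
    # Read from one file and write to a different one, so the input
    # is never truncated before it has been read.
    with open(in_path) as src:
        values = unique_column(src, column=column, sep=sep)
    with open(out_path, 'w') as dst:
        dst.writelines(value + '\n' for value in values)
```

Calling `copy_unique_column('in_file.txt', 'out_file.txt')` would copy the second column; pass `column=0` for the first column instead.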