How do I merge lines of a file that share the same first word in Python?

Date: 2014-06-09 15:21:33

Tags: python debugging computer-science

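I wrote a Python program to merge the lines of a file that start with the same first word, but I can't get the desired output. Can anyone tell me what is wrong with my program?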

Note: (line 1, line 2) and (line 4, line 5, line 6) are merged because they share the same first element.

#input
"file.txt"
line1: a b c
line2: a b1 c1
line3: d e f
line4: i j k
line5: i s t 
line6: i m n 

#output
a b c a b1 c1 
d e f
i j k i s t i m n 

#my code
for i in range(0,len(a)):
    j=i
    try:
        while True:
            if  a[j][0] == a[j+1][0]:
                L.append(a[j])
                L.append(a[j+1])
                j=j+2
            else:
                print a[i]
                print L
                break
    except:
        pass

1 Answer:

Answer 0 (score: 0)

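Try this (pass the file name as a command-line argument).

It builds a dictionary that holds the merged lines you expect, keyed by first word.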

import sys

if __name__ == "__main__":

    new_lines = dict()
    # open the file named by the first command-line argument
    with open(sys.argv[1]) as a:

        # iterate over the file line by line
        for a_line in a:

            # drop the newline and any surrounding whitespace
            a_line = a_line.strip()

            # skip blank lines, which have no first word
            if not a_line:
                continue

            # candidate is the first word of the line
            candidate = a_line.split()[0]  # split on whitespace by default

            # dictionary keys are the first words seen so far
            if candidate in new_lines:

                # word already included: append this line to the stored one
                old_line = new_lines[candidate]
                new_lines[candidate] = "%s %s" % (old_line, a_line)

            else:

                # word not included yet: start a new entry
                new_lines[candidate] = a_line

    # now we have our dictionary. print it (or do what you want with it)
    for key in new_lines:
        print("%s -> %s" % (key, new_lines[key]))

Output (the order of the keys may vary, because plain dicts were unordered before Python 3.7):

a -> a b c a b1 c1
i -> i j k i s t i m n
d -> d e f
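
If you want the output in exactly the format shown in the question (no `key ->` prefix, groups in the order their first word first appears), something along these lines might work. This is only a minimal sketch: it assumes Python 3.7 or later, where dicts keep insertion order, and the names `merge_by_first_word` and `sample` are made up for illustration.

```python
def merge_by_first_word(lines):
    """Join lines that share the same first word, keeping first-seen order."""
    merged = {}  # first word -> list of the lines that start with it
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines have no first word
        first_word = line.split()[0]
        merged.setdefault(first_word, []).append(line)
    # Assumption: Python 3.7+, so values() follows insertion order.
    return [" ".join(group) for group in merged.values()]


# Hypothetical sample data mirroring file.txt from the question.
sample = ["a b c", "a b1 c1", "d e f", "i j k", "i s t ", "i m n "]
for merged_line in merge_by_first_word(sample):
    print(merged_line)
```

To run it on a real file, pass the open file object instead of the list, for example `merge_by_first_word(open("file.txt"))`. Unlike the loop in the question, this groups lines even when the matching lines are not adjacent.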