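Python: sort a list and write it back to the file

Time: 2016-04-11 15:43:53

Tags: python

I am trying to write a script that finds everything between the symbols { and } in a text document. It should take the part of the .txt document inside the {}, sort it alphabetically, and then write it back into the text document. Example text document: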

bla bla bla 
bla ba bl bla ba bl {apple:banana, this: something else, airplane:hobby}
bla bla bla 
bla bla bla 

Desired output (sorted alphabetically):

bla bla bla 
bla ba bl bla ba bl {airplane:hobby, apple:banana, this: something else}
bla bla bla 
bla bla bla 

What it still outputs:

    bla bla bla 
    bla ba bl bla ba bl {apple:banana, this: something else, airplane:hobby}
    bla bla bla 
    bla bla bla 

My code:

def openFind():
    f = open(inFile, 'r')
    lines = f.read()
    match = re.findall(r'{(.*?)}', lines)
    before = str(match)
    n=1
    for i in xrange(0, len(match), n):
        mydict =  match[i:i+n]
        for x in sorted(mydict):
            c = x.split(',')
            newmatch = sorted(c)
            final =  str(newmatch)
            print final

            # NOT WORKING BELOW!!! Stuck in loop?
            with open(outFile,'w') as new_file:
                with open(inFile) as old_file:
                    for line in old_file:
                        new_file.write(line.replace(before, after))

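It prints the sorted list as [airplane:hobby, apple:banana, this: something else], but how do I get it to replace the original text in the text document? It should be done in place, but writing a new text file is also fine.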

4 answers:

Answer 0 (score: 2)

This should work:

import re

def openFind():
    with open("test.txt", "r") as in_file:
        data = in_file.read()

    def sub(m):
        l = [s.strip() for s in m.group(1).split(",")]
        l.sort()
        return "{%s}" % (", ".join(l),)

    replacement = re.sub(r'{(.*?)}', sub, data)
    with open("out.txt", "w") as out_file:
        out_file.write(replacement)

I used re.sub() with a function as the replacement, so each match is replaced by its sorted version.
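To see the callback in isolation, here is a minimal sketch that applies the same substitution to an in-memory string instead of a file. The function name `sort_braces` and the sample text are illustrative assumptions, not part of the answer above.

```python
import re

def sort_braces(text):
    """Return text with the comma-separated items inside each {...} sorted."""
    def sort_match(m):
        # m.group(1) is the text between the braces, without the braces.
        items = sorted(s.strip() for s in m.group(1).split(","))
        return "{%s}" % ", ".join(items)
    # re.sub calls sort_match once for every non-overlapping match.
    return re.sub(r"\{(.*?)\}", sort_match, text)

sample = "bla ba bl {apple:banana, this: something else, airplane:hobby}\n"
result = sort_braces(sample)
print(result, end="")
```

Text outside the braces is never passed to the callback, so it comes back unchanged.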

Answer 1 (score: 1)

The following code sorts the items between { and } and writes the result back to the same file:

import re

with open('test.txt', 'r+') as f:
    s = f.read()
    r = list(s)
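    for mo in re.finditer(r'{(.*?)}', s):
        d = sorted(mo.group(1).split(', '))
        # Same items, same separator: the replacement has the same length,
        # so the offsets of later matches stay valid.
        r[mo.start(1):mo.end(1)] = list(', '.join(d))

    f.seek(0)
    f.write(''.join(r))
    f.truncate()  # drop any leftover bytes if the new content is shorter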

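Answer 2 (score: 1)

I would tackle this in pieces. First, you want to be able to read from one file and write to a new one. There are several ways to do that. If your file is small, you can use readlines(), truncate the original file, and then write it back.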
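For the small-file case, a minimal sketch might look like the following. The helper name `rewrite_small_file` and its `transform` parameter are hypothetical, and the demo uses a throwaway temporary file rather than any file from the question.

```python
import os
import tempfile

def rewrite_small_file(path, transform):
    """Read every line into memory, then overwrite the file with transformed lines."""
    with open(path) as f:
        lines = f.readlines()
    # Opening with "w" truncates the file before anything is written.
    with open(path, "w") as f:
        f.writelines(transform(line) for line in lines)

# Demo on a throwaway file.
fd, demo_path = tempfile.mkstemp(suffix=".txt", text=True)
with os.fdopen(fd, "w") as f:
    f.write("first\nsecond\n")
rewrite_small_file(demo_path, str.upper)
with open(demo_path) as f:
    demo_result = f.read()
os.remove(demo_path)
```

Because the whole file is held in memory, this only suits files that comfortably fit in RAM.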

But I will assume the possibility of huge files (i.e. larger than comfortably fits in RAM/swap space; currently that means several GB in size).

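import os
import tempfile

# Create the temp file next to the original so the final swap
# stays on one filesystem; mode='w' lets us write text lines.
target_dir = os.path.dirname(os.path.abspath(filename))
with tempfile.NamedTemporaryFile(mode='w', dir=target_dir, delete=False) as temp:
    with open(filename) as infile:
        for line in infile:
            temp.write(line)
# Both files are closed here; replace the original with the temp file.
os.replace(temp.name, filename)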

Now we are reading each line and writing it to the destination. All you need to do now is intercept the line and change it where necessary:

 for line in infile:
     match = re.search(r'\{.*?\}', line)
     if match:
          # Assumes you only have one "dictionary" per line
          first_part, rest = line.split('{', maxsplit=1)
          # allows for trailing data
          data, last_part = rest.split('}', maxsplit=1)
          # strip each item so leading spaces do not affect the sort order
          data = [item.strip().split(':') for item in data.split(',')]
          data.sort()
          line = '{}{{{}}}{}'.format(first_part, ', '.join(':'.join(item) for item in data), last_part)
     temp.write(line)

You may need to tweak the exact algorithm, but this is the approach I would take for a problem like this.
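Putting the two pieces together, one possible self-contained sketch is shown below. It swaps the split-based line handling for a `re.sub` callback, which also copes with several {...} groups on one line. The names `sort_line` and `sort_braces_in_file` are assumptions made for illustration.

```python
import os
import re
import tempfile

BRACES = re.compile(r"\{(.*?)\}")

def sort_line(line):
    """Sort the comma-separated items inside every {...} on one line."""
    def sort_match(m):
        items = sorted(item.strip() for item in m.group(1).split(","))
        return "{" + ", ".join(items) + "}"
    return BRACES.sub(sort_match, line)

def sort_braces_in_file(path):
    """Stream path line by line into a temp file, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so os.replace stays on one filesystem.
    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as temp:
        with open(path) as infile:
            for line in infile:
                temp.write(sort_line(line))
    # Both files are closed at this point; replace the original.
    os.replace(temp.name, path)
```

Only one line is held in memory at a time, so the file size is limited by disk space rather than RAM.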

Answer 3 (score: 1)

The whole program can be written concisely as follows:

import re

with open("file.txt") as fr:
    content = fr.read()

# Collect the text inside each {...} pair, then replace it with its sorted form.
matches = (match.group(1) for match in re.finditer(r"{(.*?)}", content))
for match in matches:
    repl = ", ".join(sorted(match.split(", ")))
    content = content.replace(match, repl)

with open("file.txt", "w") as fw:
    fw.write(content)