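A function that handles multi-line and/or single-line fields in a file

Time: 2010-11-20 21:32:04

Tags: python file twitter

If I have a file, how should I implement a function so that it can read both single-line and multi-line fields? For example: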

TimC
Tim Cxe
USA
http://www.TimTimTim.com
TimTim facebook!
ENDBIO
Charles
Dwight
END
Mcdon
Mcdonald 
Africa
      # website in here is empty, but we still need to consider it
      # bio in here is empty, but we need to include this in the dict
      # bio can be multiple lines
ENDBIO
Moon
King
END
etc

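I was just wondering whether anyone could do this using only beginner-level Python keywords (for example, without yield, break, or continue).

In my own version I actually defined 4 functions, 3 of which are helper functions.

I want a function that returns: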

dict = {'TimC': {'name': 'Tim Cxe', 'location': 'USA', 'Web': 'http://www.TimTimTim.com', 'bio': 'TimTim facebook!', 'follows': ['Charles', 'Dwight']}, 'Mcdon': {'name': 'Mcdonald', 'location': 'Africa', 'Web': '', 'bio': '', 'follows': ['Moon', 'King']}}

3 answers:

Answer 0: (score: 1)

# Python 3: the built-in zip() is lazy, so itertools.izip is no longer needed.
line_meanings = ("name", "location", "web")
result = {}
user = None

def readClean(iterable, sentinel=None):
    # Yield stripped lines until the sentinel line (or end of input) is reached.
    for line in iterable:
        line = line.strip()
        if line == sentinel:
            break
        yield line

# `yourfile` is assumed to be an already-open text file object.
while True:
    line = yourfile.readline()
    if not line:
        break                       # end of file
    line = line.strip()
    if not line:
        continue                    # skip blank lines between records
    user = result[line] = {}
    # zip() stops after the three names, so exactly three lines are consumed.
    user.update(zip(line_meanings, readClean(yourfile)))
    user['bio'] = list(readClean(yourfile, 'ENDBIO'))
    user['follows'] = set(readClean(yourfile, 'END'))

print(result)

{'Mcdon': {'bio': [''],
           'follows': {'King', 'Moon'},
           'location': 'Africa',
           'name': 'Mcdonald',
           'web': ''},
 'TimC': {'bio': ['TimTim facebook!'],
          'follows': {'Charles', 'Dwight'},
          'location': 'USA',
          'name': 'Tim Cxe',
          'web': 'http://www.TimTimTim.com'}}
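Since the question asks for a version without yield, break, or continue, here is a rough sketch of the same parsing using only while loops and an index into a list of lines. The function name `read_profiles` and the variable names are made up for illustration, the key names follow the question's desired output, and the sketch assumes well-formed input (every record has its ENDBIO and END lines).

```python
def read_profiles(lines):
    """Parse a list of lines into {username: profile_dict}.

    Hypothetical helper: uses only while loops and an index,
    with no yield, break, or continue.
    """
    profiles = {}
    i = 0
    while i < len(lines):
        username = lines[i].strip()
        if username == '':
            i += 1                      # skip blank lines between records
        else:
            profile = {
                'name': lines[i + 1].strip(),
                'location': lines[i + 2].strip(),
                'Web': lines[i + 3].strip(),
            }
            i += 4
            # Collect bio lines up to (not including) ENDBIO.
            bio_lines = []
            while lines[i].strip() != 'ENDBIO':
                bio_lines.append(lines[i].strip())
                i += 1
            i += 1                      # step over ENDBIO
            profile['bio'] = '\n'.join(bio_lines)
            # Collect followed usernames up to (not including) END.
            follows = []
            while lines[i].strip() != 'END':
                follows.append(lines[i].strip())
                i += 1
            i += 1                      # step over END
            profile['follows'] = follows
            profiles[username] = profile
    return profiles


# Sample data from the question; the empty website and bio are blank lines.
sample_lines = [
    'TimC', 'Tim Cxe', 'USA', 'http://www.TimTimTim.com',
    'TimTim facebook!', 'ENDBIO', 'Charles', 'Dwight', 'END',
    'Mcdon', 'Mcdonald', 'Africa', '', '', 'ENDBIO', 'Moon', 'King', 'END',
]
profiles = read_profiles(sample_lines)
```

With a real file, the list could come from `yourfile.readlines()`. A multi-line bio ends up as one string joined with newlines, which is one possible choice; keeping the list of lines would work just as well.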

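Answer 1: (score: 0)

Iterate through the file, collecting the various pieces of data, and yield the collected record each time you reach the appropriate sentinel.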
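One way to read that suggestion, as a rough sketch rather than the answerer's actual code: walk the lines once, keep track of which section of the record you are in, and yield the finished record when the END sentinel shows up. The name `iter_profiles`, the `section` state variable, and the key names are assumptions made for this example.

```python
def iter_profiles(lines):
    """Yield (username, profile) pairs, one per record terminated by END."""
    fields = ('name', 'location', 'web')
    record = []             # username plus the three header fields
    bio = []
    follows = []
    section = 'header'      # 'header' -> 'bio' -> 'follows'
    for raw in lines:
        line = raw.strip()
        if section == 'header':
            if not record and not line:
                continue                # skip blank lines between records
            record.append(line)
            if len(record) == 4:        # username + name, location, web
                section = 'bio'
        elif section == 'bio':
            if line == 'ENDBIO':
                section = 'follows'
            else:
                bio.append(line)        # the bio may span several lines
        else:
            if line == 'END':
                profile = dict(zip(fields, record[1:]))
                profile['bio'] = '\n'.join(bio)
                profile['follows'] = follows
                yield record[0], profile
                # Start collecting the next record from scratch.
                record, bio, follows = [], [], []
                section = 'header'
            else:
                follows.append(line)


sample_lines = [
    'TimC', 'Tim Cxe', 'USA', 'http://www.TimTimTim.com',
    'TimTim facebook!', 'ENDBIO', 'Charles', 'Dwight', 'END',
    'Mcdon', 'Mcdonald', 'Africa', '', '', 'ENDBIO', 'Moon', 'King', 'END',
]
all_profiles = dict(iter_profiles(sample_lines))
```

Any iterable of lines works here, including an open file object, because the function never indexes into its input.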

Answer 2: (score: 0)

import pprint
import sys

def bio_gen(it, sentinel="END"):
    def read_line():
        # Drop any trailing '#' comment and the surrounding whitespace.
        return next(it).partition("#")[0].strip()

    while True:
        try:
            key = read_line()
        except StopIteration:
            return                  # end of input (PEP 479: don't let StopIteration escape)
        ret = {
            'name': read_line(),
            'location': read_line(),
            'website': read_line(),
            'bio': read_line(),     # assumes the bio is exactly one line
            'follows': []}
        next(it)                    # skip the ENDBIO line
        while True:
            line = read_line()
            if line == sentinel:
                yield key, ret
                break
            ret['follows'].append(line)

all_bios = dict(bio_gen(sys.stdin))
pprint.pprint(all_bios)

{'Mcdon': {'bio': '',
           'follows': ['Moon', 'King'],
           'location': 'Africa',
           'name': 'Mcdonald',
           'website': ''},
 'TimC': {'bio': 'TimTim facebook!',
          'follows': ['Charles', 'Dwight'],
          'location': 'USA',
          'name': 'Tim Cxe',
          'website': 'http://www.TimTimTim.com'}}
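The code in this answer reads the bio with a single read_line() call, so it only works when the bio is exactly one line. A possible way around that, sketched here under the assumption that the input is an iterator of lines (as sys.stdin is), is the two-argument form of the built-in iter(), which keeps calling a function until it returns the sentinel. The helper name `read_until` is made up for this example.

```python
def read_until(it, sentinel):
    """Return stripped lines from iterator `it` up to (excluding) `sentinel`.

    Hypothetical helper: iter(callable, sentinel) calls the lambda repeatedly
    and stops as soon as it returns the sentinel value. If the input runs
    out first, the lines read so far are returned.
    """
    return list(iter(lambda: next(it).strip(), sentinel))


# `lines` must be an iterator (not a list) so that next() can advance it.
lines = iter(['first line\n', 'second line\n', 'ENDBIO\n',
              'Moon\n', 'King\n', 'END\n'])
bio = '\n'.join(read_until(lines, 'ENDBIO'))
follows = read_until(lines, 'END')
```

The same helper covers both the bio and the follows list, since the two sections differ only in their sentinel.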