Program skips text when reading a file in Python

Time: 2014-11-25 21:59:50

Tags: python

def read_poetry_form_description(poetry_forms_file):
    """ (file open for reading) -> poetry pattern

    Precondition: we have just read a poetry form name from poetry_forms_file.

    Return the next poetry pattern from poetry_forms_file.
    """
    # Create three empty lists
    syllables_list = []
    rhyme_list = []
    pattern_list = []
    # Read the first line of the pattern
    line = poetry_forms_file.readline()
    # Read until the end of the pattern
    while line != '\n' and line != '':
        # Clean the \n's
        pattern_list.append(line.replace('\n', '').split(' '))
        line = poetry_forms_file.readline()
    # Add elements to lists
    for i in pattern_list:
        syllables_list.append(int(i[0]))
        rhyme_list.append(i[1])
    # Add two lists into a tuple
    pattern = (syllables_list, rhyme_list)
    return pattern

def read_poetry_form_descriptions(poetry_forms_file):
    """ (file open for reading) -> dict of {str: poetry pattern}

    Return a dictionary of poetry form name to poetry pattern for the
    poetry forms in poetry_forms_file.
    """
    # Initiate variables
    forms_dict = {}
    keys = []
    values = []
    # Get the first form
    line = poetry_forms_file.readline()
    # Add the name to the keys list
    keys.append(line.replace('\n', ''))
    # Add the variable to the values list using the previous function
    values.append(read_poetry_form_description(poetry_forms_file))
    while line != '':
        # Check if the line is the beginning of a form
        if line == '\n':
            line = poetry_forms_file.readline()
            keys.append(line.replace('\n', ''))
            values.append(read_poetry_form_description(poetry_forms_file))
        else:
            line = poetry_forms_file.readline()
    # Add key-value pairs to the dictionary
    for i in range(len(keys)):
        forms_dict[keys[i]] = values[i]
    return forms_dict

When I test the code with a text file, something goes wrong. Calling read_poetry_form_descriptions(open('poetry_forms.txt')) returns the following:

{'Limerick': ([8, 8, 5, 5, 8], ['A', 'A', 'B', 'B', 'A']), 'Rondeau': ([8, 8, 8, 8, 8, 8, 8, 8, 4, 8, 8, 8, 8, 4, 4], ['A', 'A', 'B', 'B', 'A', 'A', 'A', 'B', 'C', 'A', 'A', 'B', 'B', 'A', 'C']), 'Haiku': ([5, 7, 5], ['*', '*', '*'])}

There should be two more key-value pairs. This is the content of the text file:

Haiku
5 *
7 *
5 *

Sonnet
10 A
10 B
10 A
10 B
10 C
10 D
10 C
10 D
10 E
10 F
10 E
10 F
10 G
10 G

Limerick
8 A
8 A
5 B
5 B
8 A

Quintain (English)
0 A
0 B
0 A
0 B
0 B

Rondeau
8 A
8 A
8 B
8 B
8 A
8 A
8 A
8 B
4 C
8 A
8 A
8 B
8 B
8 A
4 C

3 Answers:

Answer 0: (score: 0)

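The problem is that you seem to be treating "line" as a global variable, but it is not global. You could easily "fix" this by making it global; however, that would be terrible practice.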

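Edit: I have updated your code so that it works without global variables. The problem is that when you read from the file, the local variable line is not synchronized automatically, so the last line read in one function does not update the line variable in the other function. Also, take a look at string-handling methods such as split and strip.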

def read_poetry_form_description(poetry_forms_file):
    """ (file open for reading) -> poetry pattern

    Precondition: we have just read a poetry form name from poetry_forms_file.

    Return the next poetry pattern from poetry_forms_file.
    """
    # Create three empty lists
    syllables_list = []
    rhyme_list = []
    pattern_list = []
    # Read the first line of the pattern
    line = poetry_forms_file.readline()
    # Read until the end of the pattern
    while line != '\n' and line != '':
        # Clean the \n's
        pattern_list.append(line.replace('\n', '').split(' '))
        line = poetry_forms_file.readline()
    # Add elements to lists
    for i in pattern_list:
        syllables_list.append(int(i[0]))
        rhyme_list.append(i[1])
    # Add two lists into a tuple
    pattern = (syllables_list, rhyme_list)
    return pattern

def read_poetry_form_descriptions(poetry_forms_file):
    """ (file open for reading) -> dict of {str: poetry pattern}

    Return a dictionary of poetry form name to poetry pattern for the
    poetry forms in poetry_forms_file.
    """
    # Initiate variables
    forms_dict = {}
    keys = []
    values = []
    # Get the first line
    line = poetry_forms_file.readline()
    while line != '':
        # Check if the line is the beginning of a form
        if line != '\n':
            keys.append(line.replace('\n', ''))
            values.append(read_poetry_form_description(poetry_forms_file))
        line = poetry_forms_file.readline()
    # Add key-value pairs to the dictionary
    for i in range(len(keys)):
        forms_dict[keys[i]] = values[i]
    return forms_dict
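
To illustrate the point about line being local to each function while the file position is shared, here is a minimal sketch. The names read_until_blank and outer are hypothetical, and io.StringIO stands in for the real file; it is only meant to show the mechanism, not to replace the code above.

```python
import io

def read_until_blank(f):
    """Hypothetical helper: read lines up to and including the next blank line."""
    line = f.readline()
    while line not in ('\n', ''):
        line = f.readline()
    # `line` is '\n' (or '') here, but only inside this function

def outer(f):
    line = f.readline()       # the caller's own `line`
    read_until_blank(f)       # consumes the pattern line and the blank separator
    # The caller's `line` is unchanged, yet the file position has moved on
    return line, f.readline()

sample = io.StringIO("Haiku\n5 *\n\nSonnet\n10 A\n")
before, after = outer(sample)
print(repr(before))   # the form name read before the helper ran
print(repr(after))    # the next form name; the blank line is already gone
```

The caller never sees the blank line, which is why a loop that waits for '\n' after calling the helper skips every other form.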

Answer 1: (score: 0)

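The problem is that read_poetry_form_descriptions uses '\n' to recognize the beginning of a form description, but read_poetry_form_description also uses '\n' to recognize the end of a form description. So by the time control passes back to read_poetry_form_descriptions, the blank line has already been read.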

There are various ways to fix this, but I actually think it is cleaner to reorganize and simplify the code into a single function:

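def read_poetry_form_descriptions(poetry_forms_file):
    forms = {}
    title = None
    for line in poetry_forms_file:
        if line == '\n':
            # A blank line ends the current form, if one is open
            if title is not None:
                forms[title] = syllables, rhymes
            title = None
        elif title is None:
            # The first non-blank line after a separator is the form name
            title = line.strip()
            syllables = []
            rhymes = []
        else:
            syllable, rhyme = line.strip().split()
            syllables.append(int(syllable))
            rhymes.append(rhyme)
    # Store the last form when the file does not end with a blank line
    if title is not None:
        forms[title] = syllables, rhymes
    return forms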

Edit: If, as you say in the comments, you have to keep both functions, you can change the second function as follows.

def read_poetry_form_descriptions(poetry_forms_file):
    forms = {}
    while True:
        line = poetry_forms_file.readline()
        if line == '':
            return forms
        forms[line.strip()] = read_poetry_form_description(poetry_forms_file)

This function should not check for '\n', because the other function already takes care of that.
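
As a quick check of this two-function arrangement, the sketch below runs it on an in-memory sample. The helper shown here is a condensed variant of the question's first function (an assumption made to keep the example short), and io.StringIO stands in for poetry_forms.txt.

```python
import io

def read_poetry_form_description(poetry_forms_file):
    """Read (syllables, rhymes) up to the next blank line or end of file."""
    syllables, rhymes = [], []
    line = poetry_forms_file.readline()
    while line not in ('\n', ''):
        count, rhyme = line.split()
        syllables.append(int(count))
        rhymes.append(rhyme)
        line = poetry_forms_file.readline()
    return syllables, rhymes

def read_poetry_form_descriptions(poetry_forms_file):
    forms = {}
    while True:
        # Every line read here is a form name: the helper has already
        # consumed the blank separator that followed the previous pattern
        line = poetry_forms_file.readline()
        if line == '':
            return forms
        forms[line.strip()] = read_poetry_form_description(poetry_forms_file)

# Two forms taken from the question's file, separated by one blank line
sample = io.StringIO("Haiku\n5 *\n7 *\n5 *\n\nLimerick\n8 A\n8 A\n5 B\n5 B\n8 A\n")
forms = read_poetry_form_descriptions(sample)
print(forms)
```

Note that this relies on exactly one blank line between forms; an extra blank line would be read as an empty form name.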

Answer 2: (score: 0)

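I have a two-level solution, just like your code, and it is very simple compared with yours... I am also rather pleased with the code that prints the summary at the end of the job; have a look at it and enjoy the little bit of perversion that a clean, rationalist programming language allows you now and then... OK, here is my code. Just one word: I shortened the variable names, lazily left out the comments, etc...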

def get_poetry(f):
    d = {}
    while 1:
        l = f.readline()
        if l == '': break # end of file
        name = l.strip()
        d[name] = get_sr(f)
    return d

def get_sr(f):
    s = [] ; r = []
    while 1:
        l = f.readline()
        if l == '\n' or l == '': return s, r
        s_i, r_i = l.strip().split()
        s.append(s_i) ; r.append(r_i)

d = get_poetry(open('poetry.txt'))

# The parenthesized form works in both Python 2 and Python 3
print('\n\n'.join(['\n'.join([
    name,
    "    syllables: "+" ".join(["%2s"%(count,) for count in sr[0]]),
    "       rhymes: "+" ".join(["%2s"%(c,) for c in sr[1]])])
                   for name, sr in d.items()]))

Put the above in a file and run it:

Limerick
    syllables:  8  8  5  5  8
       rhymes:  A  A  B  B  A

Sonnet
    syllables: 10 10 10 10 10 10 10 10 10 10 10 10 10 10
       rhymes:  A  B  A  B  C  D  C  D  E  F  E  F  G  G

Quintain (English)
    syllables:  0  0  0  0  0
       rhymes:  A  B  A  B  B

Rondeau
    syllables:  8  8  8  8  8  8  8  8  4  8  8  8  8  8  4
       rhymes:  A  A  B  B  A  A  A  B  C  A  A  B  B  A  C

Haiku
    syllables:  5  7  5
       rhymes:  *  *  *
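
For readers who find the nested join hard to follow, the same summary can be built with a small helper. This is only a sketch: format_form is a hypothetical name, and the forms dictionary is hard-coded here instead of being read from poetry.txt.

```python
# Hard-coded stand-in for the dictionary returned by get_poetry
forms = {
    'Haiku': ([5, 7, 5], ['*', '*', '*']),
    'Limerick': ([8, 8, 5, 5, 8], ['A', 'A', 'B', 'B', 'A']),
}

def format_form(name, syllables, rhymes):
    """Hypothetical helper: build the three-line summary for one form."""
    return '\n'.join([
        name,
        '    syllables: ' + ' '.join('%2s' % s for s in syllables),
        '       rhymes: ' + ' '.join('%2s' % r for r in rhymes),
    ])

# One block per form, separated by a blank line
summary = '\n\n'.join(format_form(n, s, r) for n, (s, r) in forms.items())
print(summary)
```

The forms come out in dictionary order, which is insertion order on Python 3.7 and later.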