Python: cutting at indices

Time: 2015-10-09 20:22:40

Tags: python list dictionary text-parsing

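I'm new to Python, but I'd like to know how to approach this problem. I want to copy all the lines between indices 4 and 20, and between 20 and 25, and store them as values in a new dictionary.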

def cutting(my_sequence):
    code = {}
    text = dict(my_sequence)  # my_sequence pairs each line number (key) with its line (value)
    cut_points = [4, 20, 25]  # line numbers I want to cut between
    # Now what? This is where I need to build a new dict whose values are all the lines in between
    return code

For example, if the text looks like this:
{0:'hello guys this is the start\n',
 1:'this is the first line\n',
 2:'this is the second line\n'}

I want the output dictionary code to be:

{0:'hello guys this is the start\n this is the first line\n',
 1:'this is the second line\n'}

1 Answer:

Answer 0 (score: 2)

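A dictionary seems like the wrong choice here. Let's use a list instead. Since we're ignoring the original line numbers, we can infer them from each line's position in the list.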

def cutting(my_sequence: "list of tuples of form: (int, str)") -> list:
    flat_lst = [v for _, v in my_sequence]

This builds a list of JUST the text. Now let's build a list of ranges to work with:
    lines_to_join = [5, 20, 25]
    ranges = [range(lines_to_join[i],
                    lines_to_join[i+1]) for i in range(len(lines_to_join)-1)]
    # ranges is now [range(5, 20), range(20, 25)]

There are better ways to do this (see the pairwise function in the itertools recipes), but this works for a small application like this one.
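As a rough sketch of the pairwise approach mentioned above: the helper below follows the `pairwise` recipe from the itertools documentation (Python 3.10 and later also provide it as `itertools.pairwise`). The name `build_ranges` is made up for this illustration and is not part of the answer's code.

```python
from itertools import tee


def pairwise(iterable):
    """Yield overlapping pairs: s -> (s0, s1), (s1, s2), ..."""
    a, b = tee(iterable)
    next(b, None)  # advance the second iterator by one element
    return zip(a, b)


def build_ranges(cut_points):
    """Hypothetical helper: one range per pair of consecutive cut points."""
    return [range(start, stop) for start, stop in pairwise(cut_points)]


ranges = build_ranges([5, 20, 25])
# ranges is now [range(5, 20), range(20, 25)]
```

This avoids the manual `i` / `i+1` index arithmetic, at the cost of one extra helper.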

Next, let's use "\n".join to glue together the lines you want.

    result = ["\n".join([flat_lst[idx] for idx in r]) for r in ranges]
    # you might want to strip the natural newlines out of the values, so
    # # result = ["\n".join([flat_lst[idx].strip() for idx in r]) ...]
    # I'll leave that for you
    return result

Note that this will throw an IndexError if any of the indices in ranges fall outside flat_lst.
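If silently truncating is preferable to an exception, one option (a sketch, not part of the answer's code) is to slice rather than index, since list slices clamp to the length of the list instead of raising. `join_between` is a hypothetical name used only here.

```python
def join_between(flat_lst, start, stop):
    """Join flat_lst[start:stop]; out-of-range bounds are clamped, not errors."""
    return "\n".join(flat_lst[start:stop])


lines = ["a", "b", "c"]
partial = join_between(lines, 1, 10)  # stop is past the end: only "b" and "c" are joined
empty = join_between(lines, 5, 10)    # start is past the end: the result is an empty string
```

Whether clamping is the right behaviour depends on the data: it hides the case where the cut points do not match the input at all.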

Put together, it should look like this:

def cutting(my_sequence: "list of tuples of form: (int, str)") -> list:
    flat_lst = [v for _, v in my_sequence]
    lines_to_join = [5, 20, 25]
    ranges = [range(lines_to_join[i],
                    lines_to_join[i+1]) for i in range(len(lines_to_join)-1)]
    # ranges is now [range(5, 20), range(20, 25)]

    result = ["\n".join([flat_lst[idx] for idx in r]) for r in ranges]
    return result
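To get back to the dictionary the question asks for, a possible usage sketch follows. It assumes the input is a dict keyed by line number, as in the question, and it joins with an empty string because those lines already end in newlines. The name `cutting_to_dict`, its `cut_points` parameter, and the cut points `[0, 2, 3]` are made up for illustration.

```python
def cutting_to_dict(text, cut_points):
    """Hypothetical wrapper: join the lines between consecutive cut points.

    text maps line number -> line; cut_points is a sorted list of indices.
    """
    # Sort by line number so that list position matches the original numbering
    # (assumes the keys run 0, 1, 2, ... without gaps).
    flat_lst = [line for _, line in sorted(text.items())]
    chunks = [
        "".join(flat_lst[start:stop])  # each line keeps its own trailing "\n"
        for start, stop in zip(cut_points, cut_points[1:])
    ]
    return dict(enumerate(chunks))


text = {0: "hello guys this is the start\n",
        1: "this is the first line\n",
        2: "this is the second line\n"}
code = cutting_to_dict(text, [0, 2, 3])
# code[0] holds lines 0 and 1 joined together; code[1] holds line 2
```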