Getting pairs of elements from nested lists of increasing size

Date: 2018-12-04 14:46:12

Tags: python list nested nlp corpus

Maybe I'm trying to use a data structure that doesn't suit my needs, but... given this:

import itertools

listOfFileData = [['[', 'Emma', 'by', 'Jane', 'Austen'] ,['[', 'Persuasion', 'by', 'Jane', 'Austen'] ,['[', 'Sense', 'and', 'Sensibility', 'by'] ,
['[', 'The', 'King', 'James', 'Bible'] ,['[', 'Poems', 'by', 'William', 'Blake'] ,['[', 'Stories', 'to', 'Tell', 'to'] ,
['[', 'The', 'Adventures', 'of', 'Buster'] ,['[', 'Alice', "'", 's', 'Adventures'] ,
['[', 'The', 'Ball', 'and', 'The'] ,['[', 'The', 'Wisdom', 'of', 'Father'] ,['[', 'The', 'Man', 'Who', 'Was'] ,
['[', 'The', 'Parent', "'", 's'] ,['[', 'Moby', 'Dick', 'by', 'Herman'] ,['[', 'Paradise', 'Lost', 'by', 'John'] ,
['[', 'The', 'Tragedie', 'of', 'Julius'] ,['[', 'The', 'Tragedie', 'of', 'Hamlet'] ,['[', 'The', 'Tragedie', 'of', 'Macbeth'] ,
['[', 'Leaves', 'of', 'Grass', 'by'] ]

#print(len(listOfFileData)) # should show 18 files, each is a list of tokens. 

filesDataPairsList = list(itertools.combinations(listOfFileData, 2)) # requires itertools library file(s)

filesDataPairsListTesting = []
for i in range(2,19,2): # 2,4,6,8,...18
    combinationOfPairsList = list(itertools.combinations(listOfFileData[:i], 2)) # make a list, of increasingly sized pairs
    filesDataPairsListTesting.append(combinationOfPairsList)

#print(len(filesDataPairsListTesting)) # should have 9 lists
#print(len(filesDataPairsListTesting[8])) # should have 153 pairs

How do I reach each pair inside a loop? I've been trying the following, but I haven't got there.

for permutations in filesDataPairsListTesting:
#     print(len(permutations)) # if uncommented should read, 1,6,15,28....153
    for numOfPairs in range(len(permutations)):
        for pair in permutations:
            permutations[0]
            permutations[1]

I want to access each pair of lists [[],[]] so that I can process each document of each pair inside the for block.

So for element 0 of my filesDataPairsListTesting list, I can easily get at each item, e.g.

        permutations[0]
        permutations[1]

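But the second element has 6 pairs...? So I would have to go over element 1 six times (how?) to be able to get at permutations[0] and permutations[1]. This is the part that confuses me.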

1 Answer:

Answer 0: (score: 0)

You basically just have nested lists. Just loop over both levels. For example:

ls = [
    [
        ([1, 2, 3], ["a", "b", "c"]),
        ([1, 2, 3], ["d", "e", "f"])
    ],
    [
        ([4, 5, 6], ["a", "b", "c"]),
        ([4, 5, 6], ["d", "e", "f"]),
        ([4, 5, 6], ["g", "h", "i"])
    ]
]

for pairs in ls:
    for pair in pairs:
        print(pair)
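
The same two-level loop should carry over to the structure built in the question. Below is a minimal sketch of that, assuming a small stand-in for listOfFileData; the names docs, groups, first_doc and second_doc are made up for illustration and are not from the question. Each pair produced by itertools.combinations is a 2-tuple, so it can be unpacked directly in the inner for statement instead of being indexed with [0] and [1].

```python
import itertools

# Hypothetical stand-in for listOfFileData: four short token lists.
docs = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]]

# Same construction as in the question: pairs drawn from growing prefixes
# of the document list (first 2 documents, then first 4).
groups = [
    list(itertools.combinations(docs[:i], 2))
    for i in range(2, len(docs) + 1, 2)
]

for group in groups:                      # outer level: one list of pairs per prefix
    print(len(group), "pairs")
    for first_doc, second_doc in group:   # inner level: unpack each 2-tuple
        print("   ", first_doc, second_doc)
```

With this approach no range(len(...)) loop is needed: the inner loop runs once per pair, however many pairs the group holds.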