For each word in a list of strings, return the indexes of the list elements that contain it

Time: 2015-10-25 17:23:11

Tags: python

I have a list like this, where the number at the start of each element's string is exactly that element's index:

list = [" ","1- make your choice", "2- put something and make", "3- make something happens", "4- giulio took his choice so make","5- make your choice", "6- put something and make", "7- make something happens", "8- giulio took his choice so make","9- make your choice", "10- put something and make", "11- make something happens", "12- giulio took his choice so make"]

For every word that appears in the list's elements, I want to return the indexes of the elements in which that word occurs:

for x in list:
    ....

I mean something like this:

position_of_word_in_all_elements_list = set("make": 1,2,3,4,5,6,7,8,9,10,11,12)    

position_of_word_in_all_elements_list = set("your": 1,5,9)

position_of_word_in_all_elements_list = set("giulio":4,8,12)

Any suggestions?

4 answers:

Answer 0 (score: 1)

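This finds the occurrences of every string in the input, even "1-" and the like. Filtering the entries you don't want out of the result should not be a big deal: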

# find the set of all words (sequences separated by a space) in input
s = set(" ".join(list).split(" "))

# for each word go through input and add index to the 
# list if word is in the element. output list into a dict with
# the word as a key
res = dict((key, [ i for i, value in enumerate(list) if key in value.split(" ")]) for key in s)
  

{'': [0], 'and': [2, 6, 10], '8-': [8], '11-': [11], '6-': [6], 'something': [2, 3, 6, 7, 10, 11], 'your': [1, 5, 9], 'happens': [3, 7, 11], 'giulio': [4, 8, 12], 'make': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], '4-': [4], '2-': [2], 'his': [4, 8, 12], '9-': [9], '10-': [10], '7-': [7], '12-': [12], 'took': [4, 8, 12], 'put': [2, 6, 10], 'choice': [1, 4, 5, 8, 9, 12], '5-': [5], 'so': [4, 8, 12], '3-': [3], '1-': [1]}
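The filtering step this answer leaves to the reader could be sketched as follows. It assumes every unwanted token is either empty or a number followed by a hyphen (such as "1-"); the names `lst`, `is_word` and `res` are illustrative and not part of the original answer, and the list is a shortened copy of the one in the question.

```python
import re

# Shortened copy of the question's list, renamed to avoid shadowing the built-in list.
lst = [" ", "1- make your choice", "2- put something and make",
       "3- make something happens", "4- giulio took his choice so make"]

def is_word(token):
    # Assumption: index markers look like "12-"; anything else non-empty is a word.
    return bool(token) and re.fullmatch(r"\d+-", token) is None

# Map each real word to the set of indexes of the elements that contain it.
res = {}
for i, value in enumerate(lst):
    for token in value.split():
        if is_word(token):
            res.setdefault(token, set()).add(i)

print(sorted(res["make"]))
print(sorted(res["choice"]))
```

Building the result in a single pass over the list also avoids rescanning every element once per word.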

Answer 1 (score: 0)

First, rename the list so that it does not shadow Python's built-in list. So:

>>> from collections import defaultdict
>>> li = [" ","1- make your choice", "2- put something and make", "3- make something happens", "4- giulio took his choice so make","5- make your choice", "6- put something and make", "7- make something happens", "8- giulio took his choice so make","9- make your choice", "10- put something and make", "11- make something happens", "12- giulio took his choice so make"]
>>> dd = defaultdict(list)
>>> for l in li:
        try: # this is ugly hack to skip the " " value
            index,words = l.split('-')
        except ValueError:
            continue
        word_list = words.strip().split()
        for word in word_list:
            dd[word].append(index)
>>> dd['make']
['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12']

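What defaultdict does: as long as the key (a word, in our case) exists in the dictionary, it behaves like a normal dictionary. If the key does not exist, it creates it with a default value (an empty list in our case) produced by the factory you passed when you declared it with dd = defaultdict(list). I'm not the best at explaining, so I suggest reading about defaultdict elsewhere if this isn't clear :)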
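A small sketch of that behaviour, compared with a plain dict; the variable `plain` is only there for illustration:

```python
from collections import defaultdict

dd = defaultdict(list)

# Missing key: looking up dd["make"] creates an empty list, which append() then fills.
dd["make"].append(1)
dd["make"].append(2)

# A plain dict needs the key to be created explicitly, for example with setdefault().
plain = {}
plain.setdefault("make", []).append(1)
plain.setdefault("make", []).append(2)

print(dict(dd) == plain)
```

Note that merely reading a missing key from a defaultdict also inserts it, so use `in` when you only want to test for membership.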

Answer 2 (score: 0)

@Oleg wrote a great, nerdy solution. I came up with the following simple approach to this problem.

def findIndex(st, lis):
    positions = []
    j = 0
    for x in lis:
        if st in x:
            positions.append(j)
        # advance the index for every element, not only for matches
        j += 1
    return positions
  

>>> findIndex('your', list)
[1, 5, 9]
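Note that `st in x` is a substring test, so a short word such as 'so' would also match elements that only contain 'something'. A hedged variant that compares whole words instead, using the hypothetical name `find_word_index` and a shortened copy of the question's list:

```python
def find_word_index(word, lis):
    # enumerate() supplies the position; split() breaks each element into whole words.
    return [i for i, element in enumerate(lis) if word in element.split()]

lst = [" ", "1- make your choice", "2- put something and make",
       "3- make something happens", "4- giulio took his choice so make"]

whole_word = find_word_index("so", lst)
substring = [i for i, x in enumerate(lst) if "so" in x]

print(whole_word)  # only elements where "so" is a separate word
print(substring)   # also includes elements containing "something"
```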

Answer 3 (score: 0)

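I need to use the number in each string to get the ID, and I have a solution for that... but remember that I have to get all the IDs for every word in the elements.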

lst = [" ","1- make your choice", "2- put something and make", "3- make something happens", 
"4- giulio took his choice so make","5- make your choice", "6- put something and make", 
"7- make something happens", "8- giulio took his choice so make","9- make your choice", 
"10- put something and make", "11- make something happens", "12- giulio took his choice so make"]

diczio = {} 
abc = " ".join(lst).split(" ")

for x in lst:
    element = x

    for t in abc:
        if len(element) > 0:
            if t in element:
                xs = element.find("-")
                aw = element[0:xs]
                aw = int(aw)
                wer = set()
                wer.add(aw)
                diczio[t] = [wer]
print diczio

The problem is that I get only one ID for each word, even though I put them in a set (I mean wer = set()), but I need all of the IDs for each word:

1- For example, for the word 'your' I only get the ID of the last element in which the word occurs:

'your': [set(['9'])]

But I need:

'your': [set([1,5,9])]

2- The ID 9 is a string inside the set and I need it as an int, but if I try to convert aw to int I get an error:

aw = int(aw)

Error:

ValueError: invalid literal for int() with base 10: ''

Any suggestions?
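Both symptoms appear to come from the loop body: a fresh set is created and assigned on every match, so earlier IDs are overwritten, and the " " element contains no "-", so the slice before it is an empty string that int() rejects. A minimal sketch of one way around both, assuming each real element has the form "<number>- <words>"; the names `number`, `sep`, `text` and `post_id` are illustrative:

```python
lst = [" ", "1- make your choice", "2- put something and make", "3- make something happens",
       "4- giulio took his choice so make", "5- make your choice", "6- put something and make",
       "7- make something happens", "8- giulio took his choice so make", "9- make your choice",
       "10- put something and make", "11- make something happens", "12- giulio took his choice so make"]

diczio = {}
for element in lst:
    # partition() splits at the first "-"; sep is "" when no "-" is present.
    number, sep, text = element.partition("-")
    if not sep:
        continue  # skip elements without an ID, such as " "
    post_id = int(number)
    for word in text.split():
        # setdefault() reuses the existing set instead of replacing it each time.
        diczio.setdefault(word, set()).add(post_id)

print(sorted(diczio["your"]))
print(sorted(diczio["giulio"]))
```

Iterating over the words of each element, rather than over every word for every element, also avoids the accidental substring matches that `t in element` allows.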