Question

I have a list like this, where the first number in each element's string is exactly that element's index:

list = [" ","1- make your choice", "2- put something and make", "3- make something happens", "4- giulio took his choice so make","5- make your choice", "6- put something and make", "7- make something happens", "8- giulio took his choice so make","9- make your choice", "10- put something and make", "11- make something happens", "12- giulio took his choice so make"]

For each word in the list's elements, I want to return the indexes of the elements in which that word appears:

for x in list:
    ....

I mean something like this:

position_of_word_in_all_elements_list = set("make": 1,2,3,4,5,6,7,8,9,10,11,12)    

position_of_word_in_all_elements_list = set("your": 1,5,9)

position_of_word_in_all_elements_list = set("giulio":4,8,12)

Any suggestions?

Answer 1

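This finds every string that occurs in the input, even "1-" and the like. But filtering the entries you don't want out of the result shouldn't be a big deal: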

# find the set of all words (sequences separated by a space) in input
s = set(" ".join(list).split(" "))

# for each word go through input and add index to the 
# list if word is in the element. output list into a dict with
# the word as a key
res = dict((key, [ i for i, value in enumerate(list) if key in value.split(" ")]) for key in s)

{'': [0], 'and': [2, 6, 10], '8-': [8], '11-': [11], '6-': [6], 'something': [2, 3, 6, 7, 10, 11], 'your': [1, 5, 9], 'happens': [3, 7, 11], 'giulio': [4, 8, 12], 'make': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], '4-': [4], '2-': [2], 'his': [4, 8, 12], '9-': [9], '10-': [10], '7-': [7], '12-': [12], 'took': [4, 8, 12], 'put': [2, 6, 10], 'choice': [1, 4, 5, 8, 9, 12], '5-': [5], 'so': [4, 8, 12], '3-': [3], '1-': [1]}
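The filtering step mentioned above could look roughly like the sketch below. It assumes the unwanted entries are the empty string and the leading "N-" markers; the names `items` and `is_index_token` are illustrative and not part of the original answer, and the sample data is shortened to the first five elements.

```python
# Shortened sample data; `items` and `is_index_token` are illustrative names.
items = [" ", "1- make your choice", "2- put something and make",
         "3- make something happens", "4- giulio took his choice so make"]

def is_index_token(token):
    # True for the leading "N-" markers such as "1-" or "12-"
    return token.endswith("-") and token[:-1].isdigit()

# split() with no argument also drops the empty strings produced by " "
words = set(" ".join(items).split())

res = {
    word: [i for i, value in enumerate(items) if word in value.split()]
    for word in words
    if not is_index_token(word)
}

print(res["make"])    # [1, 2, 3, 4]
print(res["choice"])  # [1, 4]
```

Filtering the keys while building the dict avoids a second pass over the result.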

Answer 2

First of all, rename your list so that it does not interfere with Python's built-ins, so

>>> from collections import defaultdict
>>> li = [" ","1- make your choice", "2- put something and make", "3- make something happens", "4- giulio took his choice so make","5- make your choice", "6- put something and make", "7- make something happens", "8- giulio took his choice so make","9- make your choice", "10- put something and make", "11- make something happens", "12- giulio took his choice so make"]
>>> dd = defaultdict(list)
>>> for l in li:
        try: # this is ugly hack to skip the " " value
            index,words = l.split('-')
        except ValueError:
            continue
        word_list = words.strip().split()
        for word in word_list:
            dd[word].append(index)
>>> dd['make']
['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12']

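What defaultdict does: as long as the key (a word, in our example) exists in the dictionary, it works like a normal dictionary. If the key does not exist, it creates it with a value corresponding to what you specified when declaring it with dd = defaultdict(list), which in our case is an empty list. I'm not the best explainer, so I suggest reading about defaultdict elsewhere if this isn't clear :)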
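As a minimal sketch of that behaviour, the snippet below compares a `defaultdict(list)` with a plain dict that uses `setdefault`. The variable `plain` is an illustrative name and not part of the answer.

```python
from collections import defaultdict

dd = defaultdict(list)

# Accessing a missing key creates it with the factory's default, here an empty list.
dd["make"].append(1)
dd["make"].append(2)

plain = {}
# The plain-dict equivalent needs setdefault (or an explicit membership check).
plain.setdefault("make", []).append(1)
plain.setdefault("make", []).append(2)

print(dd["make"])         # [1, 2]
print(dict(dd) == plain)  # True
print("your" in dd)       # False: membership tests do not create keys
```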

Answer 3

@Oleg wrote a great nerdy solution. I came up with the following simple approach to this problem.

def findIndex(st, lis):
    positions = []
    j = 0
    for x in lis:
        if st in x:
            positions.append(j)
        j += 1  # advance the counter for every element, not only on a match
    return positions

>>> findIndex('your', list)
[1, 5, 9]
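One caveat: `st in x` is a substring test, so a short word such as 'so' also matches inside 'something'. A possible variant, sketched here under the assumption that whole-word matches are wanted, splits each element first and lets `enumerate` supply the position. The names `find_index_whole_word` and `sample` are illustrative, and the data is shortened.

```python
def find_index_whole_word(word, items):
    # enumerate supplies the position, so no manual counter is needed
    return [i for i, text in enumerate(items) if word in text.split()]

sample = [" ", "1- make your choice", "2- put something and make",
          "3- make something happens", "4- giulio took his choice so make"]

print(find_index_whole_word("your", sample))  # [1]
print(find_index_whole_word("so", sample))    # [4]
# The substring test also matches "something":
print([i for i, text in enumerate(sample) if "so" in text])  # [2, 3, 4]
```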

Answer 4

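I need to use the number in the string to get the ID, and I have a solution for that... but remember that I have to get all the IDs for every word in the elements.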

lst = [" ","1- make your choice", "2- put something and make", "3- make something happens", 
"4- giulio took his choice so make","5- make your choice", "6- put something and make", 
"7- make something happens", "8- giulio took his choice so make","9- make your choice", 
"10- put something and make", "11- make something happens", "12- giulio took his choice so make"]

diczio = {} 
abc = " ".join(lst).split(" ")

for x in lst:
    element = x

    for t in abc:
        if len(element) > 0:
            if t in element:
                xs = element.find("-")
                aw = element[0:xs]
                aw = int(aw)
                wer = set()
                wer.add(aw)
                diczio[t] = [wer]
print diczio

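The problem is that I only get one ID for each word, even though I put them in a set (I mean wer = set()), but I need all of each word's IDs:

1- For example, for the word 'your' I only get the ID of the last post in which the word appears: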

'your': [set(['9'])]

But I need:

'your': [set([1,5,9])]

2- The ID 9 is a string in the set and I need it as an int, but if I try to convert aw to int I get an error:

aw = int(aw)

Error:

ValueError: invalid literal for int() with base 10: ''

Any suggestions?
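Both symptoms seem to come from the loop itself. `diczio[t] = [wer]` replaces the stored value with a fresh one-element set on every match, so only the last ID survives. The first element " " contains no "-", so `find` returns -1, the slice `element[0:-1]` is empty, and `int('')` raises the ValueError. A minimal sketch of one possible fix follows; it assumes every real entry starts with "N-", and the names `prefix`, `sep`, `rest` and `idx` are illustrative. The data is shortened.

```python
lst = [" ", "1- make your choice", "2- put something and make",
       "3- make something happens", "4- giulio took his choice so make",
       "5- make your choice"]

diczio = {}
for element in lst:
    prefix, sep, rest = element.partition("-")
    if not sep or not prefix.strip().isdigit():
        continue  # skip elements such as " " that have no "N-" prefix
    idx = int(prefix)
    for word in rest.split():
        # setdefault keeps the existing set instead of replacing it on every match
        diczio.setdefault(word, set()).add(idx)

print(sorted(diczio["your"]))  # [1, 5]
print(sorted(diczio["make"]))  # [1, 2, 3, 4, 5]
```

Iterating over the words of each element, rather than over every word in the whole list, also avoids the substring matches that `t in element` would produce.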
