Question

Say I have a string

"I say what I mean. I mean what I say. i do."

I am trying to write a function that will return a dictionary like this:

{'i':[0,1,2],'say':[0,1],'what':[0,1],'mean':[0,1],'do':[2]}

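What it does is enter each word (only once) into the dictionary as a key and show the sentences it appears in as the value associated with that key. So, for example, the word "mean" appears in the first [0] and second [1] sentences. "do", on the other hand, appears only in the third sentence, hence: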

'do':[2]

in the output.

This is the code I arrived at after changing everything I could think of to get a list of values paired with each key.

def wordsD(text):
    #split each sentences at '.'
    myList = text.lower().split('.')
    #declare empty dictionary for the counter
    myDict = {}
    counterList = []
    for sentence in myList:
        words = sentence.split()
        for word in words:
            index = words.index(word)
            counterList.append(index)
            if word not in myDict:
                myDict[word] = list()
                myDict[word].append(index)
            else:
                myDict[word]= list()
                myDict[word].append(index)


    return myDict


text=('I say what I mean. I mean what I say. i do.')
print(wordsD(text))

This is the output I get:

{'mean': [1], 'what': [2], 'say': [4], 'i': [0], 'do': [1]}

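But now I'm not sure whether I have misunderstood the problem or whether I am missing something in my code. Any help would be great! Even a pointer in the right direction would help, because I draw a blank even when I try to write pseudocode for this problem. Thanks!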

I have looked at "Counting vowels" and "Turning a text file with words and their positions into a sentence", but I still can't figure out how to get a list as the value for each key.

Answer 1

This should help you.

string = "I say what I mean. I mean what I say. i do."

DICT = {}

LIST  =  string.split('.')

WORDS = list(set(string.lower().replace('.',"").split()))

LIST = [set((x.lower()).split()) for x in LIST]

for i in range(len(LIST)):
    for item in WORDS:
        if item in LIST[i]:
            DICT.setdefault(item, []).append(i)
print(DICT)

Output

{'i': [0, 1, 2], 'do': [2], 'say': [0, 1], 'what': [0, 1], 'mean': [0, 1]}

Answer 2

index currently holds the word's position within the sentence, not the index of the sentence. Try this:

for index, sentence in enumerate(myList):
 ...
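
A minimal sketch of how that change could look in a complete function. The name words_by_sentence is illustrative and not from the question, and the sketch also assumes that a word repeated within one sentence should be recorded only once, as in the desired output.

```python
def words_by_sentence(text):
    """Map each lowercased word to the indices of the sentences containing it."""
    sentences = text.lower().split('.')
    result = {}
    # enumerate yields (sentence index, sentence) pairs
    for index, sentence in enumerate(sentences):
        for word in sentence.split():
            positions = result.setdefault(word, [])
            # skip repeats of the same word within one sentence
            if index not in positions:
                positions.append(index)
    return result


text = "I say what I mean. I mean what I say. i do."
print(words_by_sentence(text))
# {'i': [0, 1, 2], 'say': [0, 1], 'what': [0, 1], 'mean': [0, 1], 'do': [2]}
```

The empty string left after the final '.' contributes no words, so it needs no special handling here.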

Answer 3

There are two problems with your code. First, you create a new list in both the if and the else branch instead of appending to the existing list.

Changing

else:
    myDict[word] = list()
    myDict[word].append(index)

to

else:
    myDict[word].append(index)

fixes that problem.
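
To see why the reset matters, here is a small sketch comparing the two behaviours on the same input. The helper names record_resetting and record_appending are made up for this illustration and do not appear in the question.

```python
def record_resetting(pairs):
    """Mimics the original code: the list is recreated on every hit."""
    d = {}
    for key, value in pairs:
        d[key] = list()
        d[key].append(value)
    return d


def record_appending(pairs):
    """Mimics the corrected code: the existing list is kept and extended."""
    d = {}
    for key, value in pairs:
        if key not in d:
            d[key] = list()
        d[key].append(value)
    return d


pairs = [("i", 0), ("i", 1), ("i", 2)]
print(record_resetting(pairs))  # {'i': [2]}
print(record_appending(pairs))  # {'i': [0, 1, 2]}
```

With the reset in place, only the last value seen for each key survives, which is why the original output has one-element lists.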

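Second, your code tracks the index within a given sentence (i.e. the word's position) rather than the sentence it appears in (which your question says is what you want). The following code should fix that: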

def wordsD(text):
    myList = text.lower().split('.')
    myDict = {}

    for i in range(len(myList)):
        words = myList[i].split()
        for word in words:
            if word not in myDict:
                myDict[word] = [i]
            else:
                if i not in myDict[word]:
                    myDict[word].append(i)

    return myDict
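
As an alternative sketch, not part of the answer above, collections.defaultdict with sets can replace both the explicit initialisation and the membership check. The name words_d_sets is hypothetical, and the sets are sorted into lists at the end to match the output shape asked for in the question.

```python
from collections import defaultdict


def words_d_sets(text):
    # each missing key starts out as an empty set
    seen = defaultdict(set)
    for i, sentence in enumerate(text.lower().split('.')):
        for word in sentence.split():
            # adding to a set ignores duplicates automatically
            seen[word].add(i)
    # convert to plain lists of sentence indices in ascending order
    return {word: sorted(indices) for word, indices in seen.items()}


print(words_d_sets("I say what I mean. I mean what I say. i do."))
# {'i': [0, 1, 2], 'say': [0, 1], 'what': [0, 1], 'mean': [0, 1], 'do': [2]}
```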

Answer 4

def wordsD(text):
    #split each sentence at '.'
    myList = text.lower().split('.')
    #declare empty dictionary for the counter
    myDict = {}
    counterList = []

    # use the enumerate here
    for senten_no,sentence in enumerate(myList): 
        words = sentence.split()
        for word in words:
            index = words.index(word)
            counterList.append(index)
            if word not in myDict:
                myDict[word] = list()
                myDict[word].append(senten_no)
            else:
                if not senten_no in myDict[word]:
                    myDict[word].append(senten_no)


    return myDict


text=('I say what I mean. I mean what I say. i do.')
print(wordsD(text))

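Each time, you are appending the word's index rather than the sentence's index. Use enumerate on the sentences, and append the sentence index it gives you instead.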

Answer 5

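Your code goes wrong where index is assigned. Currently, the words list on each iteration looks like this,

for example:

First iteration

words = ['i', 'say', 'what', 'i', 'mean']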

When you try to find a word's index, it returns the index within that sentence, not the sentence number.
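
A quick sketch of that behaviour, using the first sentence from the question as sample data:

```python
words = "i say what i mean".split()

# list.index returns the position of the FIRST match inside this one list
print(words.index("i"))     # 0 (the second "i" at position 3 is never reported)
print(words.index("mean"))  # 4
```

Neither value says anything about which sentence the word came from.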

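Instead, you can keep a loop counter at the sentence level and, rather than looking up an index, simply assign that counter value to every word in the sentence.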

index = -1
for sentence in myList:
    words = sentence.split()
    index += 1  # Python has no ++ operator
    for word in words:
        counterList.append(index)
        if word not in myDict:
            myDict[word] = list()
            myDict[word].append(index)
        elif index not in myDict[word]:
            # append to the existing list instead of replacing it
            myDict[word].append(index)

This returns a dictionary with the words as keys and, as values, where each word occurs in the text.
