Question

So I have a very long string that contains a "$" sign and a part of speech. For example: "$dog -v. an animal that is often owned as a pet"

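What I want to do is pull out every word that comes after a "$" and before a "-v." and sort the words into a dictionary according to their part of speech. In the example above, the output should be {"dog": "-v."}. That way I will have a dictionary in which the keys whose value is "-v." are verbs.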

I think the best way to do this is with string slicing and for loops, but the best I have is:

my_dict = {}
for i in words:
    if i == "$":
        for j in words[i:]:
            if (j == "-") and (words[j:1] == "v") and (words[j:2] == "."):
                my_dict.append(words[i:j])
                break

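But the code above has a lot of errors, and I would rather people not point them out and instead just show me the right way to do this. Thanks for your help.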

Answer 1

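If I understand your question correctly, you only want to pick out the "-v." entries, which means a dictionary is not needed. Also, a dictionary must have unique keys, so you probably don't want to use one, since an animal may appear several times. A typical use of a dictionary would be to store the animal name as the key and the number of occurrences as the value. You are also trying to slice with a character: when you write for i in words, i is a character of the string, not an index. The same applies to j.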
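To illustrate the point about iteration, here is a minimal sketch. The sample string comes from the question; the variable names first_three and dollar_positions are only for illustration. Looping over a string yields one-character strings, so they cannot serve as slice positions, whereas enumerate() supplies the index alongside each character.

```python
words = "$dog -v. an animal that is often owned as a pet"

# Iterating over a string yields characters, not positions.
first_three = [ch for ch in words][:3]
print(first_three)  # ['$', 'd', 'o']

# enumerate() pairs each character with its index, which is what slicing needs.
dollar_positions = [i for i, ch in enumerate(words) if ch == "$"]
print(dollar_positions)  # [0]

# Using a character as a slice bound raises TypeError.
try:
    words["$":]
except TypeError as exc:
    print("TypeError:", exc)
```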

Here is a working example of the code, using a list instead of a dictionary:

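words = "$dog -v. an animal that is often owned as a pet"  # sample input from the question
my_dict = []  # a list, despite the name

for i in range(len(words)):
    if words[i] == "$":
        # search forward from the "$" for the "-v." marker
        for j in range(i, len(words) - 2):
            if words[j] == "-" and words[j+1] == "v" and words[j+2] == ".":
                my_dict.append(words[i+1:j].strip())  # the word between "$" and "-v."
                break

print(my_dict)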
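The same search can also be written with str.find instead of nested index loops. This is only a sketch of an alternative, not the answerer's method; the function name words_before_tag and the multi-entry sample string are made up for illustration, and it assumes each entry starts with "$" and that the tag belongs to the nearest preceding "$".

```python
def words_before_tag(text, tag="-v."):
    """Return every word that sits between a "$" and the given tag."""
    found = []
    start = text.find("$")
    while start != -1:
        end = text.find(tag, start)
        if end == -1:
            break  # no more tags anywhere after this "$"
        next_dollar = text.find("$", start + 1)
        # Only accept the tag if it belongs to this "$" entry, not a later one.
        if next_dollar == -1 or end < next_dollar:
            found.append(text[start + 1:end].strip())
        start = next_dollar
    return found


sample = "$dog -v. an animal $man -n. a person $jump -v. to leap"
print(words_before_tag(sample))  # ['dog', 'jump']
```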

Answer 2

If -v is always -v, you could use regular expressions like this:

import re
s = "$dog -v. an animal that is often owned as a pet"
word = re.findall(r'\$(.* -v)', s)
d = {}
lst = word[0].split()
d[lst[0]] = lst[1]
print (d)

Output:

{'dog': '-v'}
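The pattern above stops at "-v" and handles a single entry. As a hedged variation (not the answerer's code), the sketch below keeps the trailing dot so the result matches the {"dog": "-v."} form asked for in the question, and it collects several entries at once. The second entry in the sample string is invented for illustration, and the pattern assumes words and tags consist of \w characters.

```python
import re

s = "$dog -v. an animal that is often owned as a pet $cat -n. a small pet"

# \$\s*(\w+) : the word right after "$" (optional whitespace tolerated)
# \s+        : whitespace between the word and its tag
# (-\w+\.)   : the tag, e.g. "-v." or "-n.", including the trailing dot
pairs = re.findall(r"\$\s*(\w+)\s+(-\w+\.)", s)
d = dict(pairs)
print(d)  # {'dog': '-v.', 'cat': '-n.'}
```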

Answer 3

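I'm not sure what output you expect.

I assume you will have other part-of-speech tags as well, such as -n. Also, since you weren't clear about the final dictionary, I've made two versions; you can pick the one that fits your requirements.

You can try this: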

import re

sentence = '''
$dog -v. an animal that is often owned as a pet
$man -n. he is a man
$jump -v. he jumped blah blah
'''

animals = re.findall(r'\$(.*?)(?=\.)', sentence)    # lazily capture from $ up to the first '.' on the line

posDict = {}         #dict stores POS tags as keys.. eg '-v':['dog','jump']
animalDict = {}      #dict stores animals as keys .. eg 'dog':['-v']

for item in animals:
    word, pos = item.split()

    if posDict.get(pos,'') == '' :
        posDict[pos] = []

    if animalDict.get(word,'') == '' :
        animalDict[word] = []

    posDict[pos].append(word)
    animalDict[word].append(pos)

Output:

So posDict now has the POS tags (verb, noun, etc.) as keys.

>>> posDict
{'-v': ['dog', 'jump'], '-n': ['man']}

Retrieve all the verbs like this:

>>> posDict['-v']
['dog', 'jump']

On the other hand, if you want the animals and their associated tags:

>>> animalDict
{'man': ['-n'], 'dog': ['-v'], 'jump': ['-v']}
>>> animalDict['dog']
['-v']

Use whichever dictionary fits your requirements. Don't use both at the same time!
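For the tag-keyed version, collections.defaultdict removes the need to check whether a key already exists. This is a sketch under the same assumptions as the sample data above (one "$word -tag." entry per line); the name pos_to_words is illustrative and not part of the original answer.

```python
import re
from collections import defaultdict

sentence = """
$dog -v. an animal that is often owned as a pet
$man -n. he is a man
$jump -v. he jumped blah blah
"""

# Missing keys are created as empty lists on first access.
pos_to_words = defaultdict(list)
for word, pos in re.findall(r"\$(\w+)\s+(-\w+)\.", sentence):
    pos_to_words[pos].append(word)

print(dict(pos_to_words))  # {'-v': ['dog', 'jump'], '-n': ['man']}
```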
