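I am building a system that takes a few words and replaces each one with its synonym when that word is found in an XML file.
The code and a short sample of the XML file follow.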
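When the words from wordproc are passed to the xmlparse function, it does not work. Any guidance? Am I missing a key point? Any help would be great!
Here is the code, followed (as an edit) by a short XML file: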
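import nltk
from xml.etree import ElementTree

# Both functions are methods of the same class (class definition not shown).
def wordproc(self, word):
    lmtzr = nltk.WordNetLemmatizer()
    tokens = nltk.word_tokenize(word)
    tagged = nltk.pos_tag(tokens)
    unimportant_tags = ['MD', 'TO', 'DT', 'JJR', 'CC', 'VBZ']
    # Filter the flat (token, tag) pairs. ne_chunk wraps named entities in
    # Tree nodes, for which x[1] is not a POS tag.
    # The original computed lemmas but never used them; lemmatizing the kept
    # tokens lets inflected forms match the <mainword> entries.
    important_words = [lmtzr.lemmatize(token) for token, tag in tagged
                       if tag not in unimportant_tags]
    print(important_words)
    self.words = important_words
    self.loop = len(self.words)
    self.xmlparse(self.words, self.loop)

def xmlparse(self, words, loops):
    root = ElementTree.parse('data/word-test.xml').getroot()
    for i in range(loops):
        # Compare against words[i]; comparing a string against the whole
        # list (== words) never matches.
        syn_loc = [word for word in root.findall('word')
                   if word.findtext('mainword') == words[i]]
        for nym in syn_loc:
            # Replace only this entry; assigning to words itself would
            # overwrite the whole list with a single string.
            words[i] = nym.findtext('synonym')
    print(words)
    vf = videoPlay()  # videoPlay is defined elsewhere in the asker's project
    vf.moviepy(words)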
Edit: here is a short XML file (data/word-test.xml):
<synwords>
    <word>
        <mainword>affection</mainword>
        <wordtag>N</wordtag>
        <synonym>love</synonym>
    </word>
    <word>
        <mainword>sweetie</mainword>
        <wordtag>N</wordtag>
        <synonym>love</synonym>
    </word>
    <word>
        <mainword>appreciation</mainword>
        <wordtag>N</wordtag>
        <synonym>love</synonym>
    </word>
    <word>
        <mainword>beloved</mainword>
        <wordtag>N</wordtag>
        <synonym>love</synonym>
    </word>
    <word>
        <mainword>emotion</mainword>
        <wordtag>N</wordtag>
        <synonym>love</synonym>
    </word>
</synwords>
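The result I want: starting from the words below, each word should be replaced by its synonym after comparison with the XML.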
words = ["beloved", "sweetie","affection"]
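Answer 0 (score: 1)
Instead of searching and parsing the XML every time you look up a word, I suggest mapping the words to their synonyms in a Python dictionary, which you can then look up or manipulate as needed. I used BeautifulSoup to parse the XML below: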
from bs4 import BeautifulSoup

xml = """<synwords>
    <word>
        <mainword>affection</mainword>
        <wordtag>N</wordtag>
        <synonym>love</synonym>
    </word>
    .
    .
    .
</synwords>"""

soup = BeautifulSoup(xml, "html.parser")  # xml is your XML content
words = soup.find_all('word')
# Map each <mainword> to its <synonym>
mapped_dict = {word.find("mainword").text: word.find("synonym").text for word in words}
print(mapped_dict)
Output:
{'sweetie': 'love', 'beloved': 'love', 'appreciation': 'love', 'affection': 'love', 'emotion': 'love'}
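Once such a dictionary exists, replacing the words becomes a plain lookup. The following is a minimal sketch of that step, not code from the question or the answer: it builds the mapping with the standard library's xml.etree.ElementTree instead of BeautifulSoup, and the names SAMPLE_XML, build_synonym_map and replace_with_synonyms are hypothetical, chosen only for illustration. It assumes the XML has the same structure as the sample in the question.

```python
from xml.etree import ElementTree

# Assumed inline sample, mirroring the structure of the XML in the question.
SAMPLE_XML = """<synwords>
    <word><mainword>affection</mainword><wordtag>N</wordtag><synonym>love</synonym></word>
    <word><mainword>sweetie</mainword><wordtag>N</wordtag><synonym>love</synonym></word>
    <word><mainword>beloved</mainword><wordtag>N</wordtag><synonym>love</synonym></word>
</synwords>"""


def build_synonym_map(xml_text):
    """Parse the XML once and map each <mainword> to its <synonym>."""
    root = ElementTree.fromstring(xml_text)
    return {w.findtext("mainword"): w.findtext("synonym")
            for w in root.findall("word")}


def replace_with_synonyms(words, synonym_map):
    """Return a new list; words without an entry are kept unchanged."""
    return [synonym_map.get(word, word) for word in words]


synonym_map = build_synonym_map(SAMPLE_XML)
result = replace_with_synonyms(["beloved", "sweetie", "affection"], synonym_map)
print(result)
```

Building the dictionary once keeps each lookup independent of the XML tree, and using dict.get with the word itself as the default leaves unknown words untouched rather than dropping them.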