Python - XML parsing does not work in a nested for loop

Time: 2018-03-04 12:07:36

Tags: python xml parsing

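I am currently building a system that asks for a few words and, if a synonym for a word is found in an XML file, replaces the word with that synonym.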

When the words from wordproc are passed to the xmlparse function, it does not work. Any guidance? Or am I missing a key point? Any help would be great!

Here is the code:

def wordproc(self, word):

    lmtzr = nltk.WordNetLemmatizer()
    tokens = nltk.word_tokenize(word)
    tokens_lemma = [lmtzr.lemmatize(tokens) for tokens in tokens]
    tagged = nltk.pos_tag(tokens)
    chunking = nltk.chunk.ne_chunk(tagged)


    important_words = []
    unimportant_tags = ['MD', 'TO', 'DT', 'JJR', 'CC', 'VBZ']

    for x in chunking:

        if x[1] not in unimportant_tags:
            important_words.append(x[0])

    print(important_words)
    self.words = (important_words)
    print(self.words)
    self.loop = len(self.words)
    self.xmlparse(self.words, self.loop)

def xmlparse(self, words, loops):

    root = ElementTree.parse('data/word-test.xml').getroot()
    for i in range(loops):
        syn_loc = [word for word in root.findall('word') if word.findtext('mainword') == words]
        for nym in syn_loc:
            print(nym.attrib)
            word_loop = self.loop
            new_word = (nym.findtext('synonym'))
            words = new_word
    print(words)
    vf = videoPlay()
    vf.moviepy(words)

Edit: here is a short XML file:

<synwords>
<word>
    <mainword>affection</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
<word>
    <mainword>sweetie</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
<word>
    <mainword>appreciation</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
<word>
    <mainword>beloved</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
<word>
    <mainword>emotion</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
</synwords>

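What I want is the following. Given the list below, each word is compared against the XML and replaced by its synonym when a match is found: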

words = ["beloved", "sweetie","affection"]

1 answer:

Answer 0 (score: 1)

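Rather than searching and parsing the XML every time you need a word, I suggest mapping the words to their synonyms in a Python dictionary once; you can then look them up or manipulate them as needed. I used BeautifulSoup to parse the XML below: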

xml = """<synwords>
<word>
    <mainword>affection</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>

.
.
.

</synwords>"""

from bs4 import BeautifulSoup

soup = BeautifulSoup(xml, "html.parser")  # xml is your xml content
words = soup.find_all('word')
mapped_dict = {word.find("mainword").text: word.find("synonym").text for word in words}
print(mapped_dict)

Output:

{'sweetie': 'love', 'beloved': 'love', 'appreciation': 'love', 'affection': 'love', 'emotion': 'love'}
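
Once the dictionary exists, replacing the words is a single lookup per word. The sketch below is only an illustration, not a drop-in fix. It uses the standard library's ElementTree, as the question does, instead of BeautifulSoup. The helper names build_synonym_map and replace_with_synonyms are hypothetical, and the XML is a trimmed copy of the one in the question. Note that the question's xmlparse appears to compare each mainword string against the whole words list (`word.findtext('mainword') == words`), which can never be true; looking up one word at a time avoids that.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question (three of the five entries).
XML_TEXT = """<synwords>
<word>
    <mainword>affection</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
<word>
    <mainword>sweetie</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
<word>
    <mainword>beloved</mainword>
    <wordtag>N</wordtag>
    <synonym>love</synonym>
</word>
</synwords>"""


def build_synonym_map(xml_text):
    """Return {mainword: synonym} for every <word> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {w.findtext("mainword"): w.findtext("synonym") for w in root.findall("word")}


def replace_with_synonyms(words, synonym_map):
    """Replace each word that has a mapping; leave the others unchanged."""
    return [synonym_map.get(word, word) for word in words]


synonym_map = build_synonym_map(XML_TEXT)
result = replace_with_synonyms(["beloved", "sweetie", "affection", "hello"], synonym_map)
print(result)  # ['love', 'love', 'love', 'hello']
```

Words without an entry in the XML ("hello" here) pass through unchanged because dict.get falls back to the word itself.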