Python: How do I split an XML string by tag?

Time: 2018-05-03 15:51:36

Tags: python xml

I have a string (in a file named token.txt) that contains the following text.

<nexttoken>test1</nexttoken>
<nexttoken>test2</nexttoken>

I want to strip the <nexttoken> tags and display only the value on the second line, i.e. test2.

Expected output: test2

What I tried:

with open("token.txt") as f:
    for line in f:
        if "nexttoken" in line:
            lines_contain_next_token = line
            n2 = lines_contain_next_token.replace("</nexttoken>", "\n")
            n3 = n2.replace("<nexttoken>", "\n")
            n4 = n3.replace("\n", ",")
            n5 = n4.replace(' ', '')
            print(n5)

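1 Answer:

Answer 0 (score: 0)

If the text contains only the two nexttoken tags, each on its own line, you can use a regular expression to extract the value you need.

For example: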

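import re

filename = "token.txt"  # the file from the question
with open(filename, "r") as infile:
    data = infile.read()

# Raw string avoids invalid escapes; the non-greedy group keeps matches separate
c = re.findall(r"<nexttoken>(.*?)</nexttoken>", data)
print(c[1])  # second match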

Output:

test2

Note: if your source file is an XML file, I strongly recommend using a Python XML parser instead.
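As a rough sketch of the parser route, something like the following could work with the standard-library xml.etree.ElementTree module. The snippet in the question has no single root element, so it is not well-formed XML on its own; the `<root>` wrapper and the `extract_tokens` helper here are assumptions made for illustration, not part of the original file.

```python
import xml.etree.ElementTree as ET


def extract_tokens(text):
    """Return the text of every <nexttoken> element, in document order."""
    # Assumption: the fragment lacks a root element, so wrap it in a
    # hypothetical <root> tag to make it parseable as XML.
    root = ET.fromstring("<root>" + text + "</root>")
    return [element.text for element in root.iter("nexttoken")]


# Same content as token.txt in the question, inlined so the sketch is self-contained
sample = "<nexttoken>test1</nexttoken>\n<nexttoken>test2</nexttoken>\n"
tokens = extract_tokens(sample)
print(tokens[1])
```

To read from the file instead, you would pass the result of `open("token.txt").read()` to `extract_tokens`. Unlike the regex, this approach also copes with attributes on the tags or values that contain escaped characters such as `&amp;`.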