I have a string (stored in a file named token.txt) that contains the following text:
<nexttoken>test1</nexttoken>
<nexttoken>test2</nexttoken>
I want to remove the <nexttoken> tags and display the second line, i.e. test2.
Output = test2
What I have tried:
with open("token.txt") as f:
    for line in f:
        if "nexttoken" in line:
            lines_contain_next_token = line
            # Replace the closing and opening tags with newlines
            n2 = lines_contain_next_token.replace("</nexttoken>", "\n")
            n3 = n2.replace("<nexttoken>", "\n")
            # Turn the newlines into commas and strip spaces
            n4 = n3.replace("\n", ",")
            n5 = n4.replace(' ', '')
            print(n5)
Answer 0 (score: 0)
If the text contains only two nexttoken tags, you can use a regular expression to extract the value you need.
For example:
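import re

# token.txt is the file named in the question
with open("token.txt", "r") as infile:
    data = infile.read()

# Raw string avoids invalid escape sequences; the non-greedy group
# keeps each match inside a single pair of tags
c = re.findall(r"<nexttoken>(.*?)</nexttoken>", data)
print(c[1])  # index 1 is the second match
Output: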
test2
Note: if your source file is an XML file, I strongly recommend using a Python XML parser instead.
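As a rough illustration of the parser approach, here is a minimal sketch using the standard library's xml.etree.ElementTree. It assumes the file holds only well-formed <nexttoken> elements. Because the sample has no single root element, the text is wrapped in a made-up <root> tag before parsing. The helper name extract_tokens is hypothetical and not part of any library.

```python
import xml.etree.ElementTree as ET


def extract_tokens(text):
    """Return the text of every <nexttoken> element found in `text`."""
    # Assumption: the input is a sequence of sibling elements with no
    # single root, so wrap it in a placeholder <root> to make it parseable.
    root = ET.fromstring("<root>" + text + "</root>")
    return [element.text for element in root.iter("nexttoken")]


# Sample data copied from the question; in practice read it from token.txt
sample = "<nexttoken>test1</nexttoken>\n<nexttoken>test2</nexttoken>\n"
tokens = extract_tokens(sample)
print(tokens[1])  # test2
```

Unlike the regular expression, this raises xml.etree.ElementTree.ParseError on malformed input rather than silently returning fewer matches, which may or may not be what you want.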