I want to use lxml / XPath to find certain img elements and write a short PHP snippet into their src attribute. Something like this:
from lxml import html
htmldoc = html.document_fromstring(htmlstr)
imgs = htmldoc.xpath("//*[@class='someclass']/img")
imgs[0].attrib['src'] = "<?php echo get_img_file(); ?>"
processedHTML = html.tostring(htmldoc, pretty_print=True)
with open("test.php", "w", encoding="utf-8") as outfile:
    outfile.write(processedHTML.decode("utf-8"))
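Illegal characters (such as < and >) get escaped as HTML entities in the output. Is there a way to configure lxml so that these characters are written to the document unescaped? Thanks!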
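For reference, one workaround that avoids fighting the serializer might be to write a plain placeholder token into the attribute and swap in the PHP snippet after serialization. This is only a sketch: the sample `htmlstr` and the `PLACEHOLDER` token are made up for illustration, and it assumes the token never occurs anywhere else in the document.

```python
from lxml import html

# Hypothetical sample input; the real htmlstr is not shown in the post.
htmlstr = (
    '<html><body><div class="someclass">'
    '<img src="old.png"></div></body></html>'
)

# Hypothetical token; it must not appear anywhere else in the document.
PLACEHOLDER = "__PHP_IMG_SRC__"
PHP_SNIPPET = "<?php echo get_img_file(); ?>"

htmldoc = html.document_fromstring(htmlstr)

# Put a harmless token into src so the serializer has nothing to escape.
for img in htmldoc.xpath("//*[@class='someclass']/img"):
    img.attrib["src"] = PLACEHOLDER

# encoding="unicode" makes tostring return str instead of bytes.
serialized = html.tostring(htmldoc, pretty_print=True, encoding="unicode")

# Swap the token for the raw PHP snippet after serialization.
processed_html = serialized.replace(PLACEHOLDER, PHP_SNIPPET)

with open("test.php", "w", encoding="utf-8") as outfile:
    outfile.write(processed_html)
```

The replacement happens on the serialized string, so lxml never sees the < and > characters and has no chance to escape them.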