Question

What I have:

from lxml import etree
myscript = "if(0 < 1){alert(\"Hello World!\");}"
html = etree.fromstring("<script></script>")

for element in html.findall('//script'):
    element.text = myscript

result = etree.tostring(html)

What I get:

>>> result
'<script>if(0 &lt; 1){alert("Hello World!");}</script>'

What I want is unescaped JavaScript:

>>> result
'<script>if(0 < 1){alert("Hello World!");}</script>'

Answer 1

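You can't. lxml.etree and ElementTree are XML parsers, so whatever you parse or create with them has to be valid XML. An unescaped < inside a node's text is not valid XML. It is valid HTML, but not valid XML.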
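
The XML-versus-HTML distinction can be observed with the standard library alone. The following is a minimal sketch, assuming xml.etree.ElementTree rather than lxml (the variable names are illustrative): the default XML serialization escapes <, while the method="html" serializer writes the text of a <script> element as-is. lxml's etree.tostring also accepts a method argument, but whether it behaves the same way in a given version is worth checking against its documentation.

```python
import xml.etree.ElementTree as ET

# Build an empty <script> element and set its text, as in the question.
script = ET.fromstring("<script></script>")
script.text = 'if(0 < 1){alert("Hello World!");}'

# Default XML serialization: "<" is escaped so the output stays valid XML.
xml_out = ET.tostring(script, encoding="unicode")

# HTML serialization: the text of <script> and <style> is written unescaped.
html_out = ET.tostring(script, encoding="unicode", method="html")

print(xml_out)
print(html_out)
```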

That's why, in XHTML, you usually have to add CDATA blocks inside <script> tags, so that you can put anything in there without worrying about a valid XML structure.
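
As a rough illustration of the CDATA approach, here is a sketch that assumes the standard library's xml.dom.minidom (lxml offers etree.CDATA for the same purpose). The myscript name is borrowed from the question; everything else is illustrative.

```python
from xml.dom.minidom import parseString

myscript = 'if(0 < 1){alert("Hello World!");}'

# Parse a minimal document whose root is an empty <script> element.
doc = parseString("<script></script>")

# Wrap the JavaScript in a CDATA section so "<" needs no escaping.
# (A CDATA section cannot itself contain the sequence "]]>".)
doc.documentElement.appendChild(doc.createCDATASection(myscript))

result = doc.documentElement.toxml()
print(result)
```

The output is still well-formed XML, because an XML parser treats everything between <![CDATA[ and ]]> as literal character data.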

But in your case you only want to produce HTML, and for that you should use an HTML parser. For example, BeautifulSoup:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<script></script>', 'html.parser')
>>> soup.find('script').string = 'if(0 < 1){alert("Hello World!");}'
>>> str(soup)
'<script>if(0 < 1){alert("Hello World!");}</script>'

Answer 2

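Your approach fails because you are trying to change the element's "text" content, whereas you need to change/insert/append an element of your own. See this sample: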

In [1]: from lxml import html

In [2]: myscript = "<script>if(0 < 1){alert(\"Hello World!\");}</script>"

In [3]: template = html.fromstring("<script></script>")

# just a quick hack to get the <script> element without <html> <head>
In [4]: script_element = html.fromstring(myscript).xpath("//script")[0]

# insert new element then remove the old one
In [10]: for element in template.xpath("//script"):
   ....:     element.getparent().insert(0, script_element)
   ....:     element.getparent().remove(element)
   ....:

In [11]: print(html.tostring(template, encoding="unicode"))
<html><head><script>if(0 < 1){alert("Hello World!");}</script></head></html>

So yes, you can still technically use lxml to insert an element. I suggest using lxml.html rather than etree, because lxml.html is friendlier for HTML elements.

How to insert JavaScript into a <script> element?
