I have the following XPath expression:
//a[@attribute='my-attribute']
It matches as expected when the HTML being searched contains the following element:
<a attribute="my-attribute">Some text</a>
However, if the element has an <svg> tag nested inside it, the XPath returns no match:
<a attribute="my-attribute">
<svg xmlns="http://www.w3.org/2000/svg" width="100%" height="100%"
viewBox="0 0 24 24" focusable="false"></svg>
</a>
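Why does the XPath fail to match in this case? Is there a way to modify the expression so that it matches?
Edit:
Apparently this has to do with the namespace on the <svg> element. Using the local-name() function made the expression match in the XPath tester I was using: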
//*[local-name()='a' and @attribute='my-attribute']
However, this still does not match when run through Selenium WebDriver. Any ideas on how to get this working with Selenium?
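Answer 0 (score: 3)
You are likely being misled by how your XPath hosting environment renders the selected a element.
Adding an svg element inside the a element does not prevent the following expression from selecting it: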
//a[@attribute='my-attribute']
For
<a attribute="my-attribute">Some text</a>
the string value of the a element contains more than whitespace characters, whereas for
<a attribute="my-attribute">
<svg xmlns="http://www.w3.org/2000/svg" width="100%" height="100%"
viewBox="0 0 24 24" focusable="false"></svg>
</a>
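the string value of the a element contains only whitespace, so a text rendering of the result shows nothing, even though the element was selected.
If you evaluate count(//a[@attribute='my-attribute']), you will likely get the same result in both cases.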
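The point above can be checked with a small sketch. It uses Python's standard-library `xml.etree.ElementTree`, which supports only a subset of XPath (there is no `count()`), so counting the matches with `len()` stands in for the `count(...)` expression. The wrapping `<div>` and the helper name `inspect` are assumptions made for illustration, and a full XPath engine or a browser may report results differently.

```python
import xml.etree.ElementTree as ET

# The two fragments from the question, wrapped in a <div> so each has one root.
TEXT_ONLY = '<div><a attribute="my-attribute">Some text</a></div>'
WITH_SVG = (
    '<div><a attribute="my-attribute">\n'
    '<svg xmlns="http://www.w3.org/2000/svg" width="100%" height="100%"\n'
    'viewBox="0 0 24 24" focusable="false"></svg>\n'
    '</a></div>'
)

def inspect(markup):
    """Return (number of matching a elements, string value of the first match)."""
    root = ET.fromstring(markup)
    matches = root.findall(".//a[@attribute='my-attribute']")
    # The string value is the concatenation of all descendant text nodes.
    string_value = "".join(matches[0].itertext()) if matches else ""
    return len(matches), string_value

text_count, text_value = inspect(TEXT_ONLY)
svg_count, svg_value = inspect(WITH_SVG)
print(text_count, repr(text_value))
print(svg_count, repr(svg_value))
```

In both cases one element is selected; only the string value differs, and in the svg case it consists of whitespace alone.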
Answer 1 (score: 0)
Here is a possible solution in VB.NET.
Imports System.Text.RegularExpressions
Imports System.Xml

Public Class XmlNodeListWithNamespace
    ' See https://stackoverflow.com/questions/55385520/xpath-doesnt-match-when-desired-element-contains-child-elements
    ' where Andrew Mairose commented (2019-03-28 13:07, inferred from the page's timestamps):
    '   "@JaSON I would have thought the same thing,
    '   but removing the xmlns attribute from the svg tag
    '   causes the //a[@attribute='my-attribute'] expression to match."
    ' Therefore, I first considered deleting all occurrences of
    '   xmlns="" and xmlns="http://www.w3.org/1999/xhtml"
    ' I did this using the following replacement:
    '   gstrHtml = Regex.Replace(
    '       input:=gstrHtml,
    '       pattern:=" *xmlns=""[^""]*""",
    '       replacement:="",
    '       options:=RegexOptions.IgnoreCase
    '   )
    ' However, the solution below retains the namespace, while avoiding unsightly xpath strings.
    ''' <summary>
    ''' For a given xpath, returns an XmlNodeList, taking account of the xmlns namespace.
    ''' </summary>
    ''' <param name="oXmlDocument">The current XML document.</param>
    ''' <param name="xpath">A normal xpath string, without any namespace qualifier.</param>
    ''' <returns>The XmlNodeList for the given xpath.</returns>
    Public Shared Function NodeList(
        oXmlDocument As XmlDocument,
        xpath As String
    ) As XmlNodeList
        Dim strXpath As String = xpath
        ' Insert Namespace Qualifier. For example,
        ' "//pre" becomes "//x:pre"
        ' "/html/body/form/div/pre" becomes "/x:html/x:body/x:form/x:div/x:pre"
        ' "//div[@id='nv_bot_contents']/pre" becomes "//x:div[@id='nv_bot_contents']/x:pre"
        ' "//div[@id='nv_bot_contents']/pre[@data-xxx='X2']" becomes "//x:div[@id='nv_bot_contents']/x:pre[@data-xxx='X2']"
        ' "//div[@id='nv_bot_contents']/pre[@data-xxx]" becomes "//x:div[@id='nv_bot_contents']/x:pre[@data-xxx]"
        ' "//pre[@data-xxx]" becomes "//x:pre[@data-xxx]"
        strXpath = Regex.Replace(
            input:=strXpath,
            pattern:="(/)(\w+)",
            replacement:="$1x:$2"
        )
        ' See https://stackoverflow.com/questions/40796231/how-does-xpath-deal-with-xml-namespaces/40796315#40796315
        Dim oXmlNamespaceManager As New XmlNamespaceManager(nameTable:=oXmlDocument.NameTable)
        oXmlNamespaceManager.AddNamespace("x", "http://www.w3.org/1999/xhtml")
        Dim oXmlNodeList As XmlNodeList = oXmlDocument.SelectNodes(
            xpath:=strXpath,
            nsmgr:=oXmlNamespaceManager
        )
        Return oXmlNodeList
    End Function
End Class
Example call:
Dim oXmlNodeList As XmlNodeList =
    XmlNodeListWithNamespace.NodeList(
        oXmlDocument:=oXmlDocument,
        xpath:="//pre"
    )
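For readers not using .NET, the same idea (insert a prefix after each slash, then bind that prefix to the XHTML namespace) can be sketched in Python with the standard-library `xml.etree.ElementTree`. This is a rough analogue under stated assumptions, not a port of the class above: the helper names `qualify` and `node_list` and the sample document are hypothetical, paths are written relative (`.//pre`) because ElementTree does not accept absolute paths on an element, and the regex shares the original's limitation that a slash inside a quoted predicate value would also be rewritten.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def qualify(path, prefix="x"):
    """Prefix every element step that follows a slash: './/pre' -> './/x:pre'."""
    return re.sub(r"(/)(\w+)", rf"\g<1>{prefix}:\2", path)

def node_list(root, path, prefix="x", namespace=XHTML_NS):
    """Find elements for an unqualified path, resolving names in the given namespace."""
    return root.findall(qualify(path, prefix), {prefix: namespace})

# Hypothetical sample document that declares XHTML as its default namespace.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<div id="nv_bot_contents"><pre data-xxx="X2">hello</pre></div>'
    '</body></html>'
)

root = ET.fromstring(DOC)
unqualified = root.findall(".//pre")  # ignores the default namespace
qualified = node_list(root, ".//div[@id='nv_bot_contents']/pre")
print(len(unqualified), len(qualified))
```

The unqualified lookup finds nothing because every element in the sample lives in the XHTML namespace, while the prefixed path selects the `pre` element.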