Converting CSS selectors to XPath selectors in Python without external libraries

Time: 2017-12-04 16:49:35

Tags: python css python-3.x xpath

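Is it possible to convert a complex CSS selector such as:

@[class="maintitle"] > .main-wrap #myid > .h1 bold

to XPath in Python 3 without using external libraries? Or perhaps with regular expressions?

I am currently able to convert

@[class="maintitle"]

to "//*[contains(@class,'maintitle')]", but I cannot come up with a general rule for converting these more complex selectors. Is it even possible?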

Edit: I cannot use cssselect.

1 answer:

Answer 0: (score: 2)

Try the following XPath:

//*[@class="maintitle"]/*[contains(@class, "main-wrap")]//*[@id="myid"]/*[contains(@class, "h1")]//bold

If you need a tool that can convert CSS to XPath, you can try lxml with cssselect:

from cssselect import GenericTranslator
from lxml.etree import XPath

css_selector = """[class="maintitle"] > .main-wrap #myid > .h1 bold"""
print(XPath(GenericTranslator().css_to_xpath(css_selector)))

Output (looks weird, but...):

descendant-or-self::*[@class = 'maintitle']/*[@class and contains(concat(' ', normalize-space(@class), ' '), ' main-wrap ')]/descendant-or-self::*/*[@id = 'myid']/*[@class and contains(concat(' ', normalize-space(@class), ' '), ' h1 ')]/descendant-or-self::*/bold

Note that you may also need to add // at the beginning.
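Since the question rules out cssselect, here is a rough standard-library sketch of how such a conversion could be done with `re` alone. It is not what cssselect does internally, and it only covers a small subset of CSS: type selectors, `*`, `#id`, `.class`, `[attr]`, `[attr="value"]`, and the descendant (space) and child (`>`) combinators. The names `css_to_xpath` and `_TOKEN` are made up for this example, and anything outside that subset (pseudo-classes, `+`, `~`, selector groups with `,`) is rejected with a `ValueError`.

```python
import re

# One token of the supported CSS subset (hypothetical helper for this sketch).
_TOKEN = re.compile(
    r"""
      (?P<child>\s*>\s*)                 # child combinator:  a > b
    | (?P<desc>\s+)                      # descendant combinator:  a b
    | (?P<tag>\*|[A-Za-z][\w-]*)         # type selector or *
    | \#(?P<id>[\w-]+)                   # id selector
    | \.(?P<cls>[\w-]+)                  # class selector
    | \[\s*(?P<attr>[\w-]+)\s*           # attribute selector, optional value
        (?:=\s*(?P<q>["']?)(?P<val>[^"'\]]*)(?P=q)\s*)?\]
    """,
    re.VERBOSE,
)


def css_to_xpath(selector):
    """Translate a small subset of CSS into an XPath 1.0 expression."""
    text = selector.strip()
    xpath, axis = "", "//"              # axis goes in front of the next step
    tag, preds, started = "*", [], False
    pos = 0
    while pos < len(text):
        m = _TOKEN.match(text, pos)
        if m is None:
            raise ValueError("unsupported selector syntax at %r" % text[pos:])
        pos = m.end()
        if m["child"] is not None or m["desc"] is not None:
            if not started:
                raise ValueError("combinator without a selector before it")
            # Finish the current step, then pick the axis for the next one.
            xpath += axis + tag + "".join(preds)
            axis = "/" if m["child"] is not None else "//"
            tag, preds, started = "*", [], False
            continue
        if m["tag"] is not None:
            if started:
                raise ValueError("a type selector must come first")
            tag = m["tag"]
        elif m["id"] is not None:
            preds.append("[@id='%s']" % m["id"])
        elif m["cls"] is not None:
            # Match one whitespace-separated token of the class attribute.
            preds.append(
                "[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]"
                % m["cls"]
            )
        elif m["val"] is None:
            preds.append("[@%s]" % m["attr"])
        else:
            # The token pattern never lets quotes into val, so '...' is safe.
            preds.append("[@%s='%s']" % (m["attr"], m["val"]))
        started = True
    if not started:
        raise ValueError("selector is empty or ends with a combinator")
    return xpath + axis + tag + "".join(preds)


print(css_to_xpath('[class="maintitle"] > .main-wrap #myid > .h1 bold'))
```

The class predicate uses the same `concat(' ', normalize-space(@class), ' ')` idiom as the cssselect output, because a plain `contains(@class, 'h1')` would also match a class such as `h10`.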