Find the parent tag based on the text of multiple child tags
Suppose I have a piece of XML in a file, as shown below:
<Client name="Jack">
<Type>premium</Type>
<Usage>unlimited</Usage>
<Payment>online</Payment>
</Client>
<Client name="Jill">
<Type>demo</Type>
<Usage>limited</Usage>
<Payment>online</Payment>
</Client>
<Client name="Ross">
<Type>premium</Type>
<Usage>unlimited</Usage>
<Payment>online</Payment>
</Client>
I am using BeautifulSoup to parse the values.
Here I need to get the client name (from the parent tag) based on the text of a child tag.
My function is as follows:
def get_client_for_usage(self, usage):
    """
    To get the client name for specified usage
    """
    usage_items = self.parser.findAll("client")
    client_for_usage = []
    for usages in usage_items:
        try:
            client_set = usages.find("usage", text=usage).findParent("client")
            client_attr = dict(client_set.attrs)
            client_name = client_attr[u'name']
            client_for_usage.append(client_name)
        except AttributeError:
            continue
    return client_for_usage
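Now I need to get the client name based on two criteria: usage and type.
So I need to pass both the type and the usage in order to get the client name.
Could someone help me with this? If the question is unclear, please let me know so I can edit it as needed.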
Answer 0 (score: 1)
Something like:
def get_client_for_usage(self, usage, tpe):
    """
    To get the client name for specified usage and type
    """
    usage_items = self.parser.findAll("client")
    client_for_usage = []
    for usages in usage_items:
        try:
            # find() returns None when nothing matches, so findParent() raises
            # AttributeError and the client is skipped.
            client_set = usages.find("usage", text=usage).findParent("client")
            typ_node = usages.find("type", text=tpe).findParent("client")
            if client_set == typ_node:
                client_for_usage.append(client_set['name'])
        except AttributeError:
            continue
    return client_for_usage
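For anyone who wants to try this outside the asker's class, here is a minimal standalone sketch of the same idea. It assumes a recent BeautifulSoup 4 (4.4.0 or later, where `string=` replaces the older `text=` argument) and uses the built-in `html.parser`, which lowercases tag names. The function name `clients_matching` and the `XML` constant are made up for illustration; they are not part of the original code.

```python
from bs4 import BeautifulSoup

# Sample data copied from the question.
XML = """<Client name="Jack">
<Type>premium</Type>
<Usage>unlimited</Usage>
<Payment>online</Payment>
</Client>
<Client name="Jill">
<Type>demo</Type>
<Usage>limited</Usage>
<Payment>online</Payment>
</Client>
<Client name="Ross">
<Type>premium</Type>
<Usage>unlimited</Usage>
<Payment>online</Payment>
</Client>"""


def clients_matching(soup, usage, type_):
    """Return the names of clients whose usage AND type text both match."""
    names = []
    for client in soup.find_all("client"):
        # find() returns None when no child tag has the requested text.
        has_usage = client.find("usage", string=usage) is not None
        has_type = client.find("type", string=type_) is not None
        if has_usage and has_type:
            names.append(client["name"])
    return names


soup = BeautifulSoup(XML, "html.parser")
print(clients_matching(soup, "unlimited", "premium"))
```

Checking each `find()` result against `None` avoids relying on `AttributeError` for control flow, and since the search already starts from the client element there is no need to walk back up with `findParent`.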
Answer 1 (score: 0)
html = '''<Client name="Jack">
<Type>premium</Type>
<Usage>unlimited</Usage>
<Payment>online</Payment>
</Client>
<Client name="Jill">
<Type>demo</Type>
<Usage>limited</Usage>
<Payment>online</Payment>
</Client>
<Client name="Ross">
<Type>premium</Type>
<Usage>unlimited</Usage>
<Payment>online</Payment>
</Client>'''
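import bs4
import collections

soup = bs4.BeautifulSoup(html, 'lxml')
# Map each (type, usage) pair to the list of client names that have it.
d = collections.defaultdict(list)
for client in soup('client'):
    # Each client has exactly three text children: type, usage, payment.
    type_, usage, payment = client.stripped_strings
    d[(type_, usage)].append(client['name'])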
Output:
defaultdict(list,
            {('demo', 'limited'): ['Jill'],
             ('premium', 'unlimited'): ['Jack', 'Ross']})
Use type and usage as the key and the client name as the value to build a dict, then get the names by looking up the key.
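As a rough sketch of the lookup step, the snippet below builds the same kind of index from a hard-coded list and then queries it. The `records` list is a hypothetical stand-in for the parsed clients (values taken from the sample XML), and `index` and `names_for` are illustrative names, so the example runs with the standard library alone.

```python
from collections import defaultdict

# Hypothetical stand-in for the parsed clients: (name, type, usage).
records = [
    ("Jack", "premium", "unlimited"),
    ("Jill", "demo", "limited"),
    ("Ross", "premium", "unlimited"),
]

# Index client names by their (type, usage) pair.
index = defaultdict(list)
for name, type_, usage in records:
    index[(type_, usage)].append(name)


def names_for(type_, usage):
    """Look up client names without adding new keys to the index."""
    # index[key] would insert an empty list for an unknown key;
    # .get() returns the default and leaves the index unchanged.
    return index.get((type_, usage), [])


print(names_for("premium", "unlimited"))
print(names_for("demo", "unlimited"))
```

Building the index once pays off when many (type, usage) queries are run against the same document; for a single query, filtering the clients directly is probably simpler.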