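I am trying to use BeautifulSoup to read and list the text of td elements based on their data-property attribute:
tr = BeautifulSoup(str(input), 'lxml')
tags = tr.findAll('td')
for t in tags:
    if t.attrs['data-property'] == 'OSVersion':
        ver = t.text
This gives me an error with no further details:
KeyError: 'data-property'
See the sample tr below, which I use as input: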
<tr >
<td class=" resizable reorderable" data-property="OSVersion">10.2.1</td>
<td class=" resizable reorderable" data-property="DisplayModel">iPad Mini 4 (64 GB Space Gray)</td>
<td class=" resizable reorderable" data-property="PhoneNumber"></td>
<td class="grid_customvariable_colsize resizable reorderable" data-property="DeviceCustomAttributeDetails"></td>
<td class=" resizable reorderable" data-property="DeviceTagDetails"></td>
<td class=" resizable reorderable" data-property="EnrollmentStatusName"> <div class="grid_resizable_col">Enrolled</div>
</td>
<td class=" resizable reorderable" data-property="ComplianceStatusName"> <div class="grid_resizable_col">Compliant</div>
</td>
<td class=" resizable reorderable" data-property="IMEI"></td>
<td class=" resizable reorderable" data-property="LocationGroupName">iOS</td>
<td class=" resizable reorderable" data-property="IsCompromisedYN">No</td>
<td class=" resizable reorderable" data-property="HomeCarrier">Not Reported </td>
<td class=" resizable reorderable" data-property="CurrentCarrier">Not Reported </td>
<td class=" resizable reorderable" data-property="WiFiIPAddress"></td>
<td class=" resizable reorderable" data-property="Notes"></td>
<td class=" resizable reorderable" data-property="WnsStatus"> <span>Disconnected</span>
</td>
<td class=" resizable reorderable" data-property="DmLastSeenTime"> <span class="icon arrow_down_stretched red">-</span>
</td>
</tr>
If I take a single dict like the following, it works fine:
d={'class': ['', 'resizable', 'reorderable'], 'data-property': 'FriendlyName'}
print d['data-property']
Does anyone know how to solve this?
Thanks
Answer 0: (score: 2)
There is no need to mess with attrs:
from bs4 import BeautifulSoup as BS
html = """<tr >
<td class=" resizable reorderable" data-property="OSVersion">10.2.1</td>
<td class=" resizable reorderable" data-property="DisplayModel">iPad Mini 4 (64 GB Space Gray)</td>
<td class=" resizable reorderable" data-property="PhoneNumber"></td>
<td class="grid_customvariable_colsize resizable reorderable" data-property="DeviceCustomAttributeDetails"></td>
<td class=" resizable reorderable" data-property="DeviceTagDetails"></td>
<td class=" resizable reorderable" data-property="EnrollmentStatusName"> <div class="grid_resizable_col">Enrolled</div>
</td>
<td class=" resizable reorderable" data-property="ComplianceStatusName"> <div class="grid_resizable_col">Compliant</div>
</td>
<td class=" resizable reorderable" data-property="IMEI"></td>
<td class=" resizable reorderable" data-property="LocationGroupName">iOS</td>
<td class=" resizable reorderable" data-property="IsCompromisedYN">No</td>
<td class=" resizable reorderable" data-property="HomeCarrier">Not Reported </td>
<td class=" resizable reorderable" data-property="CurrentCarrier">Not Reported </td>
<td class=" resizable reorderable" data-property="WiFiIPAddress"></td>
<td class=" resizable reorderable" data-property="Notes"></td>
<td class=" resizable reorderable" data-property="WnsStatus"> <span>Disconnected</span>
</td>
<td class=" resizable reorderable" data-property="DmLastSeenTime"> <span class="icon arrow_down_stretched red">-</span>
</td>
</tr>"""
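soup = BS(html, 'html.parser')  # name the parser explicitly to avoid a warning
tags = soup.findAll('td')
for t in tags:
    # A tag can be indexed by attribute name directly
    if t['data-property'] == 'OSVersion':
        ver = t.text
        print(ver)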
Output:
10.2.1
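Note that indexing a tag with t['data-property'] still raises KeyError for any td that lacks the attribute, just like t.attrs['data-property']. The following is a minimal sketch of one way around that, letting BeautifulSoup match on the attribute itself. It assumes bs4 with the built-in html.parser, and the shortened sample row (including the first td without a data-property) is made up for illustration:

```python
from bs4 import BeautifulSoup

# Shortened sample row. The first <td> deliberately has no data-property
# attribute (an assumption, added here to reproduce the KeyError case).
html = """<tr>
<td class="checkbox"></td>
<td class=" resizable reorderable" data-property="OSVersion">10.2.1</td>
<td class=" resizable reorderable" data-property="LocationGroupName">iOS</td>
</tr>"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag, or None if nothing matches,
# so cells without the attribute are simply skipped.
cell = soup.find("td", attrs={"data-property": "OSVersion"})
ver = cell.get_text(strip=True) if cell is not None else None
print(ver)
```

Because the filtering happens inside find(), no explicit loop or key check is needed.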
Answer 1: (score: 0)
Make the following change in your code, since you are getting a KeyError:
if 'data-property' in t.attrs and t.attrs['data-property']== 'OSVersion':
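My answer, with demo code:
In BeautifulSoup 3 (the version imported in the demo), t.attrs returns a list of tuples, e.g. [(u'class', u' resizable reorderable'), (u'data-property', u'OSVersion')].
We need to convert it to a dictionary with dict(), e.g. attributes = dict(t.attrs)
In the condition, check that the key exists before reading it, e.g. if 'data-property' in attributes and attributes['data-property'] == 'OSVersion':
Demo: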
import BeautifulSoup  # BeautifulSoup 3
tr = BeautifulSoup.BeautifulSoup(data)  # data holds the HTML string
tags = tr.findAll('td')
for t in tags:
    attributes = dict(t.attrs)  # list of (name, value) tuples -> dict
    if 'data-property' in attributes and attributes['data-property'] == 'OSVersion':
        ver = t.text
Let me know if you have any further questions. Feel free to ping me.
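With bs4 (BeautifulSoup 4), t.attrs is already a dict, so the dict() conversion is not needed there. A minimal sketch of the same key check using Tag.get(), which returns None when the attribute is missing. It assumes bs4 with the built-in html.parser, and the short sample row is made up for illustration:

```python
from bs4 import BeautifulSoup

# Made-up sample: the first <td> has no data-property attribute.
html = """<tr>
<td class="checkbox"></td>
<td class=" resizable reorderable" data-property="OSVersion">10.2.1</td>
</tr>"""

soup = BeautifulSoup(html, "html.parser")

ver = None
for t in soup.find_all("td"):
    # Tag.get() returns None instead of raising KeyError
    # when the attribute is absent.
    if t.get("data-property") == "OSVersion":
        ver = t.text

print(ver)
```

This keeps the original loop structure and only replaces the lookup that raised the error.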
Answer 2: (score: 0)
Here you go. Code:
from bs4 import BeautifulSoup
with open("xmlfile.xml", "r") as f:  # opening xml file
    content = f.read()  # xml content stored in this variable
soup = BeautifulSoup(content, "lxml")
for values in soup.findAll("td"):
    if values["data-property"] == "OSVersion":
        print(values.text)
Output:
10.2.1
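Since the question asks to list the td text by data-property, it may be convenient to collect every cell into a dict in one pass. This is only a sketch under stated assumptions: it uses bs4 with the built-in html.parser, the sample row is a shortened version of the one in the question with one attribute-less td added for illustration, and the name properties is hypothetical:

```python
from bs4 import BeautifulSoup

# Shortened sample row; the last <td> has no data-property (added for illustration).
html = """<tr>
<td class=" resizable reorderable" data-property="OSVersion">10.2.1</td>
<td class=" resizable reorderable" data-property="EnrollmentStatusName"> <div class="grid_resizable_col">Enrolled</div>
</td>
<td class=" resizable reorderable" data-property="IMEI"></td>
<td>no attribute here</td>
</tr>"""

soup = BeautifulSoup(html, "html.parser")

# attrs={"data-property": True} matches only tags that carry the attribute,
# so indexing td["data-property"] inside the comprehension is safe.
properties = {
    td["data-property"]: td.get_text(strip=True)
    for td in soup.find_all("td", attrs={"data-property": True})
}

for name, text in properties.items():
    print(name, "=", repr(text))
```

get_text(strip=True) also trims the whitespace around nested div and span content, so cells such as EnrollmentStatusName come back without the surrounding blanks.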