I am trying to split up the following tag:
<h3><a href="#AC Adapter" onclick="getProductsBasedOnCategoryID('Asus','AC Adapter','ET1611PUT','6941', this, 'E Series')">AC Adapter
</a></h3>
using the following code:
print "FETCHING CATEGORY"
atag = s.h3
for data in atag:
    while getattr(atag, 'name', None) != 'h3':
        atag = atag.nextSibling
    atag.a
    atag = literal_eval('(' + atag.nextSibling.replace(', this', '').split('(', 1)[1])
    print atag
However, I get the following error:
File "//CPSBS/RedirectedFolders/aysha/My Documents/asus_tables(edited) a tags.py", line 84, in <module>
atag = literal_eval('(' + atag.nextSibling.replace(', this', '').split('(', 1)[1])
IndexError: list index out of range
I guess I am doing something wrong? This a tag also has an onclick attribute that I would like to access, so how would I feed it into this code?
This is the URL I am parsing data from: http://www.asusparts.eu/partfinder/Asus/All In One/E Series
[EDIT]
The navigational tree I am trying to retrieve data from:
<div id="accordion" class="ui-accordion ui-widget ui-helper-reset ui-accordion-icons" style="width: 760px;" role="tablist">
<h3 class="ui-accordion-header ui-helper-reset ui-state-active ui-corner-top" role="tab" aria-expanded="true" aria-selected="true" tabindex="0">
<span class="ui-icon ui-icon-triangle-1-s"></span>
<a onclick="getProductsBasedOnCategoryID('Asus','AC Adapter','ET10B','6941', this, 'E Series')" href="#AC Adapter" tabindex="-1" loaded="Loaded">AC Adapter </a>
</h3>
<div id="6941" class="ui-accordion-content ui-helper-reset ui-widget-content ui-corner-bottom ui-accordion-content-active" role="tabpanel" style="display: block;">
<table class="productTableList">
<tbody>
</table>
<table class="productTableList">
<tbody>
<tr style="height:90px;background-color:#ebf4ff;">
<td class="ProduktLista" width="70px">
<td class="ProduktLista" width="315">
<a onclick="getProductInformationModal("Asus","14G110008340");">
<br>
Answer 0 (score: 1)
When you run into this kind of problem and cannot immediately see what is wrong, break the complex expression apart. Instead of:
atag = literal_eval('(' + atag.nextSibling.replace(', this', '').split('(', 1)[1])
rewrite it as (of course, you should use more meaningful variable names):
nextSibling = atag.nextSibling
txt1 = nextSibling.replace(', this', '')
split = txt1.split('(', 1)
txt2 = split[1]
txt3 = '(' + txt2
atag = literal_eval(txt3)
The traceback will then point at the exact expression that fails, and print statements for the values involved should give you the answer.
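For the case in the question, a likely cause (an assumption, since the value of `atag.nextSibling` is not shown) is that the sibling is a whitespace text node with no `(` in it. `split('(', 1)` then returns a one-element list and `[1]` raises the `IndexError`. The call arguments actually live in the `onclick` attribute of the `a` tag, which BeautifulSoup exposes as `atag.a['onclick']`. Below is a minimal sketch of parsing such a string with the standard library only; `parse_onclick_args` is a hypothetical helper name, and the sample string is copied from the question.

```python
from ast import literal_eval


def parse_onclick_args(onclick):
    """Return the call arguments found in an onclick string as a tuple."""
    # Keep only the text between the first '(' and the last ')'.
    start = onclick.find('(')
    end = onclick.rfind(')')
    if start == -1 or end < start:
        # This is the situation that produced the IndexError in the question:
        # the string being split contains no '(' at all.
        raise ValueError('no function call found in %r' % (onclick,))
    inner = onclick[start + 1:end]
    # `this` is a bare JavaScript name, not a Python literal, so drop it.
    inner = inner.replace(', this', '')
    if not inner.strip():
        return ()
    # The trailing comma turns a single argument into a one-element tuple.
    return literal_eval('(' + inner + ',)')


# Sample value copied from the <a> tag in the question.
onclick = ("getProductsBasedOnCategoryID('Asus','AC Adapter','ET1611PUT',"
           "'6941', this, 'E Series')")
args = parse_onclick_args(onclick)
print(args)
```

With the soup from the question, the call would presumably be `parse_onclick_args(atag.a['onclick'])`, which avoids relying on `nextSibling` altogether.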