How can I extract the data inside the parentheses of an onclick attribute value?

Time: 2018-12-31 14:08:35

Tags: python beautifulsoup

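Is it possible to extract data from onclick attribute values such as analysis(1644983), AsianOdds(1644983), and EuropeOdds(1644983)? I only want to print the number once, because all the numbers in this HTML are the same.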

HTML

<td style="word-spacing:-3px" align="left"> <a href="javascript:" onclick="analysis(1644983)">析</a><a href="javascript:" onclick="AsianOdds(1644983)" style="margin-left:3px;">亚</a> <a href="javascript:" onclick="EuropeOdds(1644983)" style="margin-left:3px;">欧</a></td>

Python code

from bs4 import BeautifulSoup

soup=BeautifulSoup("""<td style="word-spacing:-3px" align="left"> <a href="javascript:" onclick="analysis(1644983)">析</a><a href="javascript:" onclick="AsianOdds(1644983)" style="margin-left:3px;">亚</a> <a href="javascript:" onclick="EuropeOdds(1644983)" style="margin-left:3px;">欧</a></td>""",'html.parser')

lines=soup.find_all('onclick')
for line in lines:
    print(line['analysis'])

Expected output

1644983

1 answer:

Answer 0 (score: 4)

I tried to explain everything in the comments:

from bs4 import BeautifulSoup

html = '''<td style="word-spacing:-3px" align="left">
    <a href="javascript:" onclick="analysis(1644983)">析</a>
    <a href="javascript:" onclick="AsianOdds(1644983)" style="margin-left:3px;">亚</a>
    <a href="javascript:" onclick="EuropeOdds(1644983)" style="margin-left:3px;">欧</a>
    </td>'''

soup = BeautifulSoup(html, 'html.parser')

# Find all <a> elements
elements = soup.find_all('a')

# Loop over all found elements
for element in elements:
    # Disregard element if it doesn't contain onclick attribute
    if 'onclick' not in element.attrs:
        continue
    # Get attribute value
    value = element['onclick']
    # Skip elements whose onclick is not an analysis(...) call
    if not value.startswith('analysis('):
        continue
    # Find the index just after the opening parenthesis
    position = value.index('(') + 1
    # Slice the string so only the content inside the parentheses is left
    value = value[position:-1]
    # Print result
    print(value)
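
As a possible variation on the loop above, here is a minimal sketch that uses a regular expression instead of string slicing and prints each distinct number only once, whichever function the onclick calls. The names CALL_PATTERN and extract_ids are hypothetical, and the pattern assumes every onclick value has the form name(digits), as in the HTML from the question.

```python
import re
from bs4 import BeautifulSoup

html = '''<td style="word-spacing:-3px" align="left">
    <a href="javascript:" onclick="analysis(1644983)">析</a>
    <a href="javascript:" onclick="AsianOdds(1644983)" style="margin-left:3px;">亚</a>
    <a href="javascript:" onclick="EuropeOdds(1644983)" style="margin-left:3px;">欧</a>
    </td>'''

# Assumed format: a function name followed by digits in parentheses
CALL_PATTERN = re.compile(r'^\w+\((\d+)\)$')


def extract_ids(markup):
    """Return the distinct numeric onclick arguments, in document order."""
    soup = BeautifulSoup(markup, 'html.parser')
    ids = []
    # onclick=True keeps only <a> elements that have an onclick attribute
    for element in soup.find_all('a', onclick=True):
        match = CALL_PATTERN.match(element['onclick'].strip())
        # Skip values that do not fit the assumed format, and duplicates
        if match and match.group(1) not in ids:
            ids.append(match.group(1))
    return ids


print(extract_ids(html))
```

If the onclick values can carry other arguments (strings, several parameters), the pattern would need to be adjusted; the slicing approach above is more forgiving in that case.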