Get the text between <a class> MYTEXT </a> inside a div class

Time: 2019-06-28 21:25:40

Tags: python-2.7 web-scraping beautifulsoup

I have this source code:

<div class="col-xs-12 col-sm-6 col-md-6">
<a class="btn btn-md white badge-success mt-5" 
href="https://stockinvest.us/trade/WRN" id="trade500signalsTop">
WRN is a Buy Candidate
</a>

I want to print "WRN is a Buy Candidate".

I tried the following, but it doesn't work:

page2 = requests.get('https://stockinvest.us/technical-analysis/WRN')
soup2 = BeautifulSoup(page2.text, 'html.parser')
for link in soup2.find_all('a', id='trade500signalsTop'):
    link_text = link.text
    print link_text

1 answer:

Answer 0: (score: 1)

Send a browser-like User-Agent header when requesting the page:

import requests
from bs4 import BeautifulSoup

# Browser-like User-Agent sent with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'
}

page2 = requests.get('https://stockinvest.us/technical-analysis/WRN', headers=headers)
soup2 = BeautifulSoup(page2.text, 'html.parser')
for link in soup2.find_all('a', id='trade500signalsTop'):
    link_text = link.text
    print(link_text)

Output:

WRN is a Buy Candidate
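
The answer does not say why the header helps. A plausible reason (an assumption, not something confirmed here) is that the site responds differently to the default User-Agent that requests sends, which starts with "python-requests/". The sketch below sets the User-Agent once on a session and raises on an HTTP error status, so a blocked request fails loudly instead of producing an empty result. The helper names make_session and fetch_html are hypothetical.

```python
import requests

# Same User-Agent string as in the answer's code.
USER_AGENT = ('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36')


def make_session(user_agent=USER_AGENT):
    """Return a requests.Session that sends user_agent on every request."""
    session = requests.Session()
    session.headers.update({'User-Agent': user_agent})
    return session


def fetch_html(session, url, timeout=10):
    """Fetch url and return the body; raise requests.HTTPError on 4xx/5xx."""
    response = session.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


# Without an explicit header, requests identifies itself like this:
default_agent = requests.utils.default_user_agent()

# Example call (needs network access, so it is left commented out):
# html = fetch_html(make_session(), 'https://stockinvest.us/technical-analysis/WRN')
```

If fetch_html raises, or the returned HTML lacks the expected element, printing the response status and the first few hundred characters of the body is usually the quickest way to see what the server actually sent.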

To get only that one specific value, you can use find() instead of find_all():

print(soup2.find('a', id='trade500signalsTop').text)
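
One caveat with the find() one-liner: find() returns None when nothing matches, so calling .text on the result raises AttributeError. Here is a minimal offline sketch that guards against that, using the markup from the question (with a closing </div> added so the fragment is well-formed). The helper name link_text_by_id is hypothetical, and get_text(strip=True) is used to drop the surrounding newlines.

```python
from bs4 import BeautifulSoup

# Markup copied from the question; the closing </div> is added here.
html = '''
<div class="col-xs-12 col-sm-6 col-md-6">
<a class="btn btn-md white badge-success mt-5"
href="https://stockinvest.us/trade/WRN" id="trade500signalsTop">
WRN is a Buy Candidate
</a>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')


def link_text_by_id(soup, element_id):
    """Return the stripped text of the first <a> with this id, or None."""
    link = soup.find('a', id=element_id)
    if link is None:
        return None
    return link.get_text(strip=True)


print(link_text_by_id(soup, 'trade500signalsTop'))  # WRN is a Buy Candidate
print(link_text_by_id(soup, 'no-such-id'))          # None
```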