Unable to scrape the company name from Google Finance

Time: 2017-06-02 08:31:10

Tags: python web-scraping

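I want to scrape the name, website, and description of companies listed on Google Finance. So far I can get the description and the website, but not the name. In the page source of myUrl, the name is 024 Pharma Inc. When I inspect the div, its class is 'appbar-snippet-primary', but the code still does not find it. I am new to web scraping, so I may be missing something. Please guide me on this.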

from bs4 import BeautifulSoup
import urllib
import csv

myUrl = 'https://www.google.com/finance?q=OTCMKTS%3AEEIG' 

r = urllib.urlopen(myUrl).read()
soup = BeautifulSoup(r, 'html.parser')

name_box = soup.find('div', class_='appbar-snippet-primary')  # !! This div is not found
#name = name_box.text  
#print name

description = soup.find('div', class_='companySummary') 
desc = description.text.strip()  
#print desc

website = soup.find('div', class_='item')  
site = website.text  
#print site 

2 Answers:

Answer 0 (score: 0)

from bs4 import BeautifulSoup
import requests

myUrl = 'https://www.google.com/finance?q=OTCMKTS%3AEEIG' 

r = requests.get(myUrl).content
soup = BeautifulSoup(r, 'html.parser')

name = soup.find('title').text.split(':')[0]  # take the name from the page <title>, up to the first ':'
# print(name)

description = soup.find('div', class_='companySummary') 
desc = description.text.strip()  
#print desc

website = soup.find('div', class_='item')  
site = website.text
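
The title-based approach above can be sketched on its own, without a network request. This is only an illustration: the helper name name_from_title and the sample title string are assumptions made up for the example, not something taken from the real page.

```python
def name_from_title(title_text):
    """Return the text before the first ':' in a page title, or None if there is none."""
    if not title_text:
        return None
    # partition() does not raise when the title contains no ':';
    # in that case the whole title is returned as the name.
    name, _, _ = title_text.partition(':')
    name = name.strip()
    return name or None


# Hypothetical title string, assumed for illustration only.
sample_title = '024 Pharma Inc: OTCMKTS:EEIG quotes & news - Google Finance'
company_name = name_from_title(sample_title)
print(company_name)
```

In the answer's code this would be called as name_from_title(soup.find('title').text), provided the page actually has a title element.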

Answer 1 (score: -1)

Use soup.find_all() instead of soup.find().
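
As a rough illustration of how the two calls differ, here is a small sketch on made-up markup (the HTML string is an assumption, not the real Google Finance page). find() returns the first matching tag or None, while find_all() returns a list-like collection that is empty when nothing matches, so switching calls would not by itself locate a div that is absent from the downloaded HTML.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed for illustration only.
html = '<div class="item">first</div><div class="item">second</div>'
soup = BeautifulSoup(html, 'html.parser')

first_item = soup.find('div', class_='item')      # a single Tag, or None
all_items = soup.find_all('div', class_='item')   # every match, possibly empty

# This class does not occur in the sample markup above.
missing_one = soup.find('div', class_='appbar-snippet-primary')
missing_all = soup.find_all('div', class_='appbar-snippet-primary')

item_texts = [tag.text for tag in all_items]
print(first_item.text, item_texts, missing_one, len(missing_all))
```

Checking the result for None (or for an empty collection) before reading .text avoids an AttributeError when the element is not in the response.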