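I want to scrape the name, website, and description of companies listed on Google Finance. So far I can get the description and the website, but not the name. In the page source of myUrl the name is 024 Pharma Inc. When I inspect the div, its class is 'appbar-snippet-primary', yet the code still does not find it. I am new to web scraping, so I may be missing something. Please guide me on this.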
from bs4 import BeautifulSoup
import urllib
import csv
myUrl = 'https://www.google.com/finance?q=OTCMKTS%3AEEIG'
r = urllib.urlopen(myUrl).read()
soup = BeautifulSoup(r, 'html.parser')
name_box = soup.find('div', class_='appbar-snippet-primary') # !! This div is not found
#name = name_box.text
#print name
description = soup.find('div', class_='companySummary')
desc = description.text.strip()
#print desc
website = soup.find('div', class_='item')
site = website.text
#print site
Answer 0 (score: 0)
from bs4 import BeautifulSoup
import requests
myUrl = 'https://www.google.com/finance?q=OTCMKTS%3AEEIG'
r = requests.get(myUrl).content
soup = BeautifulSoup(r, 'html.parser')
name = soup.find('title').text.split(':')[0] # Take the name from the <title> text (the part before the first ':') instead of the div
#print name
description = soup.find('div', class_='companySummary')
desc = description.text.strip()
#print desc
website = soup.find('div', class_='item')
site = website.text
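The split on ':' above only works if the page has a title and the name comes before the first colon. As a minimal sketch of that step with a guard for a missing title, assuming a title of the form "Name: rest" (the helper name and the sample title string are hypothetical, not taken from the real page):

```python
def company_name_from_title(title_text):
    """Return the text before the first ':' in a page title, or None."""
    # An absent or empty title gives None instead of raising an error.
    if not title_text:
        return None
    name = title_text.split(':')[0].strip()
    return name or None


# Hypothetical title string, used only to illustrate the split.
sample_title = '024 Pharma Inc.: OTCMKTS:EEIG quotes & news - Google Finance'
name = company_name_from_title(sample_title)
```

With the answer's code, the title text could be passed in as soup.title.string, which is None when the page has no title tag.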
Answer 1 (score: -1)
Use soup.find_all() instead of soup.find().
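For reference, a small sketch of how the two calls differ, run on a hypothetical static HTML string rather than the real page. find() returns the first match or None, while find_all() returns a list of every match, which is empty when nothing matches. So if the div is not in the fetched HTML at all, switching to find_all() would presumably not recover it.

```python
from bs4 import BeautifulSoup

# Hypothetical static HTML standing in for a fetched page.
html = '<div class="item">a</div><div class="item">b</div>'
soup = BeautifulSoup(html, 'html.parser')

# find() returns the first matching tag, or None if there is no match.
first = soup.find('div', class_='item')
missing_one = soup.find('div', class_='appbar-snippet-primary')

# find_all() returns a list-like ResultSet of all matches (possibly empty).
all_items = soup.find_all('div', class_='item')
missing_all = soup.find_all('div', class_='appbar-snippet-primary')

item_texts = [tag.text for tag in all_items]
```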