Web scraping problem, r is not defined? How do I fix it?

Time: 2019-12-05 12:55:50

Tags: python web-scraping jupyter-notebook

I'm trying to build an app that scrapes the prices of my ten favorite space-related stocks, but I'm running into some trouble with my code (I'm new to this). Once it works, I'd like to write the data to a CSV file and make a bar chart from it, so I'd appreciate any help and advice. I'm doing this in Anaconda:

#import libraries 
import bs4
from bs4 import BeautifulSoup 
#grequests is a unique library that allows you to use many urls with ease
#must install grequests in Anaconda; use: conda install -c conda-forge grequests
#if you know a better way to do this, please let me know
import grequests

#scraping my top ten favorite space companies, attempted to pick companies with pure play interest in space

urls = ['https://finance.yahoo.com/quote/GILT/',
        'https://finance.yahoo.com/quote/LORL?p=LORL&.tsrc=fin-srch',
        'https://finance.yahoo.com/quote/I?p=I&.tsrc=fin-srch',
        'https://finance.yahoo.com/quote/VSAT?p=VSAT&.tsrc=fin-srch',
        'https://finance.yahoo.com/quote/RTN?p=RTN&.tsrc=fin-srch',
        'https://finance.yahoo.com/quote/UTX?ltr=1',
        'https://finance.yahoo.com/quote/TDY?ltr=1',
        'https://finance.yahoo.com/quote/ORBC?ltr=1',
        'https://finance.yahoo.com/quote/SPCE?p=SPCE&.tsrc=fin-srch',
        'https://finance.yahoo.com/quote/BA?p=BA&.tsrc=fin-srch']
unsent_request = (grequests.get(url) for url in urls)

results = grequests.map(unsent_request)

The next place I hit an error is here:

def  parsePrice():
    soup = bs4.BeautifulSoup(r.text,"html")
    price=soup.find_all('div',{'class':'Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)" data-reactid="52">4.1500'})[0].find('span').text
    return price

while True:
    print('current stock price: '+str(parsePrice()))

Then I get this error in Anaconda:

NameError                                 Traceback (most recent call last)
<ipython-input-8-65e4abca95ee> in <module>
      1 while True:
----> 2     print('current stock price: '+str(parsePrice()))

<ipython-input-7-67b5742dffee> in parsePrice()
      1 def  parsePrice():
----> 2     soup = bs4.BeautifulSoup(r.text,"html")
      3     price=soup.find_all('div',{'class':'Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)" data-reactid="52">4.1500'})[0].find('span').text
      4     return price

NameError: name 'r' is not defined

Could you also tell me whether this is the right way to get my data into a CSV with the columns I need:

#add to csv file 
df_indu = pd.DataFrame(
    L['Top Ten Space Stocks'],
    columns=['stock name', 'stock price', 'date of listing'])
df_indu.to_csv('spacestocks.csv', index=False, sep='|')

I'm more worried about the error right now, but help with either would be great. Thanks!

3 Answers:

Answer 0 (score: 2)

You can try this:


def parsePrice(r):
    soup = bs4.BeautifulSoup(r.text, "html")
    price = soup.find_all('div', {'class': 'Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)" data-reactid="52">4.1500'})[0].find('span').text
    return price

for r in results:
    parsePrice(r)
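
Note that even with r passed in, the find_all call above is unlikely to match anything: the class string was copied together with a data-reactid attribute and the displayed price, so it is not a valid class value. Purely as a sketch, assuming the price sits in a span whose class attribute is exactly Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib) (Yahoo Finance's markup around the time of the question; it changes often, so check the current markup in your browser's inspector), the lookup could look like this:

import bs4
import grequests

# shortened URL list just for this example
urls = ['https://finance.yahoo.com/quote/GILT/',
        'https://finance.yahoo.com/quote/BA?p=BA&.tsrc=fin-srch']

results = grequests.map(grequests.get(u) for u in urls)

def parse_price(r):
    """Return the quoted price text from one Yahoo Finance response, or None if not found."""
    soup = bs4.BeautifulSoup(r.text, "html.parser")
    # exact-string match on the class attribute; adjust if Yahoo's markup has changed
    span = soup.find("span", class_="Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)")
    return span.text if span else None

for url, r in zip(urls, results):
    if r is None:                      # grequests.map() returns None for failed requests
        print(url + ": request failed")
    else:
        print(url + ": " + str(parse_price(r)))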

Answer 1 (score: 1)

You simply never defined the 'r' variable, or rather never passed it into the function.

#import libraries 
import bs4
from bs4 import BeautifulSoup 
#grequests is a unique library that allows you to use many urls with ease
#must install grequests in Anaconda; use: conda install -c conda-forge grequests
#if you know a better way to do this, please let me know
import grequests

#scraping my top ten favorite space companies, attempted to pick companies with pure play interest in space

urls = ['https://finance.yahoo.com/quote/GILT/',
        'https://finance.yahoo.com/quote/LORL?p=LORL&.tsrc=fin-srch',
        'https://finance.yahoo.com/quote/I?p=I&.tsrc=fin-srch',
        'https://finance.yahoo.com/quote/VSAT?p=VSAT&.tsrc=fin-srch',
        'https://finance.yahoo.com/quote/RTN?p=RTN&.tsrc=fin-srch',
        'https://finance.yahoo.com/quote/UTX?ltr=1',
        'https://finance.yahoo.com/quote/TDY?ltr=1',
        'https://finance.yahoo.com/quote/ORBC?ltr=1',
        'https://finance.yahoo.com/quote/SPCE?p=SPCE&.tsrc=fin-srch',
        'https://finance.yahoo.com/quote/BA?p=BA&.tsrc=fin-srch']
unsent_request = (grequests.get(url) for url in urls)

results = grequests.map(unsent_request)

def parsePrice(r):
    soup = bs4.BeautifulSoup(r.text, "html")
    price = soup.find_all('div', {'class': 'Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)" data-reactid="52">4.1500'})[0].find('span').text
    return price

for r in results:
    print(parsePrice(r))
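
Neither answer addresses the CSV part of the question. The snippet there references an undefined L and never imports pandas, so as written it will raise another NameError. A minimal sketch, assuming the scraped values are collected into a list of (name, price, date) tuples while looping over results (the rows below are placeholders):

import pandas as pd

# placeholder rows; in practice, append one tuple per stock inside the results loop
rows = [
    ('GILT', '4.15', '2019-12-05'),
]

df_indu = pd.DataFrame(rows, columns=['stock name', 'stock price', 'date of listing'])
df_indu.to_csv('spacestocks.csv', index=False, sep='|')

Note that sep='|' writes a pipe-delimited file; leave it out for a standard comma-separated CSV. For the bar chart, something like df_indu.plot.bar(x='stock name', y='stock price') should work once the price column has been converted to a numeric type (for example with pd.to_numeric).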

Answer 2 (score: -1)

I haven't really used BeautifulSoup, but it looks like you need to read something in. Your code calls it r, and it is never actually declared anywhere.
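
To clarify that last point: r here is not a file but one HTTP response from grequests.map(), so it has to come from iterating over results, roughly like this (a sketch; map() can also yield None for a request that failed):

for r in results:              # results = grequests.map(...) from the question's code
    if r is not None:          # a failed request shows up as None
        print(parsePrice(r))   # parsePrice must take r as a parameter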