List not being created by a loop

Time: 2020-01-19 16:35:08

Tags: python scrapy

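I'm trying to build a list of URLs with a loop and then scrape one data point from each URL, but it seems to do this only for the last item in the list (MMM) rather than for all of the URLs. What am I doing wrong? Thanks!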

from simplified_scrapy.request import req
from simplified_scrapy.simplified_doc import SimplifiedDoc

tickers = ['AAPL','T','MMM']

for i in tickers:
    quote_page = ['https://ycharts.com/companies/'+i+'/dividend_yield']

data = []
for pg in quote_page:
  page = req.get(pg)
  doc = SimplifiedDoc(page)
  divyield = doc.select('.box boxRatio').getElementByText('Average').next.text
  data.append((divyield)[:-1])
print (data)

1 answer:

Answer 0 (score: 1)

When you run

for i in tickers:
    quote_page = ['https://ycharts.com/companies/'+i+'/dividend_yield']

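you loop over tickers, and on each iteration you assign a new value (a list containing one element) to quote_page instead of appending a new value to the list quote_page. When the loop finishes, quote_page therefore holds only the URL for the last ticker (MMM), so the second loop runs just once.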
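The overwrite is easy to see by printing inside the loop. A minimal sketch that only builds the URL strings and makes no network requests; the print calls are added here purely for illustration:

```python
tickers = ['AAPL', 'T', 'MMM']

for i in tickers:
    # Each pass rebinds quote_page to a brand-new one-element list,
    # discarding the list built on the previous pass.
    quote_page = ['https://ycharts.com/companies/' + i + '/dividend_yield']
    print(i, len(quote_page))

# After the loop, only the list from the last pass survives.
print(quote_page)
```

Every pass reports a length of 1, and the final list contains only the MMM URL.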

You can do this instead:

quote_page = []
for i in tickers:
    quote_page.append('https://ycharts.com/companies/'+i+'/dividend_yield')

Or you can use the shorter variant that @DarrylG suggested in the comments:

quote_page = ['https://ycharts.com/companies/'+i+'/dividend_yield' for i in tickers]
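To check that the two fixes are equivalent, here is a small sketch that builds the list both ways and compares the results. The names via_append and via_comprehension are made up for this comparison, and nothing is fetched:

```python
tickers = ['AAPL', 'T', 'MMM']

# Fix 1: start from an empty list and append one URL per ticker.
via_append = []
for i in tickers:
    via_append.append('https://ycharts.com/companies/' + i + '/dividend_yield')

# Fix 2: build the same list in one expression with a list comprehension.
via_comprehension = ['https://ycharts.com/companies/' + i + '/dividend_yield' for i in tickers]

print(via_append == via_comprehension)
print(len(via_append))
```

Either list can then be passed to the second loop from the question (for pg in quote_page: ...) without further changes.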