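I'm trying to build a list of URLs in a loop and then scrape one data point from each URL, but it only seems to do this for the last item in the list (MMM) rather than for all of the URLs. What am I doing wrong? Thanks!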
from simplified_scrapy.request import req
from simplified_scrapy.simplified_doc import SimplifiedDoc
tickers = ['AAPL','T','MMM']
for i in tickers:
    quote_page = ['https://ycharts.com/companies/'+i+'/dividend_yield']
data = []
for pg in quote_page:
    page = req.get(pg)
    doc = SimplifiedDoc(page)
    divyield = doc.select('.box boxRatio').getElementByText('Average').next.text
    data.append((divyield)[:-1])
print (data)
Answer 0 (score: 1)
When you run
for i in tickers:
    quote_page = ['https://ycharts.com/companies/'+i+'/dividend_yield']
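you run a loop that, on each iteration, assigns a new value (a list containing a single element) to quote_page instead of appending a new value to the list quote_page. When the loop finishes, quote_page holds only the URL for the last ticker (MMM), which is why only that one gets scraped.
You can do this instead: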
quote_page = []
for i in tickers:
    quote_page.append('https://ycharts.com/companies/'+i+'/dividend_yield')
Or you can use the shorter variant that @DarrylG suggested in the comments:
quote_page = ['https://ycharts.com/companies/'+i+'/dividend_yield' for i in tickers]
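To see the difference without making any network requests, here is a minimal sketch that compares the two patterns side by side. The helper name build_urls and the variable overwritten are made up for this illustration and are not part of the question's code or of simplified_scrapy.

```python
tickers = ['AAPL', 'T', 'MMM']


def build_urls(tickers):
    """Return one dividend-yield URL per ticker (hypothetical helper)."""
    return ['https://ycharts.com/companies/' + t + '/dividend_yield'
            for t in tickers]


# Pattern from the question: every iteration rebinds the name to a new
# one-element list, so only the last ticker's URL is left afterwards.
for i in tickers:
    overwritten = ['https://ycharts.com/companies/' + i + '/dividend_yield']

# Corrected pattern: the list accumulates one URL per ticker.
quote_page = build_urls(tickers)

print(len(overwritten), overwritten[0])
print(len(quote_page))
for url in quote_page:
    print(url)
```

With quote_page built this way, the scraping loop from the question (for pg in quote_page: ...) should visit all three pages rather than only the MMM one.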