Question

My program works when I run it with any other symbol, but it fails when I run it with SKY*.

import urllib
import re

newsymbolslist = ["NFLX", "GOOG", "VNR", "AAPL", "SKY"]

i=0
while i<len(newsymbolslist):
    url = ("http://www.nasdaq.com/symbol/" +newsymbolslist[i]+ "/real-time")
    htmlfile = urllib.urlopen(url)
    htmltext = htmlfile.read()
    regex = '<span id="quotes_content_left_OverallStockRating1_lblPercentage" class="comm_bullrating">(.+?)</span>'
    pattern = re.compile(regex)
    price = re.findall(pattern,htmltext)

    print (newsymbolslist[i] + " is: " + price[0])

    i+=1

* SKY is the last symbol in the newsymbolslist list.

Answer 1

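Because nasdaq.com does not return a page for SKY that contains the <span id="quotes_content_left_OverallStockRating1_lblPercentage" class="comm_bullrating"> tag. re.findall therefore returns an empty list, and price[0] raises an IndexError.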
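
One way to keep the loop from crashing on such a symbol is to check the result before indexing it. This is only a sketch: `first_match` is a hypothetical helper name, and it works on an HTML string you have already downloaded rather than fetching anything itself.

```python
import re

# Same pattern as in the question, split across two lines for readability.
RATING_RE = re.compile(
    r'<span id="quotes_content_left_OverallStockRating1_lblPercentage" '
    r'class="comm_bullrating">(.+?)</span>'
)

def first_match(pattern, htmltext):
    """Return the first captured group, or None when nothing matches."""
    matches = pattern.findall(htmltext)
    return matches[0] if matches else None
```

In the loop you could then print something like "no rating found" whenever the helper returns None, instead of indexing price[0] directly.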

Answer 2

A more Pythonic way to loop over the list is:

import urllib
import re

newsymbolslist = ["NFLX", "GOOG", "VNR", "AAPL", "SKY"]

for symbol in newsymbolslist:
    url = ("http://www.nasdaq.com/symbol/" + symbol + "/real-time")
    htmlfile = urllib.urlopen(url)
    htmltext = htmlfile.read()
    regex = '<span id="quotes_content_left_OverallStockRating1_lblPercentage" class="comm_bullrating">(.+?)</span>'
    pattern = re.compile(regex)
    price = re.findall(pattern,htmltext)

    print (symbol + " is: " + price[0])

The problem with SKY is that the class is comm_50rating instead of comm_bullrating.
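
If the class can vary from symbol to symbol, one option is to match on the span id alone and accept any class value. A rough sketch, assuming the id stays the same across pages (I have only the two class names above to go on); `extract_rating` is a made-up name:

```python
import re

# Accept whatever class the span carries; only the id is fixed.
ANY_RATING_RE = re.compile(
    r'<span id="quotes_content_left_OverallStockRating1_lblPercentage" '
    r'class="[^"]*">(.+?)</span>'
)

def extract_rating(htmltext):
    """Return the span's text for any class value, or None if the span is missing."""
    match = ANY_RATING_RE.search(htmltext)
    return match.group(1) if match else None
```

This still returns None rather than raising if the span is missing altogether.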

Answer 3

The information the regex extracts for SKY is broken, as you can see by printing the whole result:

import urllib
import re

newsymbolslist = ["NFLX", "GOOG", "VNR", "AAPL", "SKY"]

i=0
while i<len(newsymbolslist):
    url = ("http://www.nasdaq.com/symbol/" +newsymbolslist[i]+ "/real-time")
    htmlfile = urllib.urlopen(url)
    htmltext = htmlfile.read()
    regex = '<span id="quotes_content_left_OverallStockRating1_lblPercentage" class="comm_bullrating">(.+?)</span>'
    pattern = re.compile(regex)
    price = re.findall(pattern,htmltext)

    print (newsymbolslist[i] + " is: " + str (price))

    i+=1

The index is out of range because the list is different for SKY: it is empty.
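
You can reproduce the error without hitting the site. The snippet below uses a made-up HTML fragment standing in for the SKY page and a shortened version of the pattern, so treat it as an illustration only:

```python
import re

# Shortened pattern; the real one also matches the span id.
pattern = re.compile(r'class="comm_bullrating">(.+?)</span>')

# Made-up fragment with a different class, as on the SKY page.
sky_html = '<span class="comm_50rating">50%</span>'

price = re.findall(pattern, sky_html)
print(price)  # []

try:
    price[0]
except IndexError as exc:
    error_message = str(exc)
    print(error_message)  # list index out of range
```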
