I'm trying to run this script to fetch RSS feeds in the Thonny environment, but I keep getting the error "IndexError: list index out of range":
Traceback (most recent call last):
  File "C:\Users\uri\rssfeedfour.py", line 11, in <module>
    url = sys.argv[1]
IndexError: list index out of range
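How can I fix this so the error stops appearing over and over? I'm a beginner and don't know how to solve it. Do I need to define it, and if so, how? Or can I take it out and go in a different direction? Here is the code.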
import feedparser
import time
from subprocess import check_output
import sys

#feed_name = 'TRIBUNE'
#url = 'http://chicagotribune.feedsportal.com/c/34253/f/622872/index.rss'
feed_name = sys.argv[1]
url = sys.argv[2]

# NOTE: db is passed to open() below, so it has to be a local file path;
# open() cannot read from a URL like this one.
db = 'http://feeds.feedburner.com/TheHackersNews'
limit = 12 * 3600 * 1000  # 12 hours, in milliseconds

current_time_millis = lambda: int(round(time.time() * 1000))
current_timestamp = current_time_millis()

def post_is_in_db(title):
    with open(db, 'r') as database:
        for line in database:
            if title in line:
                return True
    return False

def post_is_in_db_with_old_timestamp(title):
    with open(db, 'r') as database:
        for line in database:
            if title in line:
                ts_as_string = line.split('|', 1)[1]
                ts = int(ts_as_string)  # long() exists only in Python 2
                if current_timestamp - ts > limit:
                    return True
    return False

#
# get the feed data from the url
#
feed = feedparser.parse(url)

#
# figure out which posts to print
#
posts_to_print = []
posts_to_skip = []

for post in feed.entries:
    # if post is already in the database, skip it
    # TODO check the time
    title = post.title
    if post_is_in_db_with_old_timestamp(title):
        posts_to_skip.append(title)
    else:
        posts_to_print.append(title)

#
# add all the posts we're going to print to the database with the current timestamp
# (but only if they're not already in there)
#
f = open(db, 'a')
for title in posts_to_print:
    if not post_is_in_db(title):
        f.write(title + "|" + str(current_timestamp) + "\n")
f.close()  # the parentheses are required; a bare f.close does not close the file

#
# output all of the new posts
#
count = 1
blockcount = 1

for title in posts_to_print:
    if count % 5 == 1:
        print("\n" + time.strftime("%a, %b %d %I:%M %p") + ' ((( ' + feed_name + ' - ' + str(blockcount) + ' )))')
        print("-----------------------------------------\n")
        blockcount += 1
    print(title + "\n")
    count += 1

Answer 0 (score: 0)
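sys.argv is a list in Python that contains the command-line arguments passed to the script. sys.argv[0] contains the name of the script, sys.argv[1] contains the first argument, and so on. If the script is started without arguments, the list holds only sys.argv[0], so reading sys.argv[1] raises IndexError.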
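To see this for yourself, here is a minimal sketch that prints what sys.argv holds for a given run. The helper name describe_args is made up for illustration and is not part of the original script.

```python
import sys

def describe_args(argv):
    """Return one readable line per item, e.g. 'argv[0] = script.py'."""
    return ["argv[{}] = {}".format(i, arg) for i, arg in enumerate(argv)]

# len(sys.argv) is 1 when no arguments were passed: only the script name is present.
print("items in sys.argv:", len(sys.argv))
for line in describe_args(sys.argv):
    print(line)
```

Running it with no arguments should list only argv[0]; running it with two arguments should list argv[0] through argv[2].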
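To prevent this error, you need to supply the command-line arguments when you start the script. For example, this command starts the script without any error:
python rssfeedfour.py TRIBUNE http://chicagotribune.feedsportal.com/c/34253/f/622872/index.rss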
You can also modify the script so that it falls back to default values when no command-line arguments are provided.
try:
    feed_name = sys.argv[1]
except IndexError:
    feed_name = 'TRIBUNE'

try:
    url = sys.argv[2]
except IndexError:
    url = 'http://chicagotribune.feedsportal.com/c/34253/f/622872/index.rss'
You can read more about handling errors here.
That said, using the argparse library is more convenient.
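As a rough sketch of the argparse approach, both arguments can be declared as optional positionals with defaults. The function name parse_args and the constants DEFAULT_NAME and DEFAULT_URL are assumptions for illustration; the default values are the ones commented out in the question's script.

```python
import argparse

# Defaults taken from the commented-out lines in the question's script.
DEFAULT_NAME = 'TRIBUNE'
DEFAULT_URL = 'http://chicagotribune.feedsportal.com/c/34253/f/622872/index.rss'

def parse_args(argv=None):
    """Parse feed_name and url; argv=None makes argparse read sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Print new posts from an RSS feed.")
    # nargs="?" makes a positional argument optional, so the default is used when it is missing.
    parser.add_argument("feed_name", nargs="?", default=DEFAULT_NAME,
                        help="label shown in the output header")
    parser.add_argument("url", nargs="?", default=DEFAULT_URL,
                        help="address of the RSS feed")
    return parser.parse_args(argv)

# Explicit lists are used here so the example behaves the same however it is launched.
# In the real script, call parse_args() with no argument instead.
no_args = parse_args([])
print(no_args.feed_name, no_args.url)

both_args = parse_args(["HackerNews", "http://feeds.feedburner.com/TheHackersNews"])
print(both_args.feed_name, both_args.url)
```

In the script itself you would then write args = parse_args() and use args.feed_name and args.url in place of sys.argv[1] and sys.argv[2]. As a bonus, argparse generates a --help message from the declarations.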