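Running my code produces the error shown below. The error does not occur immediately; it appears at random after 2 to 7 hours. Until then, streaming the online feed and writing it to the database works without any problems.
Error message: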
Traceback (most recent call last):
  File "C:\Python27\MySQL_finalversion\RSS_common_FV.py", line 78, in <module>
    main()
  File "C:\Python27\MySQL_finalversion\RSS_common_FV.py", line 63, in main
    feed_iii = feed_load_iii(feed_url_iii)
  File "C:\Python27\MySQL_finalversion\RSS_common_FV.py", line 44, in feed_load_iii
    in feedparser.parse(feed_iii).entries]
IndexError: list index out of range
Here is my code:
import feedparser
import MySQLdb
import time
from cookielib import CookieJar

db = MySQLdb.connect(host="localhost",  # your host, usually localhost
                     user="root",  # your username - SELECT * FROM mysql.user
                     passwd="****",  # your password
                     db="sentimentanalysis_unicode",  # name of the database
                     charset="utf8")
cur = db.cursor()
cur.execute("SET NAMES utf8")
cur.execute("SET CHARACTER SET utf8")
cur.execute("SET character_set_connection=utf8")
cur.execute("DROP TABLE IF EXISTS feeddata_iii")
sql_iii = """CREATE TABLE feeddata_iii(III_ID INT NOT NULL AUTO_INCREMENT, PRIMARY KEY(III_ID),III_UnixTimesstamp integer,III_Timestamp varchar(255),III_Source varchar(255),III_Title varchar(255),III_Text TEXT,III_Link varchar(255),III_Epic varchar(255),III_CommentNr integer,III_Author varchar(255))"""
cur.execute(sql_iii)


def feed_load_iii(feed_iii):
    return [(time.time(),
             entry.published,
             'iii',
             entry.title,
             entry.summary,
             entry.link,
             (entry.link.split('=cotn:')[1]).split('.L&id=')[0],
             (entry.link.split('.L&id=')[1]).split('&display=')[0],
             entry.author)
            for entry
            in feedparser.parse(feed_iii).entries]


def main():
    feed_url_iii = "http://www.iii.co.uk/site_wide_discussions/site_wide_rss2.epl"
    feed_iii = feed_load_iii(feed_url_iii)
    print feed_iii[1][1]
    for item in feed_iii:
        cur.execute("""INSERT INTO feeddata_iii(III_UnixTimesstamp, III_Timestamp, III_Source, III_Title, III_Text, III_Link, III_Epic, III_CommentNr, III_Author) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s)""", item)
    db.commit()


if __name__ == "__main__":
    while True:
        main()
        time.sleep(240)
If you need more information, please feel free to ask. I need your help!
Thanks and greetings from London!
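Answer 0 (score: 1)
Essentially, your program is not resilient enough to badly formatted data.
Your code makes very specific assumptions about the structure of the data and cannot cope when the data does not match them. You need to detect when the data is badly formatted and then take some other action.
A rather sloppy way to do this is to catch only the exception that is currently being raised, for example: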
try:
    feed_iii = feed_load_iii(feed_url_iii)
except IndexError:
    # do something to report or handle the data format problem
    pass
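A less sloppy alternative is to check each entry before indexing into it. The most likely source of the IndexError is one of the `split(...)[1]` calls, which fails whenever a link lacks the expected `=cotn:` or `.L&id=` marker. The sketch below is only an illustration under that assumption: `parse_iii_link` and `rows_from_entries` are hypothetical helper names, and the code relies on feed entries supporting dict-style `.get()`, as feedparser entries do. Missing fields such as `author` become `None` instead of raising, and malformed entries are skipped and returned for reporting.

```python
import time


def parse_iii_link(link):
    """Return (epic, comment_nr) extracted from a link, or None if the
    '=cotn:' or '.L&id=' marker is missing (hypothetical helper)."""
    _, found, rest = link.partition('=cotn:')
    if not found:
        return None
    epic, found, rest = rest.partition('.L&id=')
    if not found:
        return None
    # A missing '&display=' is tolerated, as in the original code.
    comment_nr = rest.partition('&display=')[0]
    return epic, comment_nr


def rows_from_entries(entries):
    """Build database rows from feed entries, skipping malformed ones.

    Returns (rows, skipped_links) so the caller can log what was dropped.
    """
    rows, skipped = [], []
    for entry in entries:
        link = entry.get('link', '')
        parsed = parse_iii_link(link)
        if parsed is None:
            skipped.append(link)  # report instead of crashing
            continue
        epic, comment_nr = parsed
        rows.append((time.time(),
                     entry.get('published'),
                     'iii',
                     entry.get('title'),
                     entry.get('summary'),
                     link,
                     epic,
                     comment_nr,
                     entry.get('author')))
    return rows, skipped
```

In the question's code, `feed_load_iii` could then call `rows_from_entries(feedparser.parse(feed_iii).entries)` and log the skipped links. Note that `print feed_iii[1][1]` in `main()` would also raise IndexError whenever fewer than two rows come back, so it needs a length check as well.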