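"IndexError: list index out of range" while feeding a MySQL DB

Time: 2014-07-08 09:31:50

Tags: python mysql feed

I get the error below when running my code. It does not occur immediately; it appears at random after 2 to 7 hours. Until then, the script streams the online feed and writes it to the database without any problem.

Error message: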

Traceback (most recent call last):
  File "C:\Python27\MySQL_finalversion\RSS_common_FV.py", line 78, in <module>
    main()
  File "C:\Python27\MySQL_finalversion\RSS_common_FV.py", line 63, in main
    feed_iii = feed_load_iii(feed_url_iii)
  File "C:\Python27\MySQL_finalversion\RSS_common_FV.py", line 44, in feed_load_iii
    in feedparser.parse(feed_iii).entries]
IndexError: list index out of range

Here is my code:

import feedparser
import MySQLdb
import time
from cookielib import CookieJar

db = MySQLdb.connect(host="localhost",  # your host, usually localhost
                     user="root",  # your username - SELECT * FROM mysql.user
                     passwd="****",  # your password
                     db="sentimentanalysis_unicode",  # name of the database
                     charset="utf8")

cur = db.cursor()
cur.execute("SET NAMES utf8")
cur.execute("SET CHARACTER SET utf8")
cur.execute("SET character_set_connection=utf8")
cur.execute("DROP TABLE IF EXISTS feeddata_iii")

sql_iii = """CREATE TABLE feeddata_iii(III_ID INT NOT NULL AUTO_INCREMENT, PRIMARY KEY(III_ID),III_UnixTimesstamp integer,III_Timestamp varchar(255),III_Source varchar(255),III_Title varchar(255),III_Text TEXT,III_Link varchar(255),III_Epic varchar(255),III_CommentNr integer,III_Author varchar(255))"""

cur.execute(sql_iii)

def feed_load_iii(feed_iii):
    return [(time.time(),
             entry.published,
             'iii',
             entry.title,
             entry.summary,
             entry.link,
             (entry.link.split('=cotn:')[1]).split('.L&id=')[0],
             (entry.link.split('.L&id=')[1]).split('&display=')[0],
             entry.author)
            for entry
            in feedparser.parse(feed_iii).entries]

def main():
    feed_url_iii = "http://www.iii.co.uk/site_wide_discussions/site_wide_rss2.epl"

    feed_iii = feed_load_iii(feed_url_iii)

    print feed_iii[1][1]

    for item in feed_iii:
        cur.execute("""INSERT INTO feeddata_iii(III_UnixTimesstamp, III_Timestamp, III_Source, III_Title, III_Text, III_Link, III_Epic, III_CommentNr, III_Author) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s)""",item)
    db.commit()

if __name__ == "__main__":
    while True:
        main()
        time.sleep(240)

If you need more information, feel free to ask. I need your help!

Thanks and regards from London!

1 answer:

Answer 0 (score: 1)

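Essentially, your program is not resilient enough to poorly formatted data.

Your code makes very explicit assumptions about the structure of the data and cannot cope when the data is not structured that way. You need to detect the cases where the data is incorrectly formatted and take some other action when that happens.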
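To make the failure mode concrete: `str.split` returns a one-element list when the separator is absent, so indexing `[1]` raises `IndexError` for any entry whose link lacks `=cotn:` or `.L&id=`. Below is a minimal sketch of detecting that case explicitly. It assumes the link layout implied by the code in the question; `parse_link` is a hypothetical helper name and the example URLs are made up for illustration.

```python
def parse_link(link):
    """Return (epic, comment_nr) from a link, or None if the link is malformed.

    Assumed layout (inferred from the question's code, not verified):
    ...=cotn:<EPIC>.L&id=<COMMENT_NR>&display=...
    """
    # partition() never raises; an empty separator means the marker was absent
    _head, sep, rest = link.partition('=cotn:')
    if not sep:
        return None  # '=cotn:' marker missing
    epic, sep, rest = rest.partition('.L&id=')
    if not sep:
        return None  # '.L&id=' marker missing
    comment_nr = rest.split('&display=')[0]
    return epic, comment_nr


# Hypothetical example links
good = "http://example.com/view?code=cotn:ABC.L&id=123&display=discussion"
bad = "http://example.com/no-markers"
well_formed = parse_link(good)   # ('ABC', '123')
malformed = parse_link(bad)      # None
```

A caller can then skip or log any entry for which `parse_link` returns `None` instead of letting the whole run crash.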

A fairly sloppy way to do this is simply to trap the exception that is currently being raised, which you could do with (for example)

try:
    feed_iii = feed_load_iii(feed_url_iii)
except IndexError:
    # do something to report or handle the data format problem
    pass  # placeholder so the block is valid Python
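
Note that catching the exception around the whole call throws away every entry from that poll, including the well-formed ones. A slightly less sloppy variant is to handle the exception per entry, so only the malformed entry is skipped. The sketch below is not the code from the question: `build_row` and `load_rows` are hypothetical names, and entries are plain dicts so the example runs without a network connection. As far as I know feedparser entries also support dict-style access, but treat that as an assumption and check it against your feedparser version.

```python
import logging
import time

log = logging.getLogger(__name__)


def build_row(entry):
    """Build one row tuple; raises IndexError or KeyError on a malformed entry."""
    link = entry['link']
    epic = link.split('=cotn:')[1].split('.L&id=')[0]
    comment_nr = link.split('.L&id=')[1].split('&display=')[0]
    return (time.time(), entry['published'], 'iii', entry['title'],
            entry['summary'], link, epic, comment_nr, entry['author'])


def load_rows(entries):
    """Convert entries to rows, logging and skipping any that are malformed."""
    rows = []
    for entry in entries:
        try:
            rows.append(build_row(entry))
        except (IndexError, KeyError) as exc:
            log.warning("skipping malformed entry %r: %r", entry.get('link'), exc)
    return rows


# Hypothetical sample data: one well-formed entry, one with an unexpected link
sample_entries = [
    {'link': 'http://example.com/view?code=cotn:ABC.L&id=123&display=discussion',
     'published': 'Tue, 08 Jul 2014 09:31:50 GMT', 'title': 't',
     'summary': 's', 'author': 'a'},
    {'link': 'http://example.com/no-markers',
     'published': 'Tue, 08 Jul 2014 09:35:00 GMT', 'title': 't2',
     'summary': 's2', 'author': 'b'},
]
sample_rows = load_rows(sample_entries)
```

With this shape, `main()` would insert whatever `load_rows` returns, and the log shows which links did not match the expected format, which should help pin down what the feed occasionally sends.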