Parsing a text file with Python: list index out of range

Time: 2015-04-21 12:38:37

Tags: python-2.7 readfile

Hello, I have written a Python program to parse specific data out of a txt file. My code is:

f = open('C:/Users/aikaterini/Desktop/Ericsson_PARSER/BSC_alarms_vf_OSS.txt','r')
from datetime import datetime
import MySQLdb



def firstl():

    with f as lines:
        lines = lines.readlines()
        print len(lines)
        for i,line in enumerate(lines):
            if line.startswith("*** Disconnected from"):
                conline = line.split()
                bsc = conline[-2]
                print "\n"*5
                print bsc
                print "*"*70
                break


        for i,line in enumerate(lines):
            if line.startswith("*** Connected to"):
                conline = line.split()
                bsc = conline[-2]
                print "\n"*5
                print bsc
                print "*"*70

            elif line[:3] == "A1/" or line[:3] == "A2/":

                if  lines[i+1].startswith("RADIO"):
                    fal = line.split()
                    first_alarm_line = [fal[0][:2],fal[-2],fal[-1]]
                    year = first_alarm_line[1][:2]
                    month = first_alarm_line[1][2:4]
                    day = first_alarm_line[1][4:]
                    hours = first_alarm_line[2][:2]
                    minutes = first_alarm_line[2][2:]
                    date = datetime.strptime( day + " " + month + " " + year + " " + \
                                              hours+":"+minutes,"%d %m %y %H:%M")


                    print first_alarm_line
                    print date, "\n"
                    print lines[i+1]
                    print lines[i+4]
                    print lines[i+5]
                    desc_line = lines[i+4]
                    desc_values_line = lines[i+5]
                    desc = desc_line.split(None,2)
                    print desc

                    desc_values = desc_values_line.split(None,2)
                    rsite = ""

                    #for x in desc_values[1]:
                     #   if not (x.isalpha() or x == "0"):
                      #      rsite += x
                    rsite = desc_values[1].lstrip('RBS0')
                    print "\t"*2 + "rsite:" + rsite 

                    if desc[-1] == "ALARM SLOGAN\n":
                        alarm_slogan = desc_values[-1]
                        print alarm_slogan



                    x = i
                    print x # to check the line
                    print len(lines) #check length of lines
                    while not lines[x].startswith("EXTERNAL"):
                        x+=1
                    if lines[x].startswith("EXTERNAL"):
                        while not lines[x] == "\n":
                            print lines[x]
                            x+=1


                    print "\n"*5


                elif lines[i+1].startswith("CELL LOGICAL"):
                    fal = line.split()
                    first_alarm_line = [fal[0][:2],fal[-2],fal[-1]]
                    #print i
                    print first_alarm_line

                    type = lines[i+1]
                    print type
                    cell_line = lines[i+3]
                    cell = cell_line.split()[0]
                    print cell
                    print "\n"*5



          ##########Database query###########

            #db = MySQLdb.connect(host,user,password,database)





firstl()

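When I run the program the results are correct, but it only prints up to line 50672, whereas the file has 51027 lines. After the last printed result I get the following error: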

['A2', '130919', '0309']
2013-09-19 03:09:00 

RADIO X-CEIVER ADMINISTRATION

MO                                 RSITE           ALARM SLOGAN

RXOCF-18                           RBS03668        OML FAULT

['MO', 'RSITE', 'ALARM SLOGAN\n']
    rsite:3668
OML FAULT

50672
51027

Traceback (most recent call last):
  File "C:\Python27\parser_v3.py", line 106, in <module>
    firstl()
  File "C:\Python27\parser_v3.py", line 72, in firstl
    while not lines[x].startswith("EXTERNAL"):
IndexError: list index out of range

If I comment out that while loop, I get this instead:

Traceback (most recent call last):
  File "C:\Python27\parser_v3.py", line 106, in <module>
    firstl()
  File "C:\Python27\parser_v3.py", line 60, in firstl
    rsite = desc_values[1].lstrip('RBS0')
IndexError: list index out of range

The contents of the txt file look like this:

 A1/EXT "FS G11B/25/13/3" 382 150308   1431      
RADIO X-CEIVER ADMINISTRATION
BTS EXTERNAL FAULT

MO                RSITE            CLASS
RXOCF-16          RBS02190         1

EXTERNAL ALARM
ALARM SYSTEM ON/OFF    G2190 DRAMA CNR                        



A1/EXT "FS G11B/25/13/3" 755 150312   1434      
RADIO X-CEIVER ADMINISTRATION
BTS EXTERNAL FAULT

MO                RSITE            CLASS
RXOCF-113         RBS00674         1

EXTERNAL ALARM
IS.BOAR FAIL    G0674 FALAKRO

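I don't understand this, because I split with a maximum of 2 splits and get 3 elements, and as you can see I select the 2nd one. If I comment out the while loop, I get a different error when I select an element from the list, even though the results printed up to that point are correct. Please help me. Sorry for the long post, and thank you.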
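One detail that may explain the second error: the second argument of split is an upper bound on the number of splits, so the result has at most 3 elements, not exactly 3. The sketch below uses two lines from the sample above plus a blank line and a one-word line (the latter is a made-up example) to show what comes back. It avoids print so that it runs under both Python 2.7 and Python 3.

```python
# split(None, 2) performs at most 2 splits, so it returns at most 3 fields.
header = "MO                RSITE            CLASS\n"
values = "RXOCF-16          RBS02190         1\n"
blank = "\n"
short = "END\n"  # hypothetical one-word line, not taken from the real file

header_fields = header.split(None, 2)  # 3 fields, the last keeps its "\n"
value_fields = values.split(None, 2)   # 3 fields
blank_fields = blank.split(None, 2)    # empty list: index 1 does not exist
short_fields = short.split(None, 2)    # 1 field: index 1 does not exist

# Guard the index instead of assuming the line has the expected layout.
rsite = value_fields[1] if len(value_fields) > 1 else None
missing = blank_fields[1] if len(blank_fields) > 1 else None
```

If lines[i+5] happens to be blank or shorter than expected for some alarm, desc_values[1] raises exactly the IndexError shown in the second traceback.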

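2 answers:

Answer 0 (score: 0)

I haven't dug deeply into your code, but have you tried checking that x is still smaller than the number of elements in lines before you access that index? Also, for readability, I suggest using lines[x] != instead of not lines[x] ==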

while x < len(lines) and lines[x] != "\n":
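The same bounds check applies to the first loop, the one that searches for "EXTERNAL". A minimal sketch of that idea follows; find_line is a hypothetical helper name, not something from the original script, and it returns None when nothing is found so the caller can tell "not found" from a real index.

```python
def find_line(lines, start, prefix):
    """Return the index of the first line at or after start that begins
    with prefix, or None if the end of the list is reached first."""
    x = start
    # Test the bound first: "and" short-circuits, so lines[x] is only
    # evaluated while x is a valid index.
    while x < len(lines) and not lines[x].startswith(prefix):
        x += 1
    return x if x < len(lines) else None
```

Because the length test comes first in the condition, the loop ends cleanly at the end of the file instead of raising IndexError.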

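Answer 1 (score: 0)

I solved it. I don't know whether this is the right way to do it, but it works. I think the problem was that x went past the length of lines, the list that holds the file, and that after splitting you have to check that the resulting list is long enough for the index you access. So: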

                    if len(desc_values) > 2 and len(desc) > 2:
                        rsite = desc_values[1].lstrip('RBS0')
                        print "\t"*2 + "rsite:" + rsite

                        if desc[-1] == "ALARM SLOGAN\n":
                            alarm_slogan = desc_values[-1]
                            print alarm_slogan

                    x = i
                    print x # to check the line
                    print len(lines) # check length of lines
                    # bound every scan by len(lines) so x can never run past the end of the file
                    while x < len(lines) and not lines[x].startswith("EXTERNAL"):
                        x+=1
                    # print the EXTERNAL section up to the blank line that ends it
                    while x < len(lines) and lines[x] != "\n":
                        print lines[x]
                        x+=1

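Thank you, you really helped me. I was also looking for a way to stop x from iterating any further, to save some computation time. I tried break, but it throws you out of the loop entirely. Anyway, thanks.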
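One possible way to stop the scan early without break is to move it into a small function and return as soon as the answer is known. The sketch below is only an illustration under two assumptions: every alarm starts with a line beginning with "A1/" or "A2/" (as in the question's code), and an EXTERNAL section ends at the first blank line. external_section is a hypothetical helper name.

```python
def external_section(lines, start):
    """Collect the EXTERNAL... lines of the alarm whose header is lines[start].

    Returns an empty list when the alarm has no such section.
    """
    section = []
    x = start + 1
    # Search for the section, but give up at the next alarm header or at EOF.
    while x < len(lines) and not lines[x].startswith("EXTERNAL"):
        if lines[x][:3] in ("A1/", "A2/"):
            return section  # next alarm reached: stop scanning here
        x += 1
    # Copy lines until the blank line that ends the section, or until EOF.
    while x < len(lines) and lines[x].strip():
        section.append(lines[x].rstrip("\n"))
        x += 1
    return section
```

Inside the main loop this would be called as external_section(lines, i). Returning from the helper ends only the scan for the current alarm, so the outer for loop over the file keeps running, and the scan no longer wanders into later alarms when the current one has no EXTERNAL section.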