Question

import nltk
import random
from nltk.tokenize import sent_tokenize, word_tokenize


file = open("sms.txt", "r")
for line in file:
    #print line
    a=word_tokenize(line)
    if a[5] == 'SBI' and a[6]== 'Debit':
        print(a[13])

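Can anyone help me fix this error? The program runs for a few lines, then stops with an "index out of range" error. I understand what the error means, but I don't know how to get around it. Basically, I want to skip the lines that can't be parsed.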

Answer 1

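Adding a list length check is enough to solve the problem. Because `and` short-circuits, the index lookups only run when the list is long enough.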

if len(a) >= 14 and a[5] == 'SBI' and a[6] == 'Debit':
    print(a[13])
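
As a self-contained sketch of the same check, the helper below wraps the test in a function. The name `extract_field`, the sample token lists, and the use of `str.split()` in place of `word_tokenize` (so the snippet runs without NLTK installed) are assumptions made for illustration; they are not part of the original code.

```python
def extract_field(tokens):
    """Return tokens[13] for an 'SBI Debit' line, or None if the line doesn't match."""
    # The length test comes first, so the index lookups after it cannot fail.
    if len(tokens) >= 14 and tokens[5] == 'SBI' and tokens[6] == 'Debit':
        return tokens[13]
    return None


# Hypothetical sample lines; real SMS text will look different.
good = "t0 t1 t2 t3 t4 SBI Debit t7 t8 t9 t10 t11 t12 500.00 t14".split()
short = "too short".split()

print(extract_field(good))   # the 14th token of the well-formed line
print(extract_field(short))  # None, instead of raising IndexError
```

Returning `None` for short lines lets the caller decide whether to skip them silently or report them.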

Answer 2

You can also keep track of the malformed lines without interrupting the flow or raising an error:

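    from nltk.tokenize import word_tokenize

    # "with" closes the file even if something goes wrong mid-loop
    with open("sms.txt", "r") as file:
        for line_no, line in enumerate(file):
            a = word_tokenize(line)
            try:
                if a[5] == 'SBI' and a[6] == 'Debit':
                    print(a[13])
            except IndexError:
                # Too few tokens: report the (0-based) line number and move on
                print(str(line_no) + " line doesn't have expected data")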
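
If there are many lines to scan, it may be more convenient to collect the skipped line numbers than to print them one by one. The following is a minimal sketch under that assumption; the function name `scan_lines`, the `tokenize` parameter (defaulting to `str.split` so the example runs without NLTK), and the sample lines are all hypothetical.

```python
def scan_lines(lines, tokenize=str.split):
    """Return (values, skipped); skipped holds 1-based numbers of lines that were too short."""
    values, skipped = [], []
    for line_no, line in enumerate(lines, start=1):
        tokens = tokenize(line)
        try:
            if tokens[5] == 'SBI' and tokens[6] == 'Debit':
                values.append(tokens[13])
        except IndexError:
            # Not enough tokens to index into: remember the line and keep going.
            skipped.append(line_no)
    return values, skipped


# Hypothetical input: one well-formed line, one garbled line, one non-matching line.
sample = [
    "t0 t1 t2 t3 t4 SBI Debit t7 t8 t9 t10 t11 t12 500.00 t14",
    "garbled",
    "t0 t1 t2 t3 t4 HDFC Credit t7",
]
values, skipped = scan_lines(sample)
print(values)
print(skipped)
```

With the real data, `word_tokenize` can be passed as the `tokenize` argument and the open file object as `lines`.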

