Question

I use open to import my txt file as str:

with open('./doc', 'r') as f:
    dat = f.readlines()

Then I want to clean the data with a for loop:

docs = []
for i in dat:
    if i.strip()[0] != '<':
        docs.append(i)

It returns this error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-131-92a67082e677> in <module>()
      1 docs = []
      2 for i in dat:
----> 3     if i.strip()[0] != '<':
      4         docs.append(i)

IndexError: string index out of range

But if I change the code like this, selecting only the first 3000 lines, it works:

docs = []
for i in dat[:3000]:
    if i.strip()[0] != '<':
        docs.append(i)

My txt file contains 93408 lines, so why can't I process all of them? Thanks!

Answer 1

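One or more lines may be empty or contain only whitespace. For such a line i.strip() returns '', and indexing it with [0] raises the IndexError. You need to check for that before taking the first character: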

if i.strip() != "" and i.strip()[0] != '<':
    docs.append(i)
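
As a minimal sketch of the same check in a self-contained form: the helper name clean_lines and the sample data below are made up for illustration and are not from the question. It uses str.startswith, which is safe on an empty string, and a truthiness test to skip blank lines, so it should behave like the condition above.

```python
def clean_lines(lines):
    """Keep lines that are non-blank and whose first non-space character is not '<'."""
    docs = []
    for line in lines:
        stripped = line.strip()
        # An empty string is falsy, so blank lines are skipped before any
        # check on the first character; startswith never raises IndexError.
        if stripped and not stripped.startswith('<'):
            docs.append(line)
    return docs


# Hypothetical sample standing in for the contents of './doc'.
sample = ["<doc id=1>\n", "hello\n", "\n", "   \n", "</doc>\n", "world"]
print(clean_lines(sample))  # ['hello\n', 'world']
```

If slicing to dat[:3000] works, that suggests the first blank line sits somewhere after line 3000, although only inspecting the file can confirm it.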
