Question

I'm writing a short script (my first code in Python) to filter a large table.

import sys

gwas_annot = open('gwascatalog.txt').read()
gwas_entry_list = gwas_annot.split('\n')[1:-1]

# paste line if has value
for lines in gwas_entry_list:
    entry_notes = lines.split('\t')
    source_name = entry_notes[7]
    if 'omega-6' in source_name:
        print(entry_notes)

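Basically, I want to take the 'gwascatalog' table, parse it into rows and columns, search column 7 for a string (in this case 'omega-6'), and print the whole row if it contains that string.

Right now it prints all the rows to the console but doesn't let me paste them into another file. It also gives me this error: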

Traceback (most recent call last):
  File "gwas_parse.py", line 9, in <module>
    source_name = entry_notes[7]
IndexError: list index out of range

I'm not sure why it errors out. Is there anything obvious to fix?

Edit: added a snippet from the data.


Answer 1

You can protect yourself by checking the length of the list first.

if len(entry_notes) > 7:
    source_name = entry_notes[7]
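
For context, here is a minimal sketch of how that guard could sit inside the loop from the question. The function name `filter_rows` and its parameters are made up for illustration; the tab separator and index 7 are taken from the question's code.

```python
def filter_rows(lines, needle, column_index=7, sep="\t"):
    """Return the split rows whose column `column_index` contains `needle`.

    `filter_rows` is a hypothetical helper, not part of the original script.
    """
    matches = []
    for line in lines:
        fields = line.split(sep)
        # Skip short rows (blank or truncated lines) instead of raising IndexError.
        if len(fields) > column_index and needle in fields[column_index]:
            matches.append(fields)
    return matches
```

Calling `filter_rows(gwas_entry_list, 'omega-6')` would then return only the matching rows, and rows that are too short are silently ignored rather than crashing the script.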

Answer 2

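"List index out of range" most likely means you hit a line (row) with fewer than 8 columns, so index 7 does not exist for it.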

# index          0      1      2        3       4       5      6       (... no 7)
columnsArray = ['one', 'two', 'three', 'four', 'five', 'six', 'seven']

So here, if you ask for columnsArray[7], you get a "list index out of range" error, because the row the for loop is currently on only goes up to index 6.

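The error tells you it happened on "line 9", which is "source_name = entry_notes[7]". I suggest printing the number of columns for each row of the table. You will probably find a row somewhere with 7 columns instead of 8. I also think you meant the 8th column, which is at position (index) 7, because counting starts from 0 in Python.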
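
One way to do that check, as a rough sketch: tally how many lines have each column count, and list the lines that fall short. The helper names `column_counts` and `short_lines` are invented for this example, and the tab separator is assumed from the question's code.

```python
from collections import Counter


def column_counts(lines, sep="\t"):
    """Map each column count to the number of lines that have it."""
    return Counter(len(line.split(sep)) for line in lines)


def short_lines(lines, min_columns=8, sep="\t"):
    """Return (line_number, line) pairs for lines with fewer than `min_columns` fields."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if len(line.split(sep)) < min_columns
    ]
```

Printing `column_counts(gwas_entry_list)` should show at a glance whether every row has the same width, and `short_lines(gwas_entry_list)` points at the rows that would trigger the IndexError. Note that an empty line still splits into one (empty) field, so it shows up with a count of 1.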

Maybe add another "if" so that you only look at rows whose len() is 8 or more.
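
Since the question also mentions wanting the matches in another file, here is a sketch that combines the length check with writing the matching lines out. The function name `write_matches`, the output path, and the UTF-8 encoding are assumptions made for illustration; the real catalog file may use a different encoding.

```python
def write_matches(in_path, out_path, needle, column_index=7):
    """Copy lines whose column `column_index` contains `needle` to `out_path`.

    Hypothetical helper; returns the number of lines written.
    """
    written = 0
    # Assumption: both files are UTF-8 text with tab-separated columns.
    with open(in_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        next(src, None)  # skip the header line, like the [1:] slice in the question
        for line in src:
            fields = line.rstrip("\n").split("\t")
            # Only rows with enough columns are inspected; shorter ones are skipped.
            if len(fields) > column_index and needle in fields[column_index]:
                dst.write(line if line.endswith("\n") else line + "\n")
                written += 1
    return written
```

For example, `write_matches('gwascatalog.txt', 'omega6_hits.txt', 'omega-6')` would read the table line by line instead of loading it all into memory, which can help with a large file. The output file name here is only a placeholder.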
