Question

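I am working on leave-one-out model validation. When I run it in a loop, holding out one item of the list for testing each time, it stops at i = 19. However, when I run it manually, one value at a time, with i = 19, it works fine. The features list has length 36.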

for i in range(len(features)):
        # i = 18
        w_count = word_count[i]
        x_test_c = features[i][['count']].copy()
        x_test = features[i]
        x_test.drop('count', axis=1, inplace=True)
        x_train_list = features
        x_train_list.pop(i)
        y_test = summaries[i]
        y_train_list = summaries
        y_train_list.pop(i)

        x_train = merge_data(x_train_list)
        x_train.drop('count', axis=1, inplace=True)
        y_train = merge_data(y_train_list)
        print(x_train.shape,"\t",y_train.shape)
        print(x_test.shape,"\t",y_test.shape)

        model = sm.OLS(y_train, x_train, missing='drop').fit()

        predictions = model.predict(x_test)
        predictions = predictions.sort_values(ascending=False)

        print("\n\nLeave one out cross validation \nTest report:",i+1)
        match(predictions, w_count, x_test_c, y_test)

The sample output looks like this.

(sysenv) D:\pythonprojects\rec_proj>python main.py 
Leave one out cross validation
Test report: 1
total word count of report: 509
summary word count: ~ 127.25
['2.4', '1.5', '3.2']
Precision= 1.0
Recall= 0.21428571428571427
F1= 0.35294117647058826
....
Leave one out cross validation
Test report: 18
total word count of report: 380
summary word count: ~ 95.0
['5.3', '12.2', '1.14', '5.2']
Precision= 0.75
Recall= 0.12
F1= 0.20689655172413793

It stops after this iteration. The error is:

Traceback (most recent call last):
  File "main.py", line 49, in <module>
    lou(df_len, df_summary, word_count)
  File "D:\pythonprojects\rec_proj\model_eval.py", line 33, in lou
    x_test_c = features[i][['count']].copy()
IndexError: list index out of range

But if I set i = 18 manually:

Leave one out cross validation
Test report: 19
total word count of report: 741
summary word count: ~ 185.25
['3.10', '10.1', '2.2', '4.1', '5.3', '2.4']
Precision= 0.8333333333333334
Recall= 0.22727272727272727
F1= 0.35714285714285715

So I found that the loop fails at 18, 27, 30, 33 and 35. I can't debug this error, because it works fine when I plug those values in manually.

Answer 1

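In Python, range(n) yields the integers from 0 to n - 1, so range(len(array)) on its own never runs past the end of a list. To visualize this, imagine we have a simple program:

array = [0, 1, 2, 3, 4]
for m in range(len(array)): # len(array) evaluates to 5
    print(m)
for n in array:
    print(n)

The output (with newlines replaced by spaces) will be:

0 1 2 3 4 0 1 2 3 4

As you can see, both loops stop at the last valid index, 4. The catch is that range(len(features)) is evaluated only once, when the loop starts, while your loop body keeps shortening the list. The assignment x_train_list = features does not copy the list; it only gives the same list a second name, so x_train_list.pop(i) removes an item from features on every iteration (the same happens to summaries through y_train_list). With 36 items, features is down to 18 items by the time i reaches 18, so features[i] in the line x_test_c = features[i][['count']].copy() refers to an element that no longer exists, and Python raises IndexError. Setting i = 18 by hand works because the list still has all 36 items at that point.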
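
One thing worth checking in the question's loop is that x_train_list = features binds a second name to the same list, so x_train_list.pop(i) also shrinks features. The sketch below is only an illustration with plain lists standing in for the DataFrames; the helper names first_failing_index and leave_one_out are made up for this example and are not part of the original code. It reproduces the failing index for a list of 36 items, then shows one way to build the training list without touching the original.

```python
def first_failing_index(n_items):
    """Mimic the question's loop: alias the list, pop(i), report where indexing breaks."""
    features = list(range(n_items))   # stand-in for the list of DataFrames
    for i in range(len(features)):    # the range is fixed at n_items up front
        try:
            _ = features[i]           # same access as features[i] in the question
        except IndexError:
            return i
        x_train_list = features       # alias, not a copy
        x_train_list.pop(i)           # also removes the item from `features`
    return None


def leave_one_out(items):
    """Yield (index, held-out item, remaining items) without mutating `items`."""
    for i, test_item in enumerate(items):
        train_items = items[:i] + items[i + 1:]   # new list on every iteration
        yield i, test_item, train_items


print(first_failing_index(36))

reports = ["r0", "r1", "r2"]          # hypothetical placeholder data
for i, test_item, train_items in leave_one_out(reports):
    print(i, test_item, train_items)
```

In the question's code the same idea would apply to both features and summaries. Note also that x_test.drop('count', axis=1, inplace=True) modifies the DataFrame stored in the list; a non-mutating call such as features[i].drop(columns='count') returns a new DataFrame instead, assuming the list elements are pandas DataFrames as the code suggests.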

List index out of range while iterating in a loop - Python