Question

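I have a data frame, myDF, in which I want to set one column to zero based on a combination of conditions on other columns, using a second data frame, criteriaDF, for indexing.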

myDF.head():

       DateTime  GrossPowerMW USDateTime_string  DateTime_timestamp  \
0  01/01/1998 00:00        17.804  01/01/1998 00:00 1998-01-01 00:00:00   
1  01/01/1998 01:00        18.751  01/01/1998 01:00 1998-01-01 01:00:00   
2  01/01/1998 02:00        20.501  01/01/1998 02:00 1998-01-01 02:00:00   
3  01/01/1998 03:00        22.222  01/01/1998 03:00 1998-01-01 03:00:00   
4  01/01/1998 04:00        24.437  01/01/1998 04:00 1998-01-01 04:00:00   

   Month  Day  Hour  GrossPowerMW_Shutdown  
0      1    3     0                 17.804  
1      1    3     1                 18.751  
2      1    3     2                 20.501  
3      1    3     3                 22.222  
4      1    3     4                 24.437

criteriaDF:

       STARTTIME  ENDTIME
Month                    
1            9.0     12.0
2            9.0     14.0
3            9.0     14.0
4            9.0     14.0
5            9.0     13.0
6            9.0     14.0
7            9.0     13.0
8            9.0     12.0
9            9.0     14.0
10           9.0     13.0
11           9.0     13.0
12           9.0     11.0

myDF is then run through the following for loop:

month = 1
for month in range (1, 13):
    shutdown_hours = range(int(criteriaDF.iloc[month]['STARTTIME']), int(criteriaDF.iloc[month]['ENDTIME']))
    myDF.loc[(myDF["Month"].isin([month])) & (myDF["Hour"].isin(shutdown_hours)) & (myDF["Day"].isin(shutdown_days)), "GrossPowerMW_Shutdown"] *= 0
    month = month + 1

This gives the following error:

Traceback (most recent call last):

  File "", line 1, in
    runfile('myscript.py', wdir='C:myscript')

  File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 880, in runfile
    execfile(filename, namespace)

  File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 87, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)

  File "myscript.py", line 111, in
    gross_yield, curtailed_yield, shutdown_loss, df_testing = calculate_loss(input_file, input_shutdownbymonth, shutdown_days) #Returning df for testing/interrogation only. Delete once finished.

  File "myscript.py", line 79, in calculate_loss
    shutdown_hours = range(int(criteriaDF.iloc[month]['STARTTIME']), int(criteriaDF.iloc[month]['ENDTIME']))

  File "C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1328, in __getitem__
    return self._getitem_axis(key, axis=0)

  File "C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1749, in _getitem_axis
    self._is_valid_integer(key, axis)

  File "C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\indexing.py", line 1638, in _is_valid_integer
    raise IndexError("single positional indexer is out-of-bounds")

IndexError: single positional indexer is out-of-bounds

However, the script runs without error if I set:

month = 0
for month in range (0, 12):

However, this does not match the values in my data frame's ['Month'] column, which runs 1 -> 12 rather than 0 -> 11.

To confirm, my understanding is that

range (1, 13)

returns

[1,2,3,4,5,6,7,8,9,10,11,12].

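I have also tried running the code inside the for loop manually, line by line, with month = 12. So I am not sure why using month in range (1, 13) does not work, given that 12 is the highest integer in the list range (1, 13).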

What is wrong with my code or approach?

Answer 1

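You are using iloc, which is "purely integer-location based indexing for selection by position". It therefore counts the rows by position, from 0 to 11, and position 12 does not exist. You should use loc instead, which looks up the values of the index (so 1 to 12).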
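To illustrate the difference, here is a minimal sketch of the loop rewritten with loc. It rebuilds criteriaDF from the values in the question; the small myDF stand-in, its power values, and shutdown_days = [3] are assumptions made only for this example, since the question does not show them.

```python
import pandas as pd

# criteriaDF as shown in the question: the index holds the month numbers 1..12.
criteriaDF = pd.DataFrame(
    {
        "STARTTIME": [9.0] * 12,
        "ENDTIME": [12.0, 14.0, 14.0, 14.0, 13.0, 14.0,
                    13.0, 12.0, 14.0, 13.0, 13.0, 11.0],
    },
    index=pd.Index(range(1, 13), name="Month"),
)

# Hypothetical stand-in for myDF: 24 hourly rows for month 1 and for month 12.
hours = list(range(24))
myDF = pd.DataFrame(
    {
        "Month": [1] * 24 + [12] * 24,
        "Day": [3] * 48,
        "Hour": hours + hours,
        "GrossPowerMW": [20.0] * 48,
    }
)
myDF["GrossPowerMW_Shutdown"] = myDF["GrossPowerMW"]
shutdown_days = [3]  # assumption: the question does not show this list

# iloc is positional: a 12-row frame has positions 0..11, so 12 is out of bounds.
try:
    criteriaDF.iloc[12]
    iloc_failed = False
except IndexError:
    iloc_failed = True

# loc is label-based: it looks up the index value, so months 1..12 all work.
for month in range(1, 13):
    start = int(criteriaDF.loc[month, "STARTTIME"])
    end = int(criteriaDF.loc[month, "ENDTIME"])
    shutdown_hours = list(range(start, end))
    mask = (
        (myDF["Month"] == month)
        & myDF["Hour"].isin(shutdown_hours)
        & myDF["Day"].isin(shutdown_days)
    )
    myDF.loc[mask, "GrossPowerMW_Shutdown"] = 0
```

Note that with iloc the original loop would also have read the wrong row for months 1 to 11 before failing at 12, because iloc[1] is the row for month 2. The month = 1 line before the loop and the month = month + 1 line inside it are not needed either, since the for statement assigns month on every iteration.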
