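I have a pandas DataFrame with an index. By searching its index, I found a row of interest. How do I find the iloc (integer position) of that row?
Example: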
import numpy as np
import pandas as pd
dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df
                   A         B         C         D
2000-01-01 -0.077564  0.310565  1.112333  1.023472
2000-01-02 -0.377221 -0.303613 -1.593735  1.354357
2000-01-03  1.023574 -0.139773  0.736999  1.417595
2000-01-04 -0.191934  0.319612  0.606402  0.392500
2000-01-05 -0.281087 -0.273864  0.154266  0.374022
2000-01-06 -1.953963  1.429507  1.730493  0.109981
2000-01-07  0.894756 -0.315175 -0.028260 -1.232693
2000-01-08 -0.032872 -0.237807  0.705088  0.978011
window_stop_row = df[df.index < '2000-01-04'].iloc[-1]
window_stop_row.name
Timestamp('2000-01-03 00:00:00', offset='D')
# What is the iloc (integer position) of window_stop_row?
Answer 0 (score: 17)
You want the .name attribute of the row, passed to get_loc:
In [131]:
dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df
Out[131]:
                   A         B         C         D
2000-01-01  0.095234 -1.000863  0.899732 -1.742152
2000-01-02 -0.517544 -1.274137  1.734024 -1.369487
2000-01-03  0.134112  1.964386 -0.120282  0.573676
2000-01-04 -0.737499 -0.581444  0.528500 -0.737697
2000-01-05 -1.777800  0.795093  0.120681  0.524045
2000-01-06 -0.048432 -0.751365 -0.760417 -0.181658
2000-01-07 -0.570800  0.248608 -1.428998 -0.662014
2000-01-08 -0.147326  0.717392  3.138620  1.208639
In [133]:
window_stop_row = df[df.index < '2000-01-04'].iloc[-1]
window_stop_row.name
Out[133]:
Timestamp('2000-01-03 00:00:00', offset='D')
In [134]:
df.index.get_loc(window_stop_row.name)
Out[134]:
2
get_loc returns the ordinal position of the label in your index, which is what you want:
In [135]:
df.iloc[df.index.get_loc(window_stop_row.name)]
Out[135]:
A 0.134112
B 1.964386
C -0.120282
D 0.573676
Name: 2000-01-03 00:00:00, dtype: float64
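The round trip from label to position and back can be sketched as a standalone script. This is a minimal illustration, not the answerer's exact session: the seeded generator `rng` and the variable names `label`, `position`, and `same_row` are assumptions added here so the example is reproducible.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2000-01-01", periods=8)
rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
df = pd.DataFrame(rng.standard_normal((8, 4)), index=dates, columns=list("ABCD"))

# Last row strictly before 2000-01-04; its .name is the index label.
window_stop_row = df[df.index < pd.Timestamp("2000-01-04")].iloc[-1]
label = window_stop_row.name

# Map the label back to its integer position in the full index.
position = df.index.get_loc(label)

# A positional lookup with that integer returns the same row.
same_row = df.iloc[position]
```

Note that get_loc returns a plain integer only when the label is unique in the index; for duplicated labels it returns a slice or a boolean mask instead.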
If you only want to search the index, then as long as it is sorted you can use searchsorted:
In [142]:
df.index.searchsorted('2000-01-04') - 1
Out[142]:
2
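The searchsorted approach might look like the following when written out on its own. The names `cutoff` and `position` are illustrative assumptions, and the cutoff is passed as a `pd.Timestamp` rather than a string purely to keep the sketch explicit about types.

```python
import pandas as pd

index = pd.date_range("2000-01-01", periods=8)  # sorted, as searchsorted requires
cutoff = pd.Timestamp("2000-01-04")

# side="left" gives the first position whose label is >= cutoff,
# so stepping back by one lands on the last label strictly before the cutoff.
position = index.searchsorted(cutoff, side="left") - 1

# The cutoff does not need to be present in the index.
between = index.searchsorted(pd.Timestamp("2000-01-04 12:00"), side="left") - 1
```

If the cutoff is earlier than every label, the result is -1. That case is worth checking explicitly, because iloc[-1] would silently return the last row rather than raise an error.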
Answer 1 (score: 2)
While pandas.Index.get_loc() only works when you have a single key, the following pattern also works for getting the iloc of multiple elements:
np.argwhere(condition).flatten() # array of all iloc where condition is True
In your case, with the selection df.index < '2000-01-04':
np.argwhere(df.index < '2000-01-04').flatten()[-1] # returns 2
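For a one-dimensional mask, np.flatnonzero is a closely related NumPy call that returns the same positions without the extra flatten step. The sketch below swaps it in for np.argwhere as an alternative; the names `mask`, `positions`, and `last_position` are assumptions introduced for illustration.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2000-01-01", periods=8)

# Comparing a DatetimeIndex with a Timestamp yields a NumPy boolean array.
mask = index < pd.Timestamp("2000-01-04")

# Integer positions of every True entry, already one-dimensional.
positions = np.flatnonzero(mask)

# The last matching position, as in the question.
last_position = positions[-1]
```

The resulting array can be passed straight to df.iloc to select all matching rows at once.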
Answer 2 (score: 1)
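You could try looping over every row of the DataFrame, counting positions with enumerate:
for row_number, (label, row) in enumerate(dataframe.iterrows()):
    if row['column_header'] == YourValue:
        print(row_number)
This prints the integer position of each matching row, which you can pass to iloc. (iterrows itself yields the index label, which matches the position only for a default integer index.)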
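On larger frames a vectorized comparison is usually preferable to a Python-level loop. The sketch below is one possible alternative, not part of the original answer; the toy DataFrame, the `target` value, and the name `positions` are all made up for illustration.

```python
import numpy as np
import pandas as pd

# A toy frame with a non-integer index, so labels and positions differ.
df = pd.DataFrame({"column_header": [10, 20, 30, 20]}, index=list("wxyz"))
target = 20

# Compare the whole column at once, then collect the positions of the matches.
positions = np.flatnonzero(df["column_header"].to_numpy() == target)

# The positions can be used directly with iloc.
matching_rows = df.iloc[positions]
```

Here the matching labels are "x" and "z" while the positions are 1 and 3, which shows why the index label alone is not enough for iloc.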
Answer 3 (score: 1)
IIUC, you can call index for your case:
In [53]: df[df.index < '2000-01-04'].index[-1]
Out[53]: Timestamp('2000-01-03 00:00:00', offset='D')
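EDIT
I think @EdChum's answer is what you want. Alternatively, you could filter your DataFrame with the value you obtained, use all to find the rows that contain that value, and then pass the result to index: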
In [67]: df == window_stop_row
Out[67]:
                A      B      C      D
2000-01-01  False  False  False  False
2000-01-02  False  False  False  False
2000-01-03   True   True   True   True
2000-01-04  False  False  False  False
2000-01-05  False  False  False  False
2000-01-06  False  False  False  False
2000-01-07  False  False  False  False
2000-01-08  False  False  False  False
In [68]: (df == window_stop_row).all(axis=1)
Out[68]:
2000-01-01 False
2000-01-02 False
2000-01-03 True
2000-01-04 False
2000-01-05 False
2000-01-06 False
2000-01-07 False
2000-01-08 False
Freq: D, dtype: bool
In [69]: df.index[(df == window_stop_row).all(axis=1)]
Out[69]: DatetimeIndex(['2000-01-03'], dtype='datetime64[ns]', freq='D')
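The steps above end with index labels rather than integer positions. One way to finish the conversion, assuming the index is unique, is Index.get_indexer, sketched below. The integer-valued frame and the names `row_mask`, `matching_labels`, and `positions` are assumptions chosen so the result is deterministic.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2000-01-01", periods=8)
# Distinct integers, so exactly one row can match.
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=dates, columns=list("ABCD"))
window_stop_row = df.iloc[2]

# Compare every row against the target row, then keep rows where all columns match.
row_mask = (df == window_stop_row).all(axis=1)
matching_labels = df.index[row_mask]

# Translate the matching labels into integer positions.
positions = df.index.get_indexer(matching_labels)
```

Keep in mind that this value-based comparison finds every row with identical contents, and that NaN never compares equal to NaN, so a row containing missing values will not match itself.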