I have a df:
Voltage
01-02-2017 00:00 13.1
01-02-2017 00:01 13.2
01-02-2017 00:02 13.3
01-02-2017 00:03 14.1
01-02-2017 00:04 14.3
01-02-2017 00:04 13.5
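I want the time (hh:mm) of the first instance where the value in the Voltage column is >= 14.0. The "Time of Full Charge" column should contain only that one time value.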
Voltage Time of Full Charge
01-02-2017 00:00 13.1
01-02-2017 00:01 13.2
01-02-2017 00:02 13.3
01-02-2017 00:03 14.1 00:03
01-02-2017 00:04 14.3
01-02-2017 00:04 13.5
I am trying something along these lines, but can't figure it out:
df.index = pd.to_datetime(df.index)
df.['Time of Full Charge'] = np.where(df.['Voltage'] >= 14.0), (df.index.hour:df.index.minute))
Answer 0 (score: 6)
Use idxmax to get the first index value that satisfies the condition. The only requirement is that the index is unique:
mask = df['Voltage'] >= 14.0
idx = mask.idxmax()
df.loc[idx, 'Time of Full Charge'] = idx.strftime('%H:%M')
print (df)
Voltage Time of Full Charge
2017-01-02 00:00:00 13.1 NaN
2017-01-02 00:01:00 13.2 NaN
2017-01-02 00:02:00 13.3 NaN
2017-01-02 00:03:00 14.1 00:03
2017-01-02 00:04:00 14.3 NaN
2017-01-02 00:04:00 13.5 NaN
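One caveat with idxmax: on a boolean mask that is False everywhere, it still returns the first index label, so a guard helps when the voltage may never reach the threshold. A minimal sketch of that check, where the helper name first_time_at_or_above and the month-first date format are assumptions made for illustration:

```python
import pandas as pd

# Sample data from the question; the month-first format is an assumption
# chosen to match the printed output above.
index = pd.to_datetime(
    ["01-02-2017 00:00", "01-02-2017 00:01", "01-02-2017 00:02",
     "01-02-2017 00:03", "01-02-2017 00:04", "01-02-2017 00:04"],
    format="%m-%d-%Y %H:%M",
)
df = pd.DataFrame({"Voltage": [13.1, 13.2, 13.3, 14.1, 14.3, 13.5]}, index=index)


def first_time_at_or_above(voltage, threshold):
    """Return 'HH:MM' of the first row with voltage >= threshold, else None.

    Hypothetical helper, not part of pandas.
    """
    mask = voltage >= threshold
    if not mask.any():
        # Without this guard, idxmax() would return the first label.
        return None
    return mask.idxmax().strftime("%H:%M")


found = first_time_at_or_above(df["Voltage"], 14.0)
missing = first_time_at_or_above(df["Voltage"], 20.0)
print(found, missing)
```

Returning None for the no-match case is just one possible convention; an empty string or NaN would work as well, depending on how the column is used afterwards.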
Or:
idx = (df['Voltage'] >= 14.0).idxmax()
df['Time of Full Charge'] = np.where(df.index == idx, idx.strftime('%H:%M'), '')
print (df)
Voltage Time of Full Charge
2017-01-02 00:00:00 13.1
2017-01-02 00:01:00 13.2
2017-01-02 00:02:00 13.3
2017-01-02 00:03:00 14.1 00:03
2017-01-02 00:04:00 14.3
2017-01-02 00:04:00 13.5
For a non-unique index, a MultiIndex can be used:
df.index = [np.arange(len(df.index)), df.index]
idx = (df['Voltage'] >= 14.0).idxmax()
df['Time of Full Charge'] = np.where(df.index.get_level_values(0) == idx[0],
idx[1].strftime('%H:%M'),
'')
df.index = df.index.droplevel(0)
print (df)
Voltage Time of Full Charge
2017-01-02 00:00:00 13.1
2017-01-02 00:01:00 13.2
2017-01-02 00:02:00 13.3
2017-01-02 00:03:00 14.1 00:03
2017-01-02 00:04:00 14.3
2017-01-02 00:04:00 13.5
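As an alternative to the MultiIndex trick (a different technique, not the one shown above), the first match can be located by position instead of by label, which sidesteps duplicate timestamps entirely. A rough sketch, using a small hypothetical frame in which the first matching timestamp is itself duplicated:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the timestamp 00:01 appears twice and both rows match.
index = pd.to_datetime(
    ["01-02-2017 00:00", "01-02-2017 00:01",
     "01-02-2017 00:01", "01-02-2017 00:02"],
    format="%m-%d-%Y %H:%M",
)
df = pd.DataFrame({"Voltage": [13.1, 14.2, 14.5, 13.0]}, index=index)

mask = (df["Voltage"] >= 14.0).to_numpy()

# Start with an empty string everywhere, then fill in the single match.
times = np.full(len(df), "", dtype=object)
if mask.any():
    pos = int(mask.argmax())  # position of the first True value
    times[pos] = df.index[pos].strftime("%H:%M")

df["Time of Full Charge"] = times
print(df)
```

Because the assignment is positional, only the first of the two 00:01 rows is marked, and the index never has to be modified or restored.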
Answer 1 (score: 2)
If the Voltage column is sorted, you can use numpy.searchsorted():
In [260]: df.index[np.searchsorted(df.Voltage, 14)]
Out[260]: DatetimeIndex(['2017-01-02 00:03:00'], dtype='datetime64[ns]', freq=None)
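Note that searchsorted assumes ascending data, and the sample in the question ends with 13.5 after 14.3, so it is not fully sorted. A self-contained sketch under the assumption that the readings are ascending (the last row is dropped here for that reason), with a bounds check for the case where no value reaches the threshold:

```python
import numpy as np
import pandas as pd

# Ascending subset of the question's data (the trailing 13.5 row is omitted).
index = pd.to_datetime(
    ["01-02-2017 00:00", "01-02-2017 00:01", "01-02-2017 00:02",
     "01-02-2017 00:03", "01-02-2017 00:04"],
    format="%m-%d-%Y %H:%M",
)
voltage = pd.Series([13.1, 13.2, 13.3, 14.1, 14.3], index=index)

# side="left" gives the first position whose value is >= 14.0.
pos = int(np.searchsorted(voltage.to_numpy(), 14.0, side="left"))

# pos == len(voltage) means every value is below the threshold.
first_time = voltage.index[pos].strftime("%H:%M") if pos < len(voltage) else None
print(pos, first_time)
```

Recent NumPy versions return a scalar position for a scalar query, so the result may be displayed differently from the DatetimeIndex shown in the output above.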