I have the following DataFrame:
import pandas as pd

df_Valve = pd.DataFrame({'TimeStamp': ['2018-01-01 00:00:00', '2018-01-01 00:00:05',
                                       '2018-01-01 00:00:07', '2018-01-02 00:00:07',
                                       '2018-01-02 00:00:08'],
                         'Sensor_Temp': [53, 66, 69, 69, 69],
                         'Sensor_StrainGauge': [0, 0, 0, 1, 1]})
df_Valve
TimeStamp            Sensor_Temp  Sensor_StrainGauge
2018-01-01 00:00:00  53           0
2018-01-01 00:00:05  66           0
2018-01-01 00:00:07  69           0
2018-01-02 00:00:07  69           1
2018-01-02 00:00:08  69           1
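I need to add a new column to the DataFrame. It should hold the difference between the "TimeStamp" at position 0 and the "TimeStamp" at position 1 (row 1), then the difference between the "TimeStamp" at position 1 and the "TimeStamp" at position 2 (row 2), and so on.
The desired output is: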
TimeStamp            Sensor_Temp  Sensor_StrainGauge  New_Columns
2018-01-01 00:00:00  53           0                   0 days 00:00:05
2018-01-01 00:00:05  66           0                   0 days 00:00:02
2018-01-01 00:00:07  69           0                   1 days 00:00:00
2018-01-02 00:00:07  69           1                   0 days 00:00:01
2018-01-02 00:00:08  69           1                   0 days 00:00:00  # last index
I implemented the following code (but it is incorrect):
for i in range(0, len(df_Valve)):
    for j in range(1, len(df_Valve)):
        # difference between timestamp position 0 and 1, 1 and 2, 2 and 3 ...
        df_Valve['New_Columns'] = abs(pd.to_datetime(df_Valve['TimeStamp'].iloc[i]) -
                                      (pd.to_datetime(df_Valve['TimeStamp'].iloc[j])))
The output of my algorithm is wrong, as shown below:
TimeStamp            Sensor_Temp  Sensor_StrainGauge  New_Columns
2018-01-01 00:00:00  53           0                   0 days
2018-01-01 00:00:05  66           0                   0 days
2018-01-01 00:00:07  69           0                   0 days
2018-01-02 00:00:07  69           1                   0 days
2018-01-02 00:00:08  69           1                   0 days
Answer 0 (score: 0)
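Convert the column to datetime with pd.to_datetime, then combine Series.diff with Series.shift to get the required differences: diff() computes each row minus the previous one, and shift(-1) moves the result up one row so that each row holds the gap to the next timestamp. Finally, fillna fills in the last value, which has no following row: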
df_Valve['TimeStamp'] = pd.to_datetime(df_Valve['TimeStamp'])
df_Valve['New_Columns'] = df_Valve['TimeStamp'].diff().shift(-1).fillna(pd.Timedelta(0))
print(df_Valve)
            TimeStamp  Sensor_Temp  Sensor_StrainGauge     New_Columns
0 2018-01-01 00:00:00           53                   0 0 days 00:00:05
1 2018-01-01 00:00:05           66                   0 0 days 00:00:02
2 2018-01-01 00:00:07           69                   0 1 days 00:00:00
3 2018-01-02 00:00:07           69                   1 0 days 00:00:01
4 2018-01-02 00:00:08           69                   1 0 days 00:00:00
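
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The helper name `add_time_to_next` and its parameters are hypothetical, chosen only for illustration. It also shows what should be an equivalent formulation, subtracting each timestamp from the next one with `shift(-1)`.

```python
import pandas as pd


def add_time_to_next(df, ts_col="TimeStamp", out_col="New_Columns"):
    """Return a copy of df with the gap to the next row's timestamp in out_col.

    Hypothetical helper for illustration only. The last row has no
    successor, so its gap is filled with a zero Timedelta.
    """
    out = df.copy()
    out[ts_col] = pd.to_datetime(out[ts_col])
    # diff() yields row[i] - row[i-1]; shift(-1) moves every value up one
    # row, so row i ends up holding row[i+1] - row[i].
    out[out_col] = out[ts_col].diff().shift(-1).fillna(pd.Timedelta(0))
    return out


df_Valve = pd.DataFrame({
    "TimeStamp": ["2018-01-01 00:00:00", "2018-01-01 00:00:05",
                  "2018-01-01 00:00:07", "2018-01-02 00:00:07",
                  "2018-01-02 00:00:08"],
    "Sensor_Temp": [53, 66, 69, 69, 69],
    "Sensor_StrainGauge": [0, 0, 0, 1, 1],
})

result = add_time_to_next(df_Valve)

# Alternative formulation: next timestamp minus the current one.
alt = (result["TimeStamp"].shift(-1) - result["TimeStamp"]).fillna(pd.Timedelta(0))

print(result)
```

As for why the nested loop in the question produces `0 days` everywhere: each pass assigns a single scalar to the whole column, overwriting the previous pass, so only the final iteration (i = 4, j = 4) survives, and that one subtracts the last timestamp from itself.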