How to generate a column of differences between consecutive timestamps

Time: 2019-10-07 22:43:59

Tags: python dataframe

I have the following DataFrame:

    import pandas as pd

    df_Valve = pd.DataFrame({'TimeStamp':['2018-01-01 00:00:00', '2018-01-01 00:00:05',
                                          '2018-01-01 00:00:07', '2018-01-02 00:00:07',
                                          '2018-01-02 00:00:08'],
                             'Sensor_Temp': [53, 66, 69, 69, 69],
                             'Sensor_StrainGauge': [0, 0, 0, 1, 1]})

    df_Valve


           TimeStamp           Sensor_Temp    Sensor_StrainGauge 
    2018-01-01 00:00:00           53                0
    2018-01-01 00:00:05           66                0
    2018-01-01 00:00:07           69                0
    2018-01-02 00:00:07           69                1
    2018-01-02 00:00:08           69                1

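I need to add a new column to the DataFrame. This new column should contain the difference between the 'TimeStamp' at position 0 and the 'TimeStamp' at position 1 (row 1), then the difference between the 'TimeStamp' at position 1 and the 'TimeStamp' at position 2 (row 2), and so on.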

The desired output is:

          TimeStamp           Sensor_Temp    Sensor_StrainGauge      New_Columns
    2018-01-01 00:00:00       53                0                   0 days 00:00:05           
    2018-01-01 00:00:05       66                0                   0 days 00:00:02
    2018-01-01 00:00:07       69                0                   1 days 00:00:00
    2018-01-02 00:00:07       69                1                   0 days 00:00:01
    2018-01-02 00:00:08       69                1                   0 days 00:00:00   #last index

I implemented the following code, but it is incorrect:

    for i in range(0, len(df_Valve)):
        for j in range(1, len(df_Valve)):

            #difference between timestamp position 0 and 1, 1 and 2, 2 and 3 ...
            df_Valve['New_Columns'] = abs(pd.to_datetime(df_Valve['TimeStamp'].iloc[i]) - 
                                         (pd.to_datetime(df_Valve['TimeStamp'].iloc[j]))) 

My algorithm's output is incorrect, as shown below:

         TimeStamp           Sensor_Temp    Sensor_StrainGauge      New_Columns
    2018-01-01 00:00:00       53                0              0 days            
    2018-01-01 00:00:05       66                0              0 days 
    2018-01-01 00:00:07       69                0              0 days 
    2018-01-02 00:00:07       69                1              0 days 
    2018-01-02 00:00:08       69                1              0 days 

1 Answer:

Answer 0 (score: 0)

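Use pd.to_datetime to convert the column to datetime, then combine Series.diff with Series.shift to get the required differences. Finally, fillna fills in the last value, which has no following row: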

    df_Valve['TimeStamp'] = pd.to_datetime(df_Valve['TimeStamp'])
    df_Valve['New_Columns'] = df_Valve['TimeStamp'].diff().shift(-1).fillna(pd.Timedelta(0))
    print(df_Valve)

                TimeStamp  Sensor_Temp  Sensor_StrainGauge     New_Columns
    0 2018-01-01 00:00:00           53                   0 0 days 00:00:05
    1 2018-01-01 00:00:05           66                   0 0 days 00:00:02
    2 2018-01-01 00:00:07           69                   0 1 days 00:00:00
    3 2018-01-02 00:00:07           69                   1 0 days 00:00:01
    4 2018-01-02 00:00:08           69                   1 0 days 00:00:00
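
As for why the nested loop in the question gives "0 days" everywhere: each iteration assigns a single scalar to the whole column, so only the final iteration survives, and that one compares the last row with itself. For comparison, here is a minimal sketch of an equivalent way to write the vectorized version, subtracting each timestamp from the next one with shift(-1). The Gap_Seconds column is a hypothetical extra added only for illustration and is not part of the question.

```python
import pandas as pd

df_Valve = pd.DataFrame({
    'TimeStamp': ['2018-01-01 00:00:00', '2018-01-01 00:00:05',
                  '2018-01-01 00:00:07', '2018-01-02 00:00:07',
                  '2018-01-02 00:00:08'],
    'Sensor_Temp': [53, 66, 69, 69, 69],
    'Sensor_StrainGauge': [0, 0, 0, 1, 1],
})
df_Valve['TimeStamp'] = pd.to_datetime(df_Valve['TimeStamp'])

# Next row's timestamp minus the current one.
# The last row has no successor, so its gap is NaT before fillna.
forward_gap = df_Valve['TimeStamp'].shift(-1) - df_Valve['TimeStamp']
df_Valve['New_Columns'] = forward_gap.fillna(pd.Timedelta(0))

# Hypothetical extra column: the same gaps expressed as seconds.
df_Valve['Gap_Seconds'] = df_Valve['New_Columns'].dt.total_seconds()

print(df_Valve)
```

Both forms rely on the rows already being in chronological order; if they might not be, sorting by TimeStamp first would be a reasonable precaution.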