How to fill missing timestamps in the time column for a date in pandas

Time: 2018-03-09 06:18:12

Tags: python pandas time-series

I have time series data like this:

print(df)
   ric       datel     timel  val
0  xyz  2017-01-01  09:00:00    2
1  xyz  2017-01-01  09:04:00    5
2  xyz  2017-01-01  09:37:00    6

Now I need to fill in the missing timestamps up to 09:45:00.

Expected output:

    ric     datel       timel        val
0   xyz     2017-01-01  09:00:00     2
1   xyz     2017-01-01  09:01:00     nan
2   xyz     2017-01-01  09:02:00     nan
3   xyz     2017-01-01  09:03:00     nan
4   xyz     2017-01-01  09:04:00     5
...
...
37  xyz     2017-01-01  09:37:00      6
...
...
45  xyz     2017-01-01  09:45:00      nan

What I tried:

df1=df.resample("1 min", on ='datel').first()

The output is:

            ric       datel     timel  val
datel
2017-01-01  xyz  2017-01-01  09:00:00    2

I also tried another method, but it mostly works on a datetime column. I have two separate columns, one for the date and one for the time. Is there a way to achieve this without merging the date and time columns into a single datetime?

2 answers:

Answer 0: (score: 3)

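The main idea is to reindex by the times created with date_range:

import pandas as pd

# keep only the time part of each parsed value
df['timel'] = pd.to_datetime(df['timel']).dt.time
# one time value per minute from the first observation to 09:45:00
start = pd.to_datetime(str(df['timel'].min()))
end = pd.to_datetime('09:45:00')
dates = pd.date_range(start=start, end=end, freq='1Min').time
#print (dates)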

df = df.set_index('timel').reindex(dates).reset_index().reindex(columns=df.columns)
cols = df.columns.difference(['val'])
df[cols] = df[cols].ffill()
print (df.head())
   ric       datel     timel  val
0  xyz  2017-01-01  09:00:00  2.0
1  xyz  2017-01-01  09:01:00  NaN
2  xyz  2017-01-01  09:02:00  NaN
3  xyz  2017-01-01  09:03:00  NaN
4  xyz  2017-01-01  09:04:00  5.0
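
The snippet above assumes df already exists. A minimal self-contained sketch of the same reindex idea follows, with the sample frame rebuilt from the question. The explicit format="%H:%M:%S", the rename_axis call, and the names minutes, filled and id_cols are additions for illustration and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ric": ["xyz", "xyz", "xyz"],
    "datel": ["2017-01-01", "2017-01-01", "2017-01-01"],
    "timel": ["09:00:00", "09:04:00", "09:37:00"],
    "val": [2, 5, 6],
})

# Parse with an explicit format and keep only the time part.
df["timel"] = pd.to_datetime(df["timel"], format="%H:%M:%S").dt.time

# One datetime.time object per minute, from the first observation to 09:45:00.
start = pd.to_datetime(str(df["timel"].min()), format="%H:%M:%S")
end = pd.to_datetime("09:45:00", format="%H:%M:%S")
minutes = pd.date_range(start=start, end=end, freq="1min").time

filled = (
    df.set_index("timel")
      .reindex(minutes)          # rows for missing minutes are all NaN
      .rename_axis("timel")      # make sure the index name survives
      .reset_index()
      .reindex(columns=df.columns)
)

# Forward-fill only the identifying columns; val keeps NaN in the gaps.
id_cols = ["ric", "datel"]
filled[id_cols] = filled[id_cols].ffill()
```

Because the grid runs from 09:00:00 to 09:45:00 inclusive, filled has 46 rows, and only the three original rows carry a value in val.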

A similar solution using resample:

df['timel'] = pd.to_datetime(df['timel'])

#if missing row with 09:45:00 add it
if not (df['timel']  == pd.to_datetime('09:45:00')).any():
    df.loc[len(df.index), 'timel'] = pd.to_datetime('09:45:00')

df=df.set_index('timel').resample("1min").first().reset_index().reindex(columns=df.columns)
cols = df.columns.difference(['val'])
df[cols] = df[cols].ffill()
df['timel'] = df['timel'].dt.time
print (df.head())
   ric       datel     timel  val
0  xyz  2017-01-01  09:00:00  2.0
1  xyz  2017-01-01  09:01:00  NaN
2  xyz  2017-01-01  09:02:00  NaN
3  xyz  2017-01-01  09:03:00  NaN
4  xyz  2017-01-01  09:04:00  5.0
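
For readers who want to run the resample variant end to end, here is a sketch under a few assumptions: the sample frame is rebuilt from the question, the times are parsed with an explicit format so every row shares the same placeholder date, and the 09:45:00 end row is appended with pd.concat rather than with df.loc as in the answer. The names end, filled and id_cols are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ric": ["xyz", "xyz", "xyz"],
    "datel": ["2017-01-01", "2017-01-01", "2017-01-01"],
    "timel": ["09:00:00", "09:04:00", "09:37:00"],
    "val": [2, 5, 6],
})

# With an explicit time-only format, every value gets the same placeholder date.
df["timel"] = pd.to_datetime(df["timel"], format="%H:%M:%S")
end = pd.to_datetime("09:45:00", format="%H:%M:%S")

# resample only spans the observed range, so add an empty row at the end time.
if not (df["timel"] == end).any():
    df = pd.concat([df, pd.DataFrame({"timel": [end]})], ignore_index=True)

filled = (
    df.set_index("timel")
      .resample("1min")
      .first()                   # empty minutes become NaN rows
      .reset_index()
      .reindex(columns=df.columns)
)

# Forward-fill the identifying columns, then drop the placeholder date.
id_cols = ["ric", "datel"]
filled[id_cols] = filled[id_cols].ffill()
filled["timel"] = filled["timel"].dt.time
```

The appended row only stretches the resampled range to 09:45:00. Its val stays NaN, which matches the expected output in the question.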

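Answer 1: (score: 0)

After generating the dates with date_range, you can split them with a function similar to the one below.

The returned values can then be assigned to df.

from datetime import datetime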

def split_datetime(date_with_time):
    """
    This function will return date and time from datetime input
    """
    date_with_time = date_with_time.split(' ')
    date = date_with_time[0]
    time = date_with_time[1].split('.')[0]
    return date, time

# Example:
date, time = split_datetime(str(datetime.now()))
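
One way this idea could be applied to the question's data is sketched below. It is an assumption-laden illustration, not the answerer's code: the split is done with the vectorized strftime method of the DatetimeIndex instead of the string-based split_datetime helper, the ric value is hard-coded from the sample, and the names grid, skeleton and filled are made up for the example.

```python
import pandas as pd

# Sample data copied from the question (date and time kept as strings).
df = pd.DataFrame({
    "ric": ["xyz", "xyz", "xyz"],
    "datel": ["2017-01-01", "2017-01-01", "2017-01-01"],
    "timel": ["09:00:00", "09:04:00", "09:37:00"],
    "val": [2, 5, 6],
})

# Full one-minute grid as real datetimes.
grid = pd.date_range("2017-01-01 09:00:00", "2017-01-01 09:45:00", freq="1min")

# Split every timestamp back into separate date and time strings.
skeleton = pd.DataFrame({
    "ric": "xyz",  # assumed constant, as in the sample data
    "datel": grid.strftime("%Y-%m-%d"),
    "timel": grid.strftime("%H:%M:%S"),
})

# A left merge keeps every grid row; minutes without a match get NaN in val.
filled = skeleton.merge(df, on=["ric", "datel", "timel"], how="left")
```

The datetime values exist only while the grid is built. The date and time columns of df itself are never combined, which is the constraint stated in the question.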