Extending a pandas DataFrame with a datetime index

Time: 2018-04-06 04:11:20

Tags: python-3.x pandas

I have a pandas DataFrame df that looks like this:

1/25/2001   1364.3  1367.35 1354.63
1/24/2001   1360.4  1369.75 1357.28
1/23/2001   1342.9  1362.9  1339.63

I would like to expand it to res:

1/26/2001   NaN     NaN     NaN
1/25/2001   1364.3  1367.35 1354.63
1/24/2001   1360.4  1369.75 1357.28
1/23/2001   1342.9  1362.9  1339.63
1/22/2001   NaN     NaN     NaN

I tried the following:

import numpy as np
import pandas as pd

df = pd.read_csv(fi, header=None, sep=',')
print (df)
index = np.arange(np.datetime64('2001-01-22'), np.datetime64('2001-01-27'))
print (index)
res = df.reindex(index).iloc[::-1]
print (res)

              0   1   2   3
2001-01-26  NaN NaN NaN NaN
2001-01-25  NaN NaN NaN NaN
2001-01-24  NaN NaN NaN NaN
2001-01-23  NaN NaN NaN NaN
2001-01-22  NaN NaN NaN NaN


res = pd.DataFrame(df, index=index)
print (res)

This prints the same all-NaN output as above. How can I get the expected res?

3 answers:

Answer 0: (score: 1)

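You can try reindex. The dates must be the index (as a DatetimeIndex) for the labels to match, and the stop value of np.arange is exclusive, so it has to be 2001-01-27 to include 2001-01-26:

index = np.arange(np.datetime64('2001-01-22'), np.datetime64('2001-01-27'))

df=df.reindex(index).iloc[::-1]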
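
The all-NaN result in the question and the fix can be reproduced with a minimal sketch. It assumes the file contains the three rows shown in the question; `csv_text` is a hypothetical stand-in for the file `fi`, and `pd.date_range` is used here in place of `np.arange` to build the same five days.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file `fi` from the question.
csv_text = """1/25/2001,1364.3,1367.35,1354.63
1/24/2001,1360.4,1369.75,1357.28
1/23/2001,1342.9,1362.9,1339.63
"""

# Target dates, 2001-01-22 through 2001-01-26 inclusive.
index = pd.date_range("2001-01-22", "2001-01-26", freq="D")

# As in the question: the dates stay in column 0 and the index is 0, 1, 2.
raw = pd.read_csv(io.StringIO(csv_text), header=None)
all_nan = raw.reindex(index)  # no row label matches a date, so every cell is NaN

# Fix: parse column 0 as dates and make it the index before reindexing.
df = raw.copy()
df[0] = pd.to_datetime(df[0], format="%m/%d/%Y")
df = df.set_index(0)
res = df.reindex(index).iloc[::-1]  # reverse so the newest date comes first
print(res)
```

reindex aligns on index labels, so it can only find the existing rows once the dates themselves are the index.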

Answer 1: (score: 1)

This should work:

In [89]: df
Out[89]: 
           0       1        2        3
0 2001-01-25  1364.3  1367.35  1354.63
1 2001-01-24  1360.4  1369.75  1357.28
2 2001-01-23  1342.9  1362.90  1339.63

In [90]: df[0] = pd.to_datetime(df[0])
In [91]: index = np.arange(np.datetime64('2001-01-22'), np.datetime64('2001-01-27'))

In [92]: index
Out[92]: array(['2001-01-22', '2001-01-23', '2001-01-24', '2001-01-25', '2001-01-26'], dtype='datetime64[D]')

(The session skips a step here: the date column is made the index, e.g. with df = df.set_index(0), before the output below.)

In [106]: df
Out[106]: 
                 1        2        3
0                                   
2001-01-25  1364.3  1367.35  1354.63
2001-01-24  1360.4  1369.75  1357.28
2001-01-23  1342.9  1362.90  1339.63

In [107]: df.reindex(index)
Out[107]: 
                 1        2        3
0                                   
2001-01-22     NaN      NaN      NaN
2001-01-23  1342.9  1362.90  1339.63
2001-01-24  1360.4  1369.75  1357.28
2001-01-25  1364.3  1367.35  1354.63
2001-01-26     NaN      NaN      NaN

Or the hard way:

In [94]: pd.concat([df,pd.Series(index)]).drop_duplicates(0).sort_values(0)
Out[94]: 
           0       1        2        3
0 2001-01-22     NaN      NaN      NaN
2 2001-01-23  1342.9  1362.90  1339.63
1 2001-01-24  1360.4  1369.75  1357.28
0 2001-01-25  1364.3  1367.35  1354.63
4 2001-01-26     NaN      NaN      NaN
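
A self-contained sketch of the concat approach, for readers who want to run it. It assumes the sample data from Out[89]; the name `all_dates` is illustrative, and the dates are wrapped in a one-column DataFrame (rather than a bare Series) so the shared column label 0 is explicit.

```python
import pandas as pd

# Sample frame matching Out[89], with column 0 already converted to datetimes.
df = pd.DataFrame({
    0: pd.to_datetime(["2001-01-25", "2001-01-24", "2001-01-23"]),
    1: [1364.3, 1360.4, 1342.9],
    2: [1367.35, 1369.75, 1362.90],
    3: [1354.63, 1357.28, 1339.63],
})

# One-column frame holding every date that should appear in the result.
all_dates = pd.DataFrame({0: pd.date_range("2001-01-22", "2001-01-26", freq="D")})

res = (
    pd.concat([df, all_dates])            # df rows first, then the bare dates
    .drop_duplicates(subset=0)            # keeps the first occurrence, i.e. the rows with data
    .sort_values(by=0, ascending=False)   # newest date first, as in the question
    .reset_index(drop=True)
)
print(res)
```

The order inside concat matters: drop_duplicates keeps the first occurrence by default, so putting df first preserves the rows that carry values.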

Answer 2: (score: 1)

I think your solution is fine; you only need to convert the index to a DatetimeIndex, either with the parameters parse_dates=True and index_col=[0] in read_csv, or with pd.DatetimeIndex or pd.to_datetime(df.index):

df = pd.read_csv(fi, header=None, parse_dates=True, index_col=[0])
print (df)
                 1        2        3
0                                   
2001-01-25  1364.3  1367.35  1354.63
2001-01-24  1360.4  1369.75  1357.28
2001-01-23  1342.9  1362.90  1339.63

print (df.index)
DatetimeIndex(['2001-01-25', '2001-01-24', '2001-01-23'], dtype='datetime64[ns]', name=0, freq=None)

index = np.arange(np.datetime64('2001-01-22'), np.datetime64('2001-01-27'))

#df.index = pd.DatetimeIndex(df.index)
#alternative
#df.index = pd.to_datetime(df.index)

res = df.reindex(index).iloc[::-1]
print (res)
                 1        2        3
0                                   
2001-01-26     NaN      NaN      NaN
2001-01-25  1364.3  1367.35  1354.63
2001-01-24  1360.4  1369.75  1357.28
2001-01-23  1342.9  1362.90  1339.63
2001-01-22     NaN      NaN      NaN

If you want to extend the date range more dynamically, derive it from the minimum and maximum dates, add a Timedelta, and change the order.
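
The code for the dynamic variant is not present in this copy of the answer. The following is a minimal sketch of the idea, not the answerer's original code: it assumes the frame already has a DatetimeIndex, and the names `one_day` and `index` are illustrative.

```python
import pandas as pd

# Sample frame with the dates already in a DatetimeIndex.
df = pd.DataFrame(
    {
        1: [1364.3, 1360.4, 1342.9],
        2: [1367.35, 1369.75, 1362.90],
        3: [1354.63, 1357.28, 1339.63],
    },
    index=pd.to_datetime(["2001-01-25", "2001-01-24", "2001-01-23"]),
)

# Pad the observed range by one day on each side instead of hard-coding dates.
one_day = pd.Timedelta(days=1)
index = pd.date_range(df.index.min() - one_day, df.index.max() + one_day, freq="D")

# Reindex onto the padded range, then reverse so the newest date comes first.
res = df.reindex(index).iloc[::-1]
print(res)
```

Because the bounds come from df.index.min() and df.index.max(), the same lines work unchanged when the input covers a different date range.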