I have a pandas DataFrame df that looks like this:
1/25/2001 1364.3 1367.35 1354.63
1/24/2001 1360.4 1369.75 1357.28
1/23/2001 1342.9 1362.9 1339.63
I would like to expand it into res:
1/26/2001 NaN NaN NaN
1/25/2001 1364.3 1367.35 1354.63
1/24/2001 1360.4 1369.75 1357.28
1/23/2001 1342.9 1362.9 1339.63
1/22/2001 NaN NaN NaN
I tried the following:
df = pd.read_csv(fi, header=None, sep=',')
print (df)
index = np.arange(np.datetime64('2001-01-22'), np.datetime64('2001-01-27'))
print (index)
res = df.reindex(index).iloc[::-1]
print (res)
0 1 2 3
2001-01-26 NaN NaN NaN NaN
2001-01-25 NaN NaN NaN NaN
2001-01-24 NaN NaN NaN NaN
2001-01-23 NaN NaN NaN NaN
2001-01-22 NaN NaN NaN NaN
res = pd.DataFrame(df, index=index)
print (res)
This prints the same all-NaN output as above. How can I get the expected res?
Answer 0: (score: 1)
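You can try reindex, provided df is already indexed by the dates:
# the stop value of np.arange is exclusive, so use 01-27 to include 01-26
index = np.arange(np.datetime64('2001-01-22'), np.datetime64('2001-01-27'))
df = df.reindex(index).iloc[::-1]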
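The reindex call only matches rows when the frame's index already holds dates. Here is a minimal sketch of why the attempt in the question returns all NaN and how moving the dates into the index changes that. The inline `csv_text` is an assumed stand-in for the file `fi`, and `pd.date_range` is used in place of `np.arange` as a convenience (it includes both endpoints).

```python
import io

import pandas as pd

# Assumed stand-in for the CSV file `fi` from the question.
csv_text = """1/25/2001,1364.3,1367.35,1354.63
1/24/2001,1360.4,1369.75,1357.28
1/23/2001,1342.9,1362.9,1339.63
"""

df = pd.read_csv(io.StringIO(csv_text), header=None)

# Five daily labels, 2001-01-22 through 2001-01-26 (both ends included).
index = pd.date_range("2001-01-22", "2001-01-26", freq="D")

# The default index is 0, 1, 2, so no date label matches: every row is NaN.
all_nan = df.reindex(index)

# Parse the date column, make it the index, then reindex and reverse.
df[0] = pd.to_datetime(df[0], format="%m/%d/%Y")
res = df.set_index(0).reindex(index).iloc[::-1]
print(res)
```

The first reindex looks up date labels in an integer index and finds nothing, which is the all-NaN frame shown in the question. The second works because the labels are compared against actual timestamps.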
Answer 1: (score: 1)
This should work:
In [89]: df
Out[89]:
0 1 2 3
0 2001-01-25 1364.3 1367.35 1354.63
1 2001-01-24 1360.4 1369.75 1357.28
2 2001-01-23 1342.9 1362.90 1339.63
In [90]: df[0] = pd.to_datetime(df[0])
In [91]: index = np.arange(np.datetime64('2001-01-22'), np.datetime64('2001-01-27'))
In [92]: index
Out[92]: array(['2001-01-22', '2001-01-23', '2001-01-24', '2001-01-25', '2001-01-26'], dtype='datetime64[D]')
(The session skips some cells here. The output below shows column 0 already set as the index, presumably via something like df = df.set_index(0).)
In [106]: df
Out[106]:
1 2 3
0
2001-01-25 1364.3 1367.35 1354.63
2001-01-24 1360.4 1369.75 1357.28
2001-01-23 1342.9 1362.90 1339.63
In [107]: df.reindex(index)
Out[107]:
1 2 3
0
2001-01-22 NaN NaN NaN
2001-01-23 1342.9 1362.90 1339.63
2001-01-24 1360.4 1369.75 1357.28
2001-01-25 1364.3 1367.35 1354.63
2001-01-26 NaN NaN NaN
Or the hard way:
In [94]: pd.concat([df,pd.Series(index)]).drop_duplicates(0).sort_values(0)
Out[94]:
0 1 2 3
0 2001-01-22 NaN NaN NaN
2 2001-01-23 1342.9 1362.90 1339.63
1 2001-01-24 1360.4 1369.75 1357.28
0 2001-01-25 1364.3 1367.35 1354.63
4 2001-01-26 NaN NaN NaN
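The "hard way" above can be sketched as a self-contained script. This version is an adaptation rather than the answerer's exact code: it wraps the dates in a one-column DataFrame instead of a Series, sorts in descending order to match the layout the question asks for, and builds `df` inline from the sample values. The name `all_dates` is hypothetical.

```python
import pandas as pd

# Sample data from the question, with the dates in column 0.
df = pd.DataFrame({
    0: pd.to_datetime(["2001-01-25", "2001-01-24", "2001-01-23"]),
    1: [1364.3, 1360.4, 1342.9],
    2: [1367.35, 1369.75, 1362.90],
    3: [1354.63, 1357.28, 1339.63],
})

# One row per wanted date; the other columns become NaN after concat.
all_dates = pd.DataFrame({0: pd.date_range("2001-01-22", "2001-01-26")})

res = (
    pd.concat([df, all_dates], ignore_index=True)
    # df comes first, so keep="first" prefers rows that carry real values.
    .drop_duplicates(subset=0, keep="first")
    .sort_values(by=0, ascending=False)
    .reset_index(drop=True)
)
print(res)
```

Unlike reindex, this keeps the dates as an ordinary column, which may be convenient if the frame is written straight back to CSV.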
Answer 2: (score: 1)
I think your solution is fine; you only need to convert the index to a DatetimeIndex, either with parse_dates=True and index_col=[0] in read_csv, or afterwards with pd.DatetimeIndex or pd.to_datetime(df.index):
df = pd.read_csv(fi, header=None, parse_dates=True, index_col=[0])
print (df)
1 2 3
0
2001-01-25 1364.3 1367.35 1354.63
2001-01-24 1360.4 1369.75 1357.28
2001-01-23 1342.9 1362.90 1339.63
print (df.index)
DatetimeIndex(['2001-01-25', '2001-01-24', '2001-01-23'],
dtype='datetime64[ns]', name=0, freq=None)
index = np.arange(np.datetime64('2001-01-22'), np.datetime64('2001-01-27'))
#df.index = pd.DatetimeIndex(df.index)
#alternative
#df.index = pd.to_datetime(df.index)
res = df.reindex(index).iloc[::-1]
print (res)
1 2 3
0
2001-01-26 NaN NaN NaN
2001-01-25 1364.3 1367.35 1354.63
2001-01-24 1360.4 1369.75 1357.28
2001-01-23 1342.9 1362.90 1339.63
2001-01-22 NaN NaN NaN
If you want to extend the dates more dynamically, add a Timedelta based on the minimum and maximum dates and change the order:
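The code for this last step is not included in the text above, so the following is only a sketch of one way it could look, not the answerer's original. It assumes the goal is to pad the existing range by one day on each side; the one-day offset and the name `one_day` are assumptions, and `df` is rebuilt inline from the sample values.

```python
import pandas as pd

# Sample data from the question, already indexed by date.
df = pd.DataFrame(
    {
        1: [1364.3, 1360.4, 1342.9],
        2: [1367.35, 1369.75, 1362.90],
        3: [1354.63, 1357.28, 1339.63],
    },
    index=pd.to_datetime(["2001-01-25", "2001-01-24", "2001-01-23"]),
)

# Assumed padding: one day before the earliest and after the latest date.
one_day = pd.Timedelta(days=1)
index = pd.date_range(df.index.min() - one_day, df.index.max() + one_day, freq="D")

# Reverse the labels so the newest date comes first, then reindex.
res = df.reindex(index[::-1])
print(res)
```

Deriving the bounds from df.index.min() and df.index.max() means the hard-coded dates can go away, so the same lines work for any input file.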