I have a DataFrame like this:
                          Band1       lat1       lon1
latitude level longitude
41.0     1000  19.50       23.0  41.015335  19.548331
               19.50       44.0  41.015335  19.565497
               19.50       12.0  41.015335  19.582663
               19.75       35.0  41.015335  19.668494
               19.75       83.0  41.015335  19.685660
I want to add the following as an additional level of the MultiIndex (it is of type DatetimeIndex):
DatetimeIndex(['1979-01-01 00:00:00', '1979-01-01 01:00:00',
               '1979-01-01 02:00:00', '1979-01-01 03:00:00',
               '1979-01-01 04:00:00', '1979-01-01 05:00:00',
               '1979-01-01 06:00:00', '1979-01-01 07:00:00',
               '1979-01-01 08:00:00', '1979-01-01 09:00:00',
               ...
               '2019-12-30 15:00:00', '2019-12-30 16:00:00',
               '2019-12-30 17:00:00', '2019-12-30 18:00:00',
               '2019-12-30 19:00:00', '2019-12-30 20:00:00',
               '2019-12-30 21:00:00', '2019-12-30 22:00:00',
               '2019-12-30 23:00:00', '2019-12-31 00:00:00'],
              dtype='datetime64[ns]', length=179305, freq=None)
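I tried the procedure described here, but it loops for hours without producing a result (probably because of the large number of rows, 179305 in this case). The desired result would be: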
                                                  Band1       lat1       lon1
latitude level longitude time
41.0     1000  19.50     '1979-01-01 00:00:00'     23.0  41.015335  19.548331
                         '1979-01-01 01:00:00'     23.0  41.015335  19.548331
                         '1979-01-01 02:00:00'     23.0  41.015335  19.548331
                         '1979-01-01 03:00:00'     23.0  41.015335  19.548331
                         '1979-01-01 04:00:00'     23.0  41.015335  19.548331
                         ...                        ...        ...        ...
               19.60     '1979-01-01 00:00:00'     44.0  41.015335  19.565497
                         '1979-01-01 01:00:00'     44.0  41.015335  19.565497
                         '1979-01-01 02:00:00'     44.0  41.015335  19.565497
                         '1979-01-01 03:00:00'     44.0  41.015335  19.565497
                         '1979-01-01 04:00:00'     44.0  41.015335  19.565497
                         ...                       ....
               19.65                               12.0  41.015335  19.582663
               19.75                               35.0  41.015335  19.668494
               19.75                               83.0  41.015335  19.685660
...                                                 ...        ...        ...
46.5     850   23.00                             1280.0  46.491333  23.015891
               23.00                             1390.0  46.491333  23.033057
               23.00                             1508.0  46.491333  23.050223
               23.00                             1519.0  46.491333  23.067389
               23.00                             1544.0  46.491333  23.084556
The main concern is speed, so a for loop is not an option. Any help is appreciated.
Answer 0 (score: 2)
You want the append option of set_index:
import numpy as np
import pandas as pd

# toy data
idx = pd.MultiIndex.from_arrays([list('aabbcc'), list('111111')], names=['x', 'y'])
df = pd.DataFrame(np.arange(18).reshape(-1, 3),
                  index=idx,
                  columns=list('abc'))
times = [11, 22]  # replace with your datetime series

# number of times the `times` sequence must repeat to cover every row
# (assumes len(df) is a multiple of len(times))
multi = len(df.index) // len(times)
df = (df.assign(time=np.tile(times, multi))
        .set_index('time', append=True)
     )
Output:
           a   b   c
x y time
a 1 11     0   1   2
    22     3   4   5
b 1 11     6   7   8
    22     9  10  11
c 1 11    12  13  14
    22    15  16  17
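
Note that the snippet above assigns one time value to each existing row; it does not create new rows. The desired output in the question appears to repeat every row once per timestamp instead. A minimal sketch of that variant follows, assuming the index may contain duplicate entries (as in the question) and that the result fits in memory, since it has len(df) * len(times) rows. The helper name add_time_level and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd


def add_time_level(df, times, name="time"):
    """Repeat every row of df once per entry in times and append
    the times as a new innermost index level (no Python-level loop)."""
    n_rows, n_times = len(df), len(times)
    # Row positions 0,0,...,1,1,...: each original row appears n_times in a row.
    # Positional selection is used so duplicate index entries are handled.
    expanded = df.iloc[np.repeat(np.arange(n_rows), n_times)].copy()
    # Cycle through the full time sequence once per original row.
    expanded[name] = np.tile(np.asarray(times), n_rows)
    return expanded.set_index(name, append=True)


# toy data mirroring the answer above, with a small datetime index
idx = pd.MultiIndex.from_arrays([list("aabbcc"), list("111111")], names=["x", "y"])
toy = pd.DataFrame(np.arange(18).reshape(-1, 3), index=idx, columns=list("abc"))
toy_times = pd.to_datetime(["1979-01-01 00:00:00", "1979-01-01 01:00:00"])

result = add_time_level(toy, toy_times)
print(result)
```

With 179305 timestamps the expanded frame grows very quickly, so it may be worth checking len(df) * len(times) against available memory before running this on the real data.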