I am trying to build a MultiIndex for a pandas DataFrame that stores time series data for several individuals.
I thought a good way to do this would be the following:
import pandas as pd

D1 = pd.date_range(start='1/1/2018', periods=2, freq='H')
D2 = pd.date_range(start='3/4/2018', periods=3, freq='H')
l1 = [1, 2]  # the individuals' numbers
l2 = [D1, D2]  # one DatetimeIndex per individual
l = list(zip(l1, l2))
M = pd.MultiIndex.from_tuples(l)  # raises the TypeError quoted below
The desired output is a MultiIndex of the following form:
1  2018-01-01 00:00:00
   2018-01-01 01:00:00
2  2018-03-04 00:00:00
   2018-03-04 01:00:00
   2018-03-04 02:00:00
However, I get TypeError: unhashable type: 'DatetimeIndex'.
Any help would be greatly appreciated.
Answer 0 (score: 2)
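As background, the error can be reproduced in isolation. The sketch below is not part of the original answer: it assumes the failure comes from pandas trying to hash each tuple element as a label, and the helper name is_hashable is made up for illustration. It shows that a whole DatetimeIndex cannot be hashed, while a single Timestamp taken from it can.

```python
import pandas as pd

# A fixed one-hour step is passed as a Timedelta here only to avoid
# depending on a particular spelling of the hourly frequency alias.
idx = pd.date_range(start="2018-01-01", periods=2, freq=pd.Timedelta(hours=1))


def is_hashable(obj):
    """Hypothetical helper: report whether hash(obj) succeeds."""
    try:
        hash(obj)
        return True
    except TypeError:
        return False


whole_index_ok = is_hashable(idx)      # the DatetimeIndex as one object
single_stamp_ok = is_hashable(idx[0])  # one Timestamp from that index

print(whole_index_ok, single_stamp_ok)
```

This suggests that each tuple should carry one scalar label per level rather than a whole index object.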
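For a list of tuples, the solution is to flatten the second zipped value (each DatetimeIndex taken from l2),
so that every tuple pairs one individual with a single timestamp: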
l = [(a,x) for a, b in zip(l1,l2) for x in b]
print(l)
[(1, Timestamp('2018-01-01 00:00:00', freq='H')),
(1, Timestamp('2018-01-01 01:00:00', freq='H')),
(2, Timestamp('2018-03-04 00:00:00', freq='H')),
(2, Timestamp('2018-03-04 01:00:00', freq='H')),
(2, Timestamp('2018-03-04 02:00:00', freq='H'))]
M = pd.MultiIndex.from_tuples(l)
print(M)
MultiIndex(levels=[[1, 2], [2018-01-01 00:00:00, 2018-01-01 01:00:00,
2018-03-04 00:00:00, 2018-03-04 01:00:00,
2018-03-04 02:00:00]],
codes=[[0, 0, 1, 1, 1], [0, 1, 2, 3, 4]])
s = pd.Series(range(5), index=M)
print(s)
1  2018-01-01 00:00:00    0
   2018-01-01 01:00:00    1
2  2018-03-04 00:00:00    2
   2018-03-04 01:00:00    3
   2018-03-04 02:00:00    4
dtype: int64
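An alternative that avoids building Python tuples is to repeat each individual's number once per timestamp and hand both arrays to pd.MultiIndex.from_arrays. This is only a sketch and not part of the original answer: the level names 'individual' and 'time' are made up for illustration, and the hourly step is written as a Timedelta instead of the 'H' alias used above.

```python
import numpy as np
import pandas as pd

step = pd.Timedelta(hours=1)  # same hourly spacing as in the question
D1 = pd.date_range(start="2018-01-01", periods=2, freq=step)
D2 = pd.date_range(start="2018-03-04", periods=3, freq=step)
l1 = [1, 2]      # the individuals' numbers
l2 = [D1, D2]    # one DatetimeIndex per individual

# Repeat each individual's number once for every timestamp it owns.
ids = np.repeat(l1, [len(d) for d in l2])

# Concatenate the per-individual indexes into one flat DatetimeIndex.
times = l2[0].append(l2[1:])

# The level names are illustrative assumptions, not taken from the question.
M = pd.MultiIndex.from_arrays([ids, times], names=["individual", "time"])

s = pd.Series(range(len(M)), index=M)
print(s)
```

With many individuals or long series this keeps the work inside NumPy and pandas rather than a Python-level loop over every timestamp.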