I want to expand a DataFrame's rows so that they are repeated once for each item in a list:
import pandas as pd, numpy as np
s = ['01NOV2017', '02NOV2017']
df = pd.DataFrame(np.random.randn(6,4), columns=list('ABCD'), index=range(6))
so that this...
A B C D
0 -1.451528 -1.665262 1.425986 -0.032988
1 1.376609 -0.337819 -0.513632 -0.595584
2 0.520186 -0.019358 -0.403923 0.713807
3 0.553661 0.682552 1.312556 0.966446
4 1.269042 2.034769 0.574845 0.846175
5 0.007470 1.434704 0.173193 0.895777
...becomes:
Date A B C D
01Nov2017 0 -1.451528 -1.665262 1.425986 -0.032988
01Nov2017 1 1.376609 -0.337819 -0.513632 -0.595584
01Nov2017 2 0.520186 -0.019358 -0.403923 0.713807
01Nov2017 3 0.553661 0.682552 1.312556 0.966446
01Nov2017 4 1.269042 2.034769 0.574845 0.846175
01Nov2017 5 0.007470 1.434704 0.173193 0.895777
02Nov2017 0 -1.451528 -1.665262 1.425986 -0.032988
02Nov2017 1 1.376609 -0.337819 -0.513632 -0.595584
...
How can this be done?
Answer 0 (score: 3)
Use concat, passing the list as keys:
df = pd.concat([df] * len(s), keys=s)
print (df)
A B C D
01NOV2017 0 1.130177 -0.888353 0.316773 -0.434137
1 1.629171 1.947267 -0.415701 -0.620040
2 -0.629012 1.357567 -1.966725 0.480601
3 -2.154263 -1.185177 0.261690 0.188716
4 2.117664 0.416418 0.339006 -0.643895
5 1.933276 0.282515 0.859852 -0.448571
02NOV2017 0 1.130177 -0.888353 0.316773 -0.434137
1 1.629171 1.947267 -0.415701 -0.620040
2 -0.629012 1.357567 -1.966725 0.480601
3 -2.154263 -1.185177 0.261690 0.188716
4 2.117664 0.416418 0.339006 -0.643895
5 1.933276 0.282515 0.859852 -0.448571
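Because `keys=s` labels each copy, the result carries a two-level MultiIndex (date, original row label). A minimal sketch, assuming you want the date as an ordinary column while keeping the original row labels as the index; the names `stacked` and `out` are illustrative and not from the original answer:

```python
import numpy as np
import pandas as pd

s = ['01NOV2017', '02NOV2017']
df = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'), index=range(6))

# keys=s adds an outer index level holding the date for each copy of df
stacked = pd.concat([df] * len(s), keys=s)

# move only the outer (unnamed) level into a column; pandas calls it 'level_0'
out = stacked.reset_index(level=0).rename(columns={'level_0': 'Date'})
print(out)
```

The remaining index is the original 0..5 row label, repeated once per date, which is close to the layout shown in the question.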
Edit (starting from the original df, to get Date as a regular column):
df1 = pd.concat([df] * len(s), ignore_index=True)
df1.insert(0, 'Date', np.repeat(s, len(df)))
print (df1)
Date A B C D
0 01NOV2017 -0.489019 1.076954 -0.616073 1.271138
1 01NOV2017 0.758143 0.009106 -1.115460 -0.355548
2 01NOV2017 -0.025088 -0.147855 -0.303579 2.120897
3 01NOV2017 -0.898241 -0.231282 1.100928 -1.519086
4 01NOV2017 0.078057 -0.145468 -0.092385 -0.824499
5 01NOV2017 0.512102 -2.443919 -0.932585 0.088907
6 02NOV2017 -0.489019 1.076954 -0.616073 1.271138
7 02NOV2017 0.758143 0.009106 -1.115460 -0.355548
8 02NOV2017 -0.025088 -0.147855 -0.303579 2.120897
9 02NOV2017 -0.898241 -0.231282 1.100928 -1.519086
10 02NOV2017 0.078057 -0.145468 -0.092385 -0.824499
11 02NOV2017 0.512102 -2.443919 -0.932585 0.088907
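The `Date` column above relies on `np.repeat` repeating each label in a consecutive block, which matches the block order produced by `pd.concat`. A small sketch of the difference from `np.tile`, which would interleave the labels instead (the variable names `blocked` and `interleaved` are illustrative):

```python
import numpy as np

s = ['01NOV2017', '02NOV2017']

# repeat: each element is repeated consecutively (block order)
blocked = np.repeat(s, 3)
# tile: the whole list is repeated end to end (interleaved order)
interleaved = np.tile(s, 3)

print(blocked)
print(interleaved)
```

Using `np.tile` here would pair the dates with the wrong rows, so `np.repeat` is the appropriate choice for this layout.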
Answer 1 (score: 0)
For anyone interested, here is an alternative approach using a cross join:
# define temp key for cross join
dates = pd.DataFrame({'Date': s, 'tmp_key': 1})  # scalar key broadcasts to len(s)
df['tmp_key']=1
# get index as column
df.reset_index(inplace=True)
#merge
df=df.merge(dates, how='outer', on='tmp_key')
df.drop(labels='tmp_key', axis=1, inplace=True)
df.set_index(keys='Date', drop=True, inplace=True)
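On pandas 1.2 or later, `merge` accepts `how='cross'`, which makes the temporary key unnecessary. A minimal sketch under that version assumption; the name `crossed` is illustrative, and putting `dates` on the left is a choice made here so that rows come out grouped by date:

```python
import numpy as np
import pandas as pd

s = ['01NOV2017', '02NOV2017']
df = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'), index=range(6))

dates = pd.DataFrame({'Date': s})

# how='cross' (assumes pandas >= 1.2) pairs every date with every row of df;
# reset_index() keeps the original row labels as an 'index' column
crossed = dates.merge(df.reset_index(), how='cross').set_index('Date')
print(crossed)
```

Compared with the temporary-key version, this leaves the original `df` unmodified because nothing is done in place.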