I have a data frame like the following:
    A Country  price1  A Country  price2  B Country  price1  B Country  price2  C Country  price1
0    19-12-04     0.0   19-12-05     1.7   19-12-05     2.6   19-12-06     3.2   19-12-05     0.1
1    19-12-03     1.5   19-12-04     1.7   19-12-04     2.6   19-12-05     3.2   19-12-04     0.1
2    19-12-02     1.5   19-12-03     1.7   19-12-03     2.6   19-12-04     3.1   19-12-03     0.1
3    19-12-01     1.5   19-12-02     1.8   19-12-02     2.7   19-12-03     3.2   19-12-02     0.1
4    19-11-29     1.5   19-12-01     1.7   19-11-29     2.6   19-12-02     3.2   19-12-01     0.1
5    19-11-28     1.6   19-11-29     1.7   19-11-28     2.6   19-11-29     3.1   19-11-29     0.1
6    19-11-27     1.6   19-11-28     1.7   19-11-27     2.6   19-11-28     3.2   19-11-28     0.1
7    19-11-26     1.6   19-11-27     1.7   19-11-26     2.6   19-11-27     3.2   19-11-27     0.2
8    19-11-25     1.5   19-11-26     1.7   19-11-25     2.6   19-11-26     3.2   19-11-26     0.2
9    19-11-24     1.5   19-11-25     1.7   19-11-22     2.6   19-11-25     3.2   19-11-25     0.2
10   19-11-22     1.5   19-11-24     1.7   19-11-21     2.6   19-11-22     3.1   19-11-24     0.2
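Each "Country" column holds a different set of dates. I want to match the values by date and rearrange them so that every row refers to a single date, marking the blanks with "?". The result I want looks like this: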
    A Country  price1  A Country  price2  B Country  price1  B Country  price2  C Country  price1
0    19-12-06       ?   19-12-06       ?   19-12-06       ?   19-12-06     3.2   19-12-06       ?
1    19-12-05       ?   19-12-05     1.7   19-12-05     2.6   19-12-05     3.2   19-12-05     0.1
2    19-12-04     0.0   19-12-04     1.7   19-12-04     2.6   19-12-04     3.1   19-12-04     0.1
3    19-12-03     1.5   19-12-03     1.7   19-12-03     2.6   19-12-03     3.2   19-12-03     0.1
4    19-12-02     1.5   19-12-02     1.8   19-12-02     2.7   19-12-02     3.2   19-12-02     0.1
5    19-12-01     1.5   19-12-01     1.7   19-12-01       ?   19-12-01       ?   19-12-01     0.1
6    19-11-29     1.5   19-11-29     1.7   19-11-29     2.6   19-11-29     3.1   19-11-29     0.1
7    19-11-28     1.6   19-11-28     1.7   19-11-28     2.6   19-11-28     3.2   19-11-28     0.1
8    19-11-27     1.6   19-11-27     1.7   19-11-27     2.6   19-11-27     3.2   19-11-27     0.2
9    19-11-26     1.6   19-11-26     1.7   19-11-26     2.6   19-11-26     3.2   19-11-26     0.2
10   19-11-25     1.5   19-11-25     1.7   19-11-25     2.6   19-11-25     3.2   19-11-25     0.2
11   19-11-24     1.5   19-11-24     1.7   19-11-24       ?   19-11-24       ?   19-11-24     0.2
12   19-11-23       ?   19-11-23       ?   19-11-23       ?   19-11-23       ?   19-11-23       ?
13   19-11-22     1.5   19-11-22       ?   19-11-22     2.6   19-11-22     3.1   19-11-22       ?
14   19-11-21       ?   19-11-21       ?   19-11-21     2.6   19-11-21       ?   19-11-21       ?
Sorry, I'm new to coding. The column names are not important to me, so an alternative result I would also accept is:
    A Country  price1  price2  price1  price2  price1
0    19-12-06       ?       ?       ?     3.2       ?
1    19-12-05       ?     1.7     2.6     3.2     0.1
2    19-12-04     0.0     1.7     2.6     3.1     0.1
3    19-12-03     1.5     1.7     2.6     3.2     0.1
4    19-12-02     1.5     1.8     2.7     3.2     0.1
5    19-12-01     1.5     1.7       ?       ?     0.1
6    19-11-29     1.5     1.7     2.6     3.1     0.1
7    19-11-28     1.6     1.7     2.6     3.2     0.1
8    19-11-27     1.6     1.7     2.6     3.2     0.2
9    19-11-26     1.6     1.7     2.6     3.2     0.2
10   19-11-25     1.5     1.7     2.6     3.2     0.2
11   19-11-24     1.5     1.7       ?       ?     0.2
12   19-11-23       ?       ?       ?       ?       ?
13   19-11-22     1.5       ?     2.6     3.1       ?
14   19-11-21       ?       ?     2.6       ?       ?
How can I achieve this?
Answer 0 (score: 1)
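The idea is to zip the columns at even positions (the dates) with the columns at odd positions (the prices). A list comprehension then builds one Series per pair, indexed by that pair's date column. Finally, concat joins the Series side by side on the union of their dates, and the index is converted to a DatetimeIndex so that it sorts chronologically: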
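import pandas as pd

# Date columns sit at even positions, price columns at odd positions.
a = df.columns[::2]
b = df.columns[1::2]
# One Series of prices per (date, price) pair, indexed by that pair's dates.
dfs = [df[[d, p]].set_index(d)[p] for d, p in zip(a, b)]
# Align all Series on the union of their dates and mark the gaps with '?'.
df = pd.concat(dfs, axis=1, sort=False).fillna('?')
df.index = pd.to_datetime(df.index, format='%y-%m-%d')
df = df.sort_index()
print(df)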
            price1  price2  price1.1  price2.1  price1.2
2019-11-21       ?       ?       2.6         ?         ?
2019-11-22     1.5       ?       2.6       3.1         ?
2019-11-24     1.5     1.7         ?         ?       0.2
2019-11-25     1.5     1.7       2.6       3.2       0.2
2019-11-26     1.6     1.7       2.6       3.2       0.2
2019-11-27     1.6     1.7       2.6       3.2       0.2
2019-11-28     1.6     1.7       2.6       3.2       0.1
2019-11-29     1.5     1.7       2.6       3.1       0.1
2019-12-01     1.5     1.7         ?         ?       0.1
2019-12-02     1.5     1.8       2.7       3.2       0.1
2019-12-03     1.5     1.7       2.6       3.2       0.1
2019-12-04       0     1.7       2.6       3.1       0.1
2019-12-05       ?     1.7       2.6       3.2       0.1
2019-12-06       ?       ?       ?       3.2         ?
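The output above is sorted oldest first and carries a DatetimeIndex, whereas the question asks for newest first with a single date column in the yy-mm-dd format. A minimal sketch of that variant follows. It assumes the column names are unique (for example after pandas has renamed duplicates to "A Country.1"); the helper name align_by_date, the "date" column label and the four-column sample frame are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical sample with two (date, price) pairs, in the question's layout.
raw = pd.DataFrame({
    "A Country": ["19-12-04", "19-12-03", "19-12-02"],
    "price1": [0.0, 1.5, 1.5],
    "A Country.1": ["19-12-05", "19-12-04", "19-12-03"],
    "price2": [1.7, 1.7, 1.7],
})

def align_by_date(frame, fill="?"):
    """Align every (date, price) column pair on one shared date column."""
    date_cols = frame.columns[::2]    # even positions: dates
    price_cols = frame.columns[1::2]  # odd positions: prices
    # One price Series per pair, indexed by that pair's own dates.
    series = [frame[[d, p]].set_index(d)[p] for d, p in zip(date_cols, price_cols)]
    # Outer-join the Series on the union of all dates.
    out = pd.concat(series, axis=1, sort=False)
    # Parse the dates so sorting is chronological, then sort newest first.
    out.index = pd.to_datetime(out.index, format="%y-%m-%d")
    out = out.sort_index(ascending=False)
    # Restore the original yy-mm-dd text and expose the dates as a column.
    out.index = out.index.strftime("%y-%m-%d")
    return out.fillna(fill).rename_axis("date").reset_index()

result = align_by_date(raw)
print(result)
```

Only dates that occur in at least one date column end up in the result. A day missing from every column, such as 19-11-23 in the desired output, would need an extra step, for example reindexing against a pd.date_range before the fillna call.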