Matching and rearranging values in a pandas DataFrame

Time: 2019-12-09 12:19:44

Tags: pandas dataframe

I have a DataFrame like this:

    A Country  price1 A Country  price2 B Country  price1 B Country  price2 C Country  price1
0   19-12-04   0.0    19-12-05   1.7    19-12-05   2.6    19-12-06   3.2    19-12-05   0.1
1   19-12-03   1.5    19-12-04   1.7    19-12-04   2.6    19-12-05   3.2    19-12-04   0.1 
2   19-12-02   1.5    19-12-03   1.7    19-12-03   2.6    19-12-04   3.1    19-12-03   0.1
3   19-12-01   1.5    19-12-02   1.8    19-12-02   2.7    19-12-03   3.2    19-12-02   0.1
4   19-11-29   1.5    19-12-01   1.7    19-11-29   2.6    19-12-02   3.2    19-12-01   0.1
5   19-11-28   1.6    19-11-29   1.7    19-11-28   2.6    19-11-29   3.1    19-11-29   0.1
6   19-11-27   1.6    19-11-28   1.7    19-11-27   2.6    19-11-28   3.2    19-11-28   0.1
7   19-11-26   1.6    19-11-27   1.7    19-11-26   2.6    19-11-27   3.2    19-11-27   0.2
8   19-11-25   1.5    19-11-26   1.7    19-11-25   2.6    19-11-26   3.2    19-11-26   0.2
9   19-11-24   1.5    19-11-25   1.7    19-11-22   2.6    19-11-25   3.2    19-11-25   0.2
10  19-11-22   1.5    19-11-24   1.7    19-11-21   2.6    19-11-22   3.1    19-11-24   0.2

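Each "Country" column holds a different set of dates. I want to match and rearrange the values by date, and put "?" wherever a value is missing. The result I want looks like this: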

    A Country  price1 A Country  price2 B Country  price1 B Country  price2 C Country  price1
0   19-12-06   ?      19-12-06   ?      19-12-06   ?      19-12-06   3.2    19-12-06   ?  
1   19-12-05   ?      19-12-05   1.7    19-12-05   2.6    19-12-05   3.2    19-12-05   0.1
2   19-12-04   0.0    19-12-04   1.7    19-12-04   2.6    19-12-04   3.1    19-12-04   0.1
3   19-12-03   1.5    19-12-03   1.7    19-12-03   2.6    19-12-03   3.2    19-12-03   0.1 
4   19-12-02   1.5    19-12-02   1.8    19-12-02   2.7    19-12-02   3.2    19-12-02   0.1
5   19-12-01   1.5    19-12-01   1.7    19-12-01   ?      19-12-01   ?      19-12-01   0.1
6   19-11-29   1.5    19-11-29   1.7    19-11-29   2.6    19-11-29   3.1    19-11-29   0.1
7   19-11-28   1.6    19-11-28   1.7    19-11-28   2.6    19-11-28   3.2    19-11-28   0.1
8   19-11-27   1.6    19-11-27   1.7    19-11-27   2.6    19-11-27   3.2    19-11-27   0.2
9   19-11-26   1.6    19-11-26   1.7    19-11-26   2.6    19-11-26   3.2    19-11-26   0.2
10  19-11-25   1.5    19-11-25   1.7    19-11-25   2.6    19-11-25   3.2    19-11-25   0.2
11  19-11-24   1.5    19-11-24   1.7    19-11-24   ?      19-11-24   ?      19-11-24   0.2
12  19-11-23   ?      19-11-23   ?      19-11-23   ?      19-11-23   ?      19-11-23   ?
13  19-11-22   1.5    19-11-22   ?      19-11-22   2.6    19-11-22   3.1    19-11-22   ?
14  19-11-21   ?      19-11-21   ?      19-11-21   2.6    19-11-21   ?      19-11-21   ?

Sorry, I'm new to coding. The column names don't matter to me, so an alternative result I'd accept is:

    A Country  price1 price2 price1 price2 price1
0   19-12-06   ?      ?      ?      3.2    ?  
1   19-12-05   ?      1.7    2.6    3.2    0.1
2   19-12-04   0.0    1.7    2.6    3.1    0.1
3   19-12-03   1.5    1.7    2.6    3.2    0.1 
4   19-12-02   1.5    1.8    2.7    3.2    0.1
5   19-12-01   1.5    1.7    ?      ?      0.1
6   19-11-29   1.5    1.7    2.6    3.1    0.1
7   19-11-28   1.6    1.7    2.6    3.2    0.1
8   19-11-27   1.6    1.7    2.6    3.2    0.2
9   19-11-26   1.6    1.7    2.6    3.2    0.2
10  19-11-25   1.5    1.7    2.6    3.2    0.2
11  19-11-24   1.5    1.7    ?      ?      0.2
12  19-11-23   ?      ?      ?      ?      ?
13  19-11-22   1.5    ?      2.6    3.1    ?
14  19-11-21   ?      ?      2.6    ?      ?

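How can I achieve this?

1 answer:

Answer 0: (score: 1)

The idea is to zip the even-positioned (date) columns with the odd-positioned (price) columns, and in a list comprehension turn each pair into a Series indexed by its date column. Then join the Series with concat, which aligns them on the dates, and finally convert the index to a DatetimeIndex.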

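import pandas as pd

# Date columns sit at even positions, price columns at odd positions
a = df.columns[::2]
b = df.columns[1::2]
# One Series per (date, price) pair, indexed by that pair's date column
dfs = [df.set_index(d)[p] for d, p in zip(a, b)]
# Align on the union of dates, then mark the gaps with '?'
df = pd.concat(dfs, axis=1, sort=False).fillna('?')
df.index = pd.to_datetime(df.index, format='%y-%m-%d')
df = df.sort_index()
print(df)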

           price1 price2 price1.1 price2.1 price1.2
2019-11-21      ?      ?      2.6        ?        ?
2019-11-22    1.5      ?      2.6      3.1        ?
2019-11-24    1.5    1.7        ?        ?      0.2
2019-11-25    1.5    1.7      2.6      3.2      0.2
2019-11-26    1.6    1.7      2.6      3.2      0.2
2019-11-27    1.6    1.7      2.6      3.2      0.2
2019-11-28    1.6    1.7      2.6      3.2      0.1
2019-11-29    1.5    1.7      2.6      3.1      0.1
2019-12-01    1.5    1.7        ?        ?      0.1
2019-12-02    1.5    1.8      2.7      3.2      0.1
2019-12-03    1.5    1.7      2.6      3.2      0.1
2019-12-04      0    1.7      2.6      3.1      0.1
2019-12-05      ?    1.7      2.6      3.2      0.1
2019-12-06      ?      ?        ?      3.2        ?
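
A self-contained sketch of the same approach, for anyone who wants to run it without the original data. The three-row frame below is made up for illustration. Its suffixed column names (`A Country.1`, `price1.1`) assume the frame was read from a file, where pandas renames duplicate headers. The names `aligned` and `result` are hypothetical. The sketch also sorts newest-first and converts the index back to `yy-mm-dd` strings, which is closer to the layout requested in the question.

```python
import pandas as pd

# Made-up sample in the same layout: (date, price) column pairs.
# Column names are assumed to be unique, e.g. after pandas has
# suffixed duplicate headers while reading a file.
df = pd.DataFrame({
    "A Country":   ["19-12-04", "19-12-03", "19-12-02"],
    "price1":      [0.0, 1.5, 1.5],
    "A Country.1": ["19-12-05", "19-12-04", "19-12-03"],
    "price2":      [1.7, 1.7, 1.7],
    "B Country":   ["19-12-05", "19-12-04", "19-12-02"],
    "price1.1":    [2.6, 2.6, 2.7],
})

date_cols = df.columns[::2]    # even positions: dates
price_cols = df.columns[1::2]  # odd positions: prices

# One Series per pair, indexed by that pair's own dates.
series = [df.set_index(d)[p] for d, p in zip(date_cols, price_cols)]

# concat aligns the Series on the union of all dates (missing -> NaN).
aligned = pd.concat(series, axis=1)

# Parse the dates so the sort is chronological, newest first.
aligned.index = pd.to_datetime(aligned.index, format="%y-%m-%d")
aligned = aligned.sort_index(ascending=False)

# Back to the original string format, kept as an ordinary column.
aligned.index = aligned.index.strftime("%y-%m-%d")
aligned.index.name = "date"

# Fill last: '?' turns the numeric columns into object dtype.
result = aligned.fillna("?").reset_index()
print(result)
```

In this sample the 19-12-05 row shows `?` under `price1`, because the first pair has no entry for that date. A date that appears in none of the columns, such as 19-11-23 in the requested output, will not show up on its own; calling `reindex` with the wanted dates before `fillna` should add it.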