Pandas - Merging values from multiple columns

Time: 2017-07-20 19:29:54

Tags: python pandas dataframe

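I have a DataFrame whose values ended up split across several columns when I created it. I want to merge the 2nd through 4th columns (labelled 1, 2 and 3) into one.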

fish_frame:                         0       1       2         3  \
0                   735-8     NaN     NaN       NaN   
1                     NaN     NaN     NaN  LIVE WGT   
2                 GBE COD     NaN     NaN       600   
3                 GBW COD     NaN  11,189       NaN   
4                 GOM COD     NaN       0       NaN   
5                 POLLOCK     NaN     NaN     1,103   
6                   WHAKE     NaN     NaN        12   
7             GBE HADDOCK     NaN  10,730       NaN   
8             GBW HADDOCK     NaN  64,147       NaN   
9             GOM HADDOCK     NaN       0       NaN   
10                REDFISH     NaN     NaN         0   
11         WITCH FLOUNDER     NaN     370       NaN   
12                 PLAICE     NaN     NaN       622   
13     GB WINTER FLOUNDER  54,315     NaN       NaN   
14    GOM WINTER FLOUNDER     653     NaN       NaN   
15  SNEMA WINTER FLOUNDER  14,601     NaN       NaN   
16          GB YELLOWTAIL     NaN   1,663       NaN   
17       SNEMA YELLOWTAIL     NaN   1,370       NaN   
18       CCGOM YELLOWTAIL   1,812     NaN       NaN   

                            4      6        package_deal_column Package_Price  
0                         NaN    NaN  Package Deal - $40,753.69           nan  
1                         NaN  TOTAL  Package Deal - $40,753.69           nan  
2                         NaN    NaN  Package Deal - $40,753.69          None  
3                         NaN    NaN  Package Deal - $40,753.69          None  
4   Package Deal - $40,753.69   None  Package Deal - $40,753.69          None  
5                         NaN    NaN  Package Deal - $40,753.69          None  
6                         NaN    NaN  Package Deal - $40,753.69          None  
7                         NaN    NaN  Package Deal - $40,753.69          None  
8                         NaN    NaN  Package Deal - $40,753.69          None  
9                         NaN    NaN  Package Deal - $40,753.69          None  
10                        NaN    NaN  Package Deal - $40,753.69          None  
11                        NaN    NaN  Package Deal - $40,753.69          None  
12                        NaN    NaN  Package Deal - $40,753.69          None  
13                        NaN   None  Package Deal - $40,753.69          None  
14                        NaN   None  Package Deal - $40,753.69          None  
15                        NaN   None  Package Deal - $40,753.69          None  
16                        NaN    NaN  Package Deal - $40,753.69          None  
17                        NaN    NaN  Package Deal - $40,753.69          None  
18                        NaN   None  Package Deal - $40,753.69          None

As you can see, the weight values are spread across multiple columns.

Following the approach in this answer (pandas merge columns in same dataframe), I tried:

pd.DataFrame({'Column 2': pd.concat([fish_frame[1], fish_frame[2], fish_frame[3]])}).sort_index()

But this did not merge them. I'm not sure whether I used the command incorrectly or whether my problem is more specific.

Also, although I didn't expect it to solve the problem, I did try:

fish_frame = fish_frame.dropna(axis=1, how='all')

but it did not change the df.

Any help with this would be greatly appreciated.

Here is what fish_frame = fish_frame.set_index(0).stack() produced:

fish_frame: 735-8                  package_deal_column    Package Deal - $40,753.69
                       Package_Price                                nan
NaN                    3                                       LIVE WGT
                       6                                          TOTAL
                       package_deal_column    Package Deal - $40,753.69
                       Package_Price                                nan
GBE COD                3                                            600
                       package_deal_column    Package Deal - $40,753.69
GBW COD                2                                         11,189
                       package_deal_column    Package Deal - $40,753.69
GOM COD                2                                              0
                       4                      Package Deal - $40,753.69
                       package_deal_column    Package Deal - $40,753.69
POLLOCK                3                                          1,103
                       package_deal_column    Package Deal - $40,753.69
WHAKE                  3                                             12
                       package_deal_column    Package Deal - $40,753.69
GBE HADDOCK            2                                         10,730
                       package_deal_column    Package Deal - $40,753.69
GBW HADDOCK            2                                         64,147
                       package_deal_column    Package Deal - $40,753.69
GOM HADDOCK            2                                              0
                       package_deal_column    Package Deal - $40,753.69
REDFISH                3                                              0
                       package_deal_column    Package Deal - $40,753.69
WITCH FLOUNDER         2                                            370
                       package_deal_column    Package Deal - $40,753.69
PLAICE                 3                                            622
                       package_deal_column    Package Deal - $40,753.69
GB WINTER FLOUNDER     1                                         54,315
                       package_deal_column    Package Deal - $40,753.69
GOM WINTER FLOUNDER    1                                            653
                       package_deal_column    Package Deal - $40,753.69
SNEMA WINTER FLOUNDER  1                                         14,601
                       package_deal_column    Package Deal - $40,753.69
GB YELLOWTAIL          2                                          1,663
                       package_deal_column    Package Deal - $40,753.69
SNEMA YELLOWTAIL       2                                          1,370
                       package_deal_column    Package Deal - $40,753.69
CCGOM YELLOWTAIL       1                                          1,812
                       package_deal_column    Package Deal - $40,753.69

1 Answer:

Answer 0: (score: 0)

Use set_index and stack

Which call you need depends on whether your column labels are integers or str:

df.set_index('0').stack()

OR

df.set_index(0).stack()

Output:

0                       
NaN                    3    LIVE WGT
GBE COD                3         600
GBW COD                2      11,189
GOM COD                2           0
POLLOCK                3       1,103
WHAKE                  3          12
GBE HADDOCK            2      10,730
GBW HADDOCK            2      64,147
GOM HADDOCK            2           0
REDFISH                3           0
WITCH FLOUNDER         2         370
PLAICE                 3         622
GB WINTER FLOUNDER     1      54,315
GOM WINTER FLOUNDER    1         653
SNEMA WINTER FLOUNDER  1      14,601
GB YELLOWTAIL          2       1,663
SNEMA YELLOWTAIL       2       1,370
CCGOM YELLOWTAIL       1       1,812
dtype: object
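
The output above assumes the frame holds only columns 0 to 3. In the question's frame, stacking every column also pulls in 4, 6, package_deal_column and Package_Price as extra rows. A minimal sketch of one way around that, assuming integer column labels and a small made-up stand-in for fish_frame, is to select the label column and the three weight columns before stacking. The explicit dropna() is there because whether stack() drops missing values by default differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Toy stand-in for fish_frame (values are illustrative): one weight per row,
# scattered across columns 1-3, plus an unrelated extra column.
fish_frame = pd.DataFrame({
    0: ["GBE COD", "GBW COD", "GB WINTER FLOUNDER"],
    1: [np.nan, np.nan, "54,315"],
    2: [np.nan, "11,189", np.nan],
    3: ["600", np.nan, np.nan],
    "package_deal_column": ["Package Deal"] * 3,
})

# Keep only the label column and the three weight columns before stacking,
# so unrelated columns do not show up as extra rows in the result.
weights = (
    fish_frame[[0, 1, 2, 3]]
    .set_index(0)
    .stack()
    .dropna()                         # do not rely on stack() dropping NaN
    .reset_index(level=1, drop=True)  # discard the original column label
    .rename("weights")
)
print(weights)
```

Each species keeps exactly one row here because every row has a single non-null value among columns 1 to 3; a row with two values would produce two entries.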

To go further (renaming the values column and dropping the leftover level):

df.set_index('0').stack().reset_index(name='weights').drop('level_1',axis=1)

Output:

                        0   weights
0                     NaN  LIVE WGT
1                 GBE COD       600
2                 GBW COD    11,189
3                 GOM COD         0
4                 POLLOCK     1,103
5                   WHAKE        12
6             GBE HADDOCK    10,730
7             GBW HADDOCK    64,147
8             GOM HADDOCK         0
9                 REDFISH         0
10         WITCH FLOUNDER       370
11                 PLAICE       622
12     GB WINTER FLOUNDER    54,315
13    GOM WINTER FLOUNDER       653
14  SNEMA WINTER FLOUNDER    14,601
15          GB YELLOWTAIL     1,663
16       SNEMA YELLOWTAIL     1,370
17       CCGOM YELLOWTAIL     1,812
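
Note that the stacked result drops rows with no weight at all (such as 735-8). If the merged column needs to stay aligned with the original rows, a different technique than the one in this answer can be used: back-filling along each row and taking the first column. The sketch below assumes integer column labels and uses a small made-up frame; the names merged and result are illustrative only.

```python
import numpy as np
import pandas as pd

# Toy stand-in for fish_frame, including a row with no weight at all.
fish_frame = pd.DataFrame({
    0: ["735-8", "GBE COD", "GBW COD", "GB WINTER FLOUNDER"],
    1: [np.nan, np.nan, np.nan, "54,315"],
    2: [np.nan, np.nan, "11,189", np.nan],
    3: [np.nan, "600", np.nan, np.nan],
})

# Back-fill along each row so that the first of columns 1-3 ends up holding
# the row's first non-null value, then keep that column as the merged result.
merged = fish_frame[[1, 2, 3]].bfill(axis=1).iloc[:, 0]

# Attach the merged values next to the label column; row order is unchanged.
result = fish_frame[[0]].assign(weights=merged)
print(result)
```

Rows without any value in columns 1 to 3 stay in the result with a missing weight, which can be filtered afterwards with dropna(subset=['weights']) if they are not wanted.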