Flattening columns in pandas DataFrames and getting back the original version

Time: 2017-04-10 19:29:57

Tags: python pandas numpy dataframe

How can I flatten two columns in a pandas DataFrame?

For example:

Task 1:

company-asset  company-debt   wealth  
    GOLD          SILVER      2000.0
    BRONZE        IRON        4000.0 
    IRON          GOLD        1500.0
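
For anyone who wants to reproduce this, the frame above could be built roughly as follows. This is a minimal sketch: the variable name `df` is an assumption, and `wealth` is assumed to be a float column.

```python
import pandas as pd

# Example data from the table above; the name `df` is an assumption.
df = pd.DataFrame({
    "company-asset": ["GOLD", "BRONZE", "IRON"],
    "company-debt": ["SILVER", "IRON", "GOLD"],
    "wealth": [2000.0, 4000.0, 1500.0],
})
```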

Now I want the following (assets count as positive, debts as negative):

GOLD   SILVER   BRONZE   IRON
500    -2000    4000     -2500

Task 2:

Now I want to get the rows of the original DataFrame that involve a metal
whose value in the second DataFrame is greater than -1000 and less than +1000.
In the example above only GOLD qualifies, so the result is this DataFrame:

company-asset  company-debt   wealth  
    GOLD          SILVER      2000.0
    IRON          GOLD        1500.0

2 answers:

Answer 0 (score: 4)

Try this:

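# stack the two company columns into one 'metal' column,
# keeping wealth and the name of the source column ('type')
s = (df.set_index('wealth').stack()
       .rename('metal')
       .rename_axis(('wealth', 'type'))
       .reset_index()
       # keep wealth positive for asset rows, negate it for debt rows
       .pipe(lambda l: l.assign(wealth=l.wealth.where(l.type.str.endswith('asset'), 
                                                      -l.wealth)))
       # net wealth per metal
       .groupby('metal').wealth.sum())

s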
#metal
#BRONZE    4000.0
#GOLD       500.0
#IRON     -2500.0
#SILVER   -2000.0
#Name: wealth, dtype: float64

metals = s[(s > -1000) & (s < 1000)].index
df[df['company-asset'].isin(metals) | df['company-debt'].isin(metals)]

# company-asset   company-debt  wealth
#0         GOLD         SILVER  2000.0
#2         IRON           GOLD  1500.0
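
The same net totals can also be reached with `melt` instead of `stack`, which some readers may find easier to follow. This is only a sketch under assumptions: the names `long`, `sign`, `net` and `wide` are made up for illustration, and the example frame is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Example frame from the question, rebuilt so the snippet is self-contained.
df = pd.DataFrame({
    "company-asset": ["GOLD", "BRONZE", "IRON"],
    "company-debt": ["SILVER", "IRON", "GOLD"],
    "wealth": [2000.0, 4000.0, 1500.0],
})

# Long format: one row per (wealth, source column, metal).
long = df.melt(id_vars="wealth", var_name="type", value_name="metal")

# +1 for asset rows, -1 for debt rows.
sign = long["type"].map({"company-asset": 1, "company-debt": -1})

# Net wealth per metal.
net = (long["wealth"] * sign).groupby(long["metal"]).sum()

# One-row frame with the column order used in the question.
wide = net.reindex(["GOLD", "SILVER", "BRONZE", "IRON"]).to_frame().T
```

Here `net` plays the role of `s` above, and `wide` is the single-row layout that Task 1 asks for.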

Answer 1 (score: 1)

I'm not sure what your first question is asking.

Here is an answer to the second question:

import numpy as np
import pandas as pd

# np.array stores the mixed values as strings, so wealth is converted back to float below
dd = np.array([['GOLD', 'SILVER', 2000.0], ['BRONZE', 'IRON', 4000.0], ['IRON', 'GOLD', 1500.0]])
col = ['company-asset', 'company-debt', 'wealth']
a = pd.DataFrame(data=dd, columns=col)
a['wealth'] = a['wealth'].astype(float)
# filter directly on the wealth column
a[(a['wealth'] > -1000) & (a['wealth'] < 4000)]

This is the output:

Out[1]: 
  company-asset company-debt  wealth
0          GOLD       SILVER  2000.0
2          IRON         GOLD  1500.0
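
Note that the filter above works on the raw `wealth` column. If the intent of Task 2 is to filter on the net total per metal instead, something along these lines might work. It is a rough sketch: `rows_with_net_in_range` and its `low`/`high` parameters are hypothetical names, not part of pandas.

```python
import pandas as pd

def rows_with_net_in_range(df, low=-1000, high=1000):
    """Keep rows that mention a metal whose net wealth lies strictly between low and high."""
    # Total wealth held as an asset and owed as a debt, per metal.
    assets = df.groupby("company-asset")["wealth"].sum()
    debts = df.groupby("company-debt")["wealth"].sum()
    # Net = assets - debts; metals missing on one side count as 0 there.
    net = assets.sub(debts, fill_value=0)
    metals = net[(net > low) & (net < high)].index
    mask = df["company-asset"].isin(metals) | df["company-debt"].isin(metals)
    return df[mask]

# Example frame from the question.
df = pd.DataFrame({
    "company-asset": ["GOLD", "BRONZE", "IRON"],
    "company-debt": ["SILVER", "IRON", "GOLD"],
    "wealth": [2000.0, 4000.0, 1500.0],
})

result = rows_with_net_in_range(df)
```

With the example data only GOLD has a net value inside the range, so `result` keeps the two rows that mention GOLD.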