Selecting, ordering, and renaming columns in pandas

Time: 2019-10-14 05:27:01

Tags: python pandas dplyr

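I'm trying to find the pandas equivalent of R's select function. There is a link that covers the basics, but it gives no guidance on what I want to do!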

import pandas as pd

raw_data = {'patient': [1, 1, 1, 2, 2],
            'obs': [1, 2, 3, 1, 2],
            'treatment': [0, 1, 0, 1, 0],
            'score': ['strong', 'weak', 'normal', 'weak', 'strong']}
df = pd.DataFrame(raw_data, columns=['patient', 'obs', 'treatment', 'score'])


df.rename(columns = {'treatment':'treat'},inplace=True)


df = df.loc[:, ['treat','score','obs']]

Out[89]: 
   treat   score  obs
0      0  strong    1
1      1    weak    2
2      0  normal    3
3      1    weak    1
4      0  strong    2

With R's dplyr we can simply write:

select(df, treat = treatment, score, obs)  # that's it

How can I select, order, and rename columns in pandas with a single line of code?

2 answers:

Answer 0: (score: 1)

pandas has no single method that both selects and renames columns. You have to combine the two steps, much as in your solution:

df = df.rename(columns = {'treatment':'treat'})[['treat','score','obs']]
# alternative: select first, then rename
# df = df[['treatment','score','obs']].rename(columns = {'treatment':'treat'})
print(df)
   treat   score  obs
0      0  strong    1
1      1    weak    2
2      0  normal    3
3      1    weak    1
4      0  strong    2
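
If you need this often, the two steps can be wrapped in a small helper so the call site stays a one-liner. The sketch below is only an illustration: `select_rename` is a hypothetical name, not a pandas function, and it assumes Python 3.7+ so that the dict keeps its insertion order. The dict maps each original column name to its output name, and the key order sets the column order.

```python
import pandas as pd


def select_rename(df, spec):
    """Hypothetical helper (not part of pandas).

    spec maps original column name -> output column name.
    Key order determines the order of the returned columns.
    """
    # Select the requested columns in order, then rename them.
    return df.loc[:, list(spec)].rename(columns=spec)


raw_data = {'patient': [1, 1, 1, 2, 2],
            'obs': [1, 2, 3, 1, 2],
            'treatment': [0, 1, 0, 1, 0],
            'score': ['strong', 'weak', 'normal', 'weak', 'strong']}
df = pd.DataFrame(raw_data)

# Select, order, and rename in one call; df itself is left unchanged.
result = select_rename(df, {'treatment': 'treat', 'score': 'score', 'obs': 'obs'})
print(result)
```

Columns that keep their name still have to be listed (for example `'score': 'score'`), because the dict doubles as the selection list.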

Answer 1: (score: 0)

You can now do this the dplyr way in Python:

>>> from datar.all import f, tibble, select
>>> 
>>> raw_data = tibble(
...     patient=[1, 1, 1, 2, 2],
...     obs=[1, 2, 3, 1, 2],
...     treatment=[0, 1, 0, 1, 0],
...     score=['strong', 'weak', 'normal', 'weak', 'strong']
... )
>>> # In Python, keyword arguments have to come last,
>>> # so we specify score=f.score to keep the name unchanged
>>> select(raw_data, treat=f.treatment, score=f.score, obs=f.obs)
    treat    score     obs
  <int64> <object> <int64>
0       0   strong       1
1       1     weak       2
2       0   normal       3
3       1     weak       1
4       0   strong       2

I am the author of the datar package. If you have any questions, feel free to submit an issue.