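I'm trying to find the pandas equivalent of R's select function. There is a link covering the basics, but it doesn't give any guidance on what I want to do!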
import pandas as pd

raw_data = {'patient': [1, 1, 1, 2, 2],
            'obs': [1, 2, 3, 1, 2],
            'treatment': [0, 1, 0, 1, 0],
            'score': ['strong', 'weak', 'normal', 'weak', 'strong']}
df = pd.DataFrame(raw_data, columns=['patient', 'obs', 'treatment', 'score'])
# Step 1: rename the column in place
df.rename(columns={'treatment': 'treat'}, inplace=True)
# Step 2: select and reorder the columns
df = df.loc[:, ['treat', 'score', 'obs']]
Out[89]:
treat score obs
0 0 strong 1
1 1 weak 2
2 0 normal 3
3 1 weak 1
4 0 strong 2
With R's dplyr we can write select(df, treat=treatment, score, obs) and that's it.
How can I select, reorder, and rename columns in pandas with just one line of code?
Answer 0 (score: 1)
pandas has no single method that both selects and renames columns, so you have to chain two calls, much like in your own solution:
df = df.rename(columns = {'treatment':'treat'})[['treat','score','obs']]
# alternative: select first, then rename
# df = df[['treatment', 'score', 'obs']].rename(columns={'treatment': 'treat'})
print(df)
treat score obs
0 0 strong 1
1 1 weak 2
2 0 normal 3
3 1 weak 1
4 0 strong 2
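If you want something closer to the dplyr call, the two chained steps can be wrapped in a small helper. This is only a sketch: select_as is a hypothetical name of my own, not part of pandas, and it assumes every value passed as a keyword argument is an existing column name.

```python
import pandas as pd


def select_as(df, **new_to_old):
    """Hypothetical helper (not a pandas API): select, reorder and rename.

    Each keyword is the new column name and each value is the existing
    column name, e.g. select_as(df, treat='treatment').
    """
    old_names = list(new_to_old.values())
    # Only rename the columns whose name actually changes
    renames = {old: new for new, old in new_to_old.items() if old != new}
    # Selecting with a list both filters and reorders the columns
    return df[old_names].rename(columns=renames)


raw_data = {'patient': [1, 1, 1, 2, 2],
            'obs': [1, 2, 3, 1, 2],
            'treatment': [0, 1, 0, 1, 0],
            'score': ['strong', 'weak', 'normal', 'weak', 'strong']}
df = pd.DataFrame(raw_data)

result = select_as(df, treat='treatment', score='score', obs='obs')
print(result)
```

Keyword arguments keep their order in Python 3.7+, so the columns come out in the order you list them. The original df is left untouched because neither step is done in place.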
Answer 1 (score: 0)
You can now do this in Python the dplyr way:
>>> from datar.all import f, tibble, select
>>>
>>> raw_data = tibble(
... patient=[1, 1, 1, 2, 2],
... obs=[1, 2, 3, 1, 2],
... treatment=[0, 1, 0, 1, 0],
... score=['strong', 'weak', 'normal', 'weak', 'strong']
... )
>>> # In python, keyword arguments have to come last
>>> # we specify score=f.score to keep the name unchanged
>>> select(raw_data, treat=f.treatment, score=f.score, obs=f.obs)
treat score obs
<int64> <object> <int64>
0 0 strong 1
1 1 weak 2
2 0 normal 3
3 1 weak 1
4 0 strong 2
I am the author of the datar package. If you have any questions, feel free to submit an issue.