Split a data frame from a given column onward while keeping the first two columns

Time: 2015-08-24 08:44:34

Tags: python pandas

I have the following data frame:

import numpy as np
import pandas as pd
df = pd.DataFrame({'Probe' : ['a', 'b', 'c', 'd','e'],
                 'Gene' : ['one', 'two','three','four','five'],
                 'X' : np.random.randn(5), 'Y' : np.random.randn(5)})

It looks like this:

In [20]: df
Out[20]:
    Gene Probe         X         Y
0    one     a  0.104504  1.089442
1    two     b  0.030071  0.696786
2  three     c  1.224704  1.077867
3   four     d -0.052333  0.034292
4   five     e -0.283872  0.602743

What I want to do is split the data frame from column X onward, keeping the first and second columns in each piece, to produce:

    Gene Probe         X
0    one     a  0.104504
1    two     b  0.030071
2  three     c  1.224704
3   four     d -0.052333
4   five     e -0.283872

    Gene Probe         Y
0    one     a  1.089442
1    two     b  0.696786
2  three     c  1.077867
3   four     d  0.034292
4   five     e  0.602743

I tried this, but it did not give me what I expected:

for dfs in df.groupby(['Probe','Gene']):
    print(dfs)

What is the right way to do this?

2 answers:

Answer 0: (score: 1)

This would be a start:

df_x = df.loc[:, ['Gene', 'Probe', 'X']]
df_y = df.loc[:, ['Gene', 'Probe', 'Y']]
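
If there are more value columns than just X and Y, the same selection can be done in a loop. The following is only a sketch, assuming the first two columns are the ones to keep; the helper name `split_by_value_columns` and its `n_keys` parameter are made up for illustration and are not part of pandas or of the answer above.

```python
import numpy as np
import pandas as pd

def split_by_value_columns(df, n_keys=2):
    """Hypothetical helper: return {column name: sub-frame}, where each
    sub-frame holds the first n_keys columns plus one remaining column."""
    key_cols = list(df.columns[:n_keys])
    value_cols = list(df.columns[n_keys:])
    return {col: df[key_cols + [col]] for col in value_cols}

df = pd.DataFrame({'Gene': ['one', 'two', 'three', 'four', 'five'],
                   'Probe': ['a', 'b', 'c', 'd', 'e'],
                   'X': np.random.randn(5),
                   'Y': np.random.randn(5)})

parts = split_by_value_columns(df)
print(parts['X'])
print(parts['Y'])
```

Each entry of `parts` is an ordinary data frame, so `parts['X']` plays the role of `df_x` and `parts['Y']` the role of `df_y`.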

Answer 1: (score: 1)

You can use difference to select columns by dropping the ones you are not interested in:

In [9]:

X = df[df.columns.difference(['Y'])]
Y = df[df.columns.difference(['X'])]
print(X)
Y
    Gene Probe         X
0    one     a  1.231749
1    two     b  0.519425
2  three     c  0.849960
3   four     d -0.077796
4   five     e  1.224163
Out[9]:
    Gene Probe         Y
0    one     a  0.022695
1    two     b  0.500311
2  three     c -0.163624
3   four     d  0.411491
4   five     e  1.305214
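
One thing to be aware of: by default `Index.difference` returns its result sorted, so the column order of the selection may differ from the original frame. If the original order matters, `DataFrame.drop` is one alternative. A minimal sketch, assuming a reasonably recent pandas (the `columns=` keyword of `drop` is not available in very old releases, where `df.drop('Y', axis=1)` does the same thing); the variable names are illustrative only:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Probe': ['a', 'b', 'c', 'd', 'e'],
                   'Gene': ['one', 'two', 'three', 'four', 'five'],
                   'X': np.random.randn(5),
                   'Y': np.random.randn(5)})

# difference() is a set operation and sorts the remaining labels
sorted_cols = list(df.columns.difference(['Y']))

# drop() removes the named columns and leaves the rest in their original order
x_df = df.drop(columns=['Y'])
y_df = df.drop(columns=['X'])

print(sorted_cols)
print(list(x_df.columns))
print(list(y_df.columns))
```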