How to create a third column based on two other columns

Time: 2017-06-30 11:28:54

Tags: python

My dataframe looks like this.

Key|Direction
:--|-------: 
x  | Sell
x  | Buy
x  | Buy
y  | Sell
y  | Sell
y  | Sell
Z  | Buy
Z  | Buy
a  | Buy
a  | Sell

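What I want to do is create a third column: for all rows that share the same key, the column should say yes if both a Buy and a Sell exist for that key, and no otherwise. I have been playing around with groupby, but I am finding it hard to assign the values back to the dataframe. This is what I would like the final column to look like: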

Key|Direction |Cross
:--|-------   |------
x  | Sell     | yes
x  | Buy      | yes
x  | Buy      | yes
y  | Sell     | no
y  | Sell     | no
y  | Sell     | no
Z  | Buy      | no 
Z  | Buy      | no
a  | Buy      | yes
a  | Sell     | yes

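2 Answers:

Answer 0: (score: 1)

You can use groupby + transform to compare each group's set of values with the target set, then map the result to labels with a dict: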

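# labels for the boolean result
d = {True:'yes', False:'no'}
# transform broadcasts each group's True/False back to every row of that group
df['Cross'] = df.groupby('Key')['Direction'] \
                .transform(lambda x: set(x) == set(['Buy','Sell'])).map(d)
print(df)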
  Key Direction Cross
0   x      Sell   yes
1   x       Buy   yes
2   x       Buy   yes
3   y      Sell    no
4   y      Sell    no
5   y      Sell    no
6   Z       Buy    no
7   Z       Buy    no
8   a       Buy   yes
9   a      Sell   yes
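
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same groupby + transform idea. The DataFrame is rebuilt by hand from the question (with 'Buy' spelled as in the printed output above), and the helper name has_both is just an illustrative choice, not part of pandas.

```python
import pandas as pd

# Sample data rebuilt from the question; 'Buy' is spelled as in the printed output
df = pd.DataFrame({
    'Key': ['x', 'x', 'x', 'y', 'y', 'y', 'Z', 'Z', 'a', 'a'],
    'Direction': ['Sell', 'Buy', 'Buy', 'Sell', 'Sell',
                  'Sell', 'Buy', 'Buy', 'Buy', 'Sell'],
})

def has_both(directions):
    # hypothetical helper: True when the group contains exactly Buy and Sell
    return set(directions) == {'Buy', 'Sell'}

# transform returns one value per original row, so it lines up with df
flags = df.groupby('Key')['Direction'].transform(has_both)
df['Cross'] = flags.map({True: 'yes', False: 'no'})
print(df)
```

Because transform keeps the original index, the result can be assigned straight back to the frame, which is the step the question was struggling with.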

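Another solution: build a Series of sets (one per key), map it onto the Key column with map to get a new column, compare with eq (==), and finally map the result with a dict: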

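d = {True:'yes', False:'no'}
# one set of directions per key
s = df.groupby('Key')['Direction'].apply(set)
# look up each row's set by its key, then compare it with the target set
df['Cross'] = df['Key'].map(s).eq(set(['Buy','Sell'])).map(d)
print(df)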
  Key Direction Cross
0   x      Sell   yes
1   x       Buy   yes
2   x       Buy   yes
3   y      Sell    no
4   y      Sell    no
5   y      Sell    no
6   Z       Buy    no
7   Z       Buy    no
8   a       Buy   yes
9   a      Sell   yes
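
To make the intermediate steps of the map approach visible, here is a small sketch under the same assumptions (hand-built sample data, 'Buy' spelled consistently). It is a slight variant, not the exact code above: the comparison is done once per key with a plain Python lambda, and only the resulting booleans are mapped back onto the rows. The names s and both are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    'Key': ['x', 'x', 'x', 'y', 'y', 'y', 'Z', 'Z', 'a', 'a'],
    'Direction': ['Sell', 'Buy', 'Buy', 'Sell', 'Sell',
                  'Sell', 'Buy', 'Buy', 'Buy', 'Sell'],
})

# One set of directions per key, indexed by Key
s = df.groupby('Key')['Direction'].apply(set)

# Per-key flag: does this key have exactly Buy and Sell?
both = s.apply(lambda directions: directions == {'Buy', 'Sell'})

# Look up each row's flag by its key, then turn it into a label
df['Cross'] = df['Key'].map(both).map({True: 'yes', False: 'no'})
print(df)
```

Doing the comparison at the key level means it runs once per key rather than once per row, which may matter on larger frames.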

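A similar solution using numpy.where:

import numpy as np

s = df.groupby('Key')['Direction'].apply(set)
# np.where picks 'yes' where the condition is True and 'no' elsewhere
df['Cross'] = np.where(df['Key'].map(s).eq(set(['Buy','Sell'])), 'yes', 'no')
print(df)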
  Key Direction Cross
0   x      Sell   yes
1   x       Buy   yes
2   x       Buy   yes
3   y      Sell    no
4   y      Sell    no
5   y      Sell    no
6   Z       Buy    no
7   Z       Buy    no
8   a       Buy   yes
9   a      Sell   yes

Answer 1: (score: 0)

One way is to first use groupby:

df1 = df.groupby('Key',sort=False)['Direction'].apply(', '.join).reset_index()
print(df1)

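Note that you need to set sort to False so the groups keep the order in which the keys first appear; the loop further down fills the new column by position and relies on that order.

df1 looks like: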

  Key         Direction
0   x    Sell, Buy, Buy
1   y  Sell, Sell, Sell
2   Z          Buy, Buy
3   a         Buy, Sell
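
The sort=False remark is easy to check. Here is a quick sketch on hand-built sample data (rebuilt from the question, so treat the frame itself as an assumption) that compares the group order with and without it:

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    'Key': ['x', 'x', 'x', 'y', 'y', 'y', 'Z', 'Z', 'a', 'a'],
    'Direction': ['Sell', 'Buy', 'Buy', 'Sell', 'Sell',
                  'Sell', 'Buy', 'Buy', 'Buy', 'Sell'],
})

# Default behaviour: group keys are sorted, and uppercase 'Z' sorts before lowercase
default_order = list(df.groupby('Key')['Direction'].apply(', '.join).index)

# sort=False: group keys stay in order of first appearance
original_order = list(
    df.groupby('Key', sort=False)['Direction'].apply(', '.join).index
)

print(default_order)   # ['Z', 'a', 'x', 'y']
print(original_order)  # ['x', 'y', 'Z', 'a']
```

With the default sorted order, a column rebuilt by position would put the labels for Z and a on the rows that belong to x and y.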

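Then you just create the new column with the right number of 'yes' or 'no' values, depending on how many rows each key has.

Note that we use split to find out the number of directions for a single key: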

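cross = []
for row in df1.index:
    # .ix has been removed from pandas; .loc does the same label-based lookup
    elem = df1.loc[row, 'Direction']
    # number of rows that belong to this key
    n = len(elem.split(','))
    # Sell and Buy are the string variables defined in the PS at the end
    if Sell in elem and Buy in elem:
        cross.extend(['yes'] * n)
    else:
        cross.extend(['no'] * n)

# assumes df has the default 0..n-1 index and each key's rows are contiguous
df['Cross'] = pd.Series(cross)
print(df)

Output: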

  Key Direction Cross
0   x      Sell   yes
1   x       Buy   yes
2   x       Buy   yes
3   y      Sell    no
4   y      Sell    no
5   y      Sell    no
6   Z       Buy    no
7   Z       Buy    no
8   a       Buy   yes
9   a      Sell   yes

PS: To build the dataframe from your example faster, I added the following two variables, so take that into account:

Sell='Sell'
Buy='Buy'