My DataFrame looks like this:
Key|Direction
:--|-------:
x | Sell
x | Buy
x | BUY
y | Sell
y | Sell
y | Sell
Z | Buy
Z | Buy
a | Buy
a | Sell
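What I want to do is create a third column: for all rows sharing the same key, if both a Buy and a Sell exist for that key, the third column should say yes; otherwise it should just say no. I have been playing with groupby, but I am finding it hard to assign the values back to the DataFrame. This is what I would like the final column to look like: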
Key|Direction |Cross
:--|------- |------
x | Sell | yes
x | Buy | yes
x | BUY | yes
y | Sell | no
y | Sell | no
y | Sell | no
Z | Buy | no
Z | Buy | no
a | Buy | yes
a | Sell | yes
Answer 0 (score: 1)
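You can use groupby + transform to compare each group's set of directions with the target set, then map the boolean result to 'yes'/'no' with a dict: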
d = {True:'yes', False:'no'}
df['Cross'] = df.groupby('Key')['Direction'] \
.transform(lambda x: set(x) == set(['Buy','Sell'])).map(d)
print (df)
Key Direction Cross
0 x Sell yes
1 x Buy yes
2 x Buy yes
3 y Sell no
4 y Sell no
5 y Sell no
6 Z Buy no
7 Z Buy no
8 a Buy yes
9 a Sell yes
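The question's sample data contains an upper-case 'BUY', which an exact set comparison against 'Buy' would not match. Below is a minimal, self-contained sketch of the same groupby + transform idea that first lower-cases the column. It assumes that 'BUY' and 'Buy' are meant to be the same direction; the names `direction` and `has_both` are introduced here for illustration only.

```python
import pandas as pd

# Sample data copied from the question; note the upper-case 'BUY' in row 2.
df = pd.DataFrame({
    'Key': ['x', 'x', 'x', 'y', 'y', 'y', 'Z', 'Z', 'a', 'a'],
    'Direction': ['Sell', 'Buy', 'BUY', 'Sell', 'Sell', 'Sell',
                  'Buy', 'Buy', 'Buy', 'Sell'],
})

# Assumption: case differences are not meaningful, so compare in lower case.
direction = df['Direction'].str.lower()

# For each key, check whether the set of directions is exactly {'buy', 'sell'};
# transform broadcasts the per-group result back to every row of the group.
has_both = direction.groupby(df['Key']).transform(
    lambda x: set(x) == {'buy', 'sell'}
)

df['Cross'] = has_both.map({True: 'yes', False: 'no'})
print(df)
```

If the case difference in the data is intentional, drop the `.str.lower()` step and compare against the original spellings instead.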
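Another solution: create a Series of sets (one per key), map it onto the Key column to get a new Series, compare it with eq (==), and finally map the result with a dict: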
d = {True:'yes', False:'no'}
s = df.groupby('Key')['Direction'].apply(set)
df['Cross'] = df['Key'].map(s).eq(set(['Buy','Sell'])).map(d)
print (df)
Key Direction Cross
0 x Sell yes
1 x Buy yes
2 x Buy yes
3 y Sell no
4 y Sell no
5 y Sell no
6 Z Buy no
7 Z Buy no
8 a Buy yes
9 a Sell yes
A similar solution with numpy.where:
import numpy as np
s = df.groupby('Key')['Direction'].apply(set)
df['Cross'] = np.where(df['Key'].map(s).eq(set(['Buy','Sell'])), 'yes', 'no')
print (df)
Key Direction Cross
0 x Sell yes
1 x Buy yes
2 x Buy yes
3 y Sell no
4 y Sell no
5 y Sell no
6 Z Buy no
7 Z Buy no
8 a Buy yes
9 a Sell yes
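The solutions above test for set equality, so a key that also had some third direction would come out as 'no'. If the intent is only "at least one Buy and at least one Sell", one possible variation is to build two boolean flags and broadcast them per key with `transform('any')`. This is a sketch under that assumption, not part of the original answer; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Key': ['x', 'x', 'x', 'y', 'y', 'y', 'Z', 'Z', 'a', 'a'],
    'Direction': ['Sell', 'Buy', 'Buy', 'Sell', 'Sell', 'Sell',
                  'Buy', 'Buy', 'Buy', 'Sell'],
})

# Row-level flags: is this row a buy / a sell?
is_buy = df['Direction'].eq('Buy')
is_sell = df['Direction'].eq('Sell')

# Broadcast "any buy in this key" and "any sell in this key" to every row.
has_buy = is_buy.groupby(df['Key']).transform('any')
has_sell = is_sell.groupby(df['Key']).transform('any')

# A key crosses when it has at least one of each.
df['Cross'] = np.where(has_buy & has_sell, 'yes', 'no')
print(df)
```

On this sample data the result matches the set-based solutions; the two only differ when a key carries values other than 'Buy' and 'Sell'.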
Answer 1 (score: 0)
One way is to start with groupby:
df1 = df.groupby('Key',sort=False)['Direction'].apply(', '.join).reset_index()
print(df1)
Note that you need to pass sort=False so that the groups stay in their original order of appearance.
df1 looks like:
Key Direction
0 x Sell, Buy, Buy
1 y Sell, Sell, Sell
2 Z Buy, Buy
3 a Buy, Sell
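Then you just need to build the new column with the right number of 'yes' or 'no' entries, depending on how many rows each key has.
Note that we use split to find the number of directions for a single key: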
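cross = []
for row in df1.index:
    # .ix has been removed from pandas; .loc does the same label-based lookup here
    elem = df1.loc[row, 'Direction']
    # Sell and Buy are the string variables defined in the PS below
    if Sell in elem and Buy in elem:
        for i in range(len(elem.split(','))):
            cross.append('yes')
    else:
        for i in range(len(elem.split(','))):
            cross.append('no')
df['Cross'] = pd.Series(cross)
print(df)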
Output:
Key Direction Cross
0 x Sell yes
1 x Buy yes
2 x Buy yes
3 y Sell no
4 y Sell no
5 y Sell no
6 Z Buy no
7 Z Buy no
8 a Buy yes
9 a Sell yes
PS: When I quickly built the DataFrame from your example I also defined these two variables, so keep that in mind:
Sell='Sell'
Buy='Buy'
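The loop in this answer depends on the rows of each key being contiguous and in the same order as the grouped frame. A shorter sketch of the same join-then-check idea, which maps the result back by key instead of by position, might look like the following. The names `joined` and `flag` are made up for this example, and the string literals replace the `Sell` and `Buy` variables.

```python
import pandas as pd

df = pd.DataFrame({
    'Key': ['x', 'x', 'x', 'y', 'y', 'y', 'Z', 'Z', 'a', 'a'],
    'Direction': ['Sell', 'Buy', 'Buy', 'Sell', 'Sell', 'Sell',
                  'Buy', 'Buy', 'Buy', 'Sell'],
})

# One comma-joined string of directions per key, indexed by key.
joined = df.groupby('Key', sort=False)['Direction'].apply(', '.join)

# 'yes' when both words occur in the joined string, otherwise 'no'.
flag = joined.apply(lambda s: 'yes' if 'Sell' in s and 'Buy' in s else 'no')

# Look each row's key up in flag, so the row order no longer matters.
df['Cross'] = df['Key'].map(flag)
print(df)
```

Because the lookup goes through the key, this version should also work when rows of the same key are not adjacent in the DataFrame.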