My df looks like this:
code date type strike settlement
id
1195001 CBT_21_G2012_S 2012-01-04 P 101.50 0.015625
1195093 CBT_21_G2012_S 2012-01-04 C 101.50 28.890625
1194926 CBT_21_G2012_S 2012-01-04 C 102.00 28.390625
1194944 CBT_21_G2012_S 2012-01-04 C 102.50 27.906250
1195109 CBT_21_G2012_S 2012-01-04 P 102.50 0.015625
1194905 CBT_21_G2012_S 2012-01-04 C 103.00 27.406250
1195008 CBT_21_G2012_S 2012-01-04 P 103.50 0.015625
1195123 CBT_21_G2012_S 2012-01-04 C 103.50 26.906250
1194908 CBT_21_G2012_S 2012-01-04 C 104.00 26.390625
1194980 CBT_21_G2012_S 2012-01-04 C 104.50 25.890625
1195025 CBT_21_G2012_S 2012-01-04 P 104.50 0.015625
1194981 CBT_21_G2012_S 2012-01-04 P 105.00 0.015625
1195063 CBT_21_G2012_S 2012-01-04 C 105.00 25.390625
1194960 CBT_21_G2012_S 2012-01-04 C 105.50 24.890625
1195102 CBT_21_G2012_S 2012-01-04 P 105.50 0.015625
1194989 CBT_21_G2012_S 2012-01-04 C 106.00 24.390625
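I need to find the rows where, for the same code, date and strike, only type == 'P' or only type == 'C' exists (that is, the strike has no matching row of the other type).
The desired output should be: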
code date type strike settlement
id
1194926 CBT_21_G2012_S 2012-01-04 C 102.00 28.390625
1194905 CBT_21_G2012_S 2012-01-04 C 103.00 27.406250
1194908 CBT_21_G2012_S 2012-01-04 C 104.00 26.390625
1194989 CBT_21_G2012_S 2012-01-04 C 106.00 24.390625
[EDIT] Also, how can I flip the 'type' values 'C' and 'P' in the resulting df (replace 'C' with 'P' and 'P' with 'C')?
Any help is appreciated.
Thanks.
Answer 0 (score: 1)
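Use transform with nunique to count the distinct types in each (code, date, strike) group, compare the result with 1 using eq (==), and finally filter by boolean indexing: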
# if types other than 'C' and 'P' may exist, filter them out first
#df = df[df['type'].isin(['C','P'])]
df = df[df.groupby(['code', 'date', 'strike'])['type'].transform('nunique').eq(1)]
print (df)
code date type strike settlement
id
1194926 CBT_21_G2012_S 2012-01-04 C 102.0 28.390625
1194905 CBT_21_G2012_S 2012-01-04 C 103.0 27.406250
1194908 CBT_21_G2012_S 2012-01-04 C 104.0 26.390625
1194989 CBT_21_G2012_S 2012-01-04 C 106.0 24.390625
Details:
print (df.groupby(['code', 'date', 'strike'])['type'].transform('nunique'))
id
1195001 2
1195093 2
1194926 1
1194944 2
1195109 2
1194905 1
1195008 2
1195123 2
1194908 1
1194980 2
1195025 2
1194981 2
1195063 2
1194960 2
1195102 2
1194989 1
Name: type, dtype: int64
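For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same filter. The five rows are copied from the question; the variable names n_types and unpaired are my own and not part of the original answer.

```python
import pandas as pd

# A small subset of the question's rows, rebuilt by hand for illustration.
df = pd.DataFrame(
    {
        "code": ["CBT_21_G2012_S"] * 5,
        "date": ["2012-01-04"] * 5,
        "type": ["P", "C", "C", "C", "P"],
        "strike": [101.5, 101.5, 102.0, 102.5, 102.5],
        "settlement": [0.015625, 28.890625, 28.390625, 27.906250, 0.015625],
    },
    index=pd.Index([1195001, 1195093, 1194926, 1194944, 1195109], name="id"),
)

# Count the distinct types within each (code, date, strike) group and
# broadcast that count back onto every row of the group.
n_types = df.groupby(["code", "date", "strike"])["type"].transform("nunique")

# Keep only the rows whose group contains a single type.
# .copy() makes the result an independent frame that is safe to modify later.
unpaired = df[n_types.eq(1)].copy()
print(unpaired)
```

Because transform returns a Series aligned with the original index, the comparison can be used directly as a boolean mask without any merge or join.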
Edit: to swap the values, use map with a dictionary:
df['type'] = df['type'].map({'C':'P', 'P':'C'})
print (df)
code date type strike settlement
id
1194926 CBT_21_G2012_S 2012-01-04 P 102.0 28.390625
1194905 CBT_21_G2012_S 2012-01-04 P 103.0 27.406250
1194908 CBT_21_G2012_S 2012-01-04 P 104.0 26.390625
1194989 CBT_21_G2012_S 2012-01-04 P 106.0 24.390625
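One thing to watch with map: any value that is not a key in the dictionary becomes NaN. The sketch below shows that behaviour and one possible way to keep unmapped values unchanged. The value 'X' and the names types, swap, mapped and safe are hypothetical, added only for illustration.

```python
import pandas as pd

# 'X' is a made-up extra type used to show what happens to unmapped values.
types = pd.Series(["C", "P", "C", "X"], name="type")
swap = {"C": "P", "P": "C"}

# Plain map: values missing from the dictionary turn into NaN.
mapped = types.map(swap)

# Falling back to the original value keeps anything outside the dictionary.
safe = types.map(swap).fillna(types)

print(mapped)
print(safe)
```

If the column is guaranteed to contain only 'C' and 'P', as in the question's data, the plain map call is enough.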