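My dataset looks like the one below. I am trying to subset a pandas DataFrame so that only the responses given by all 3 people are selected. For example, in the DataFrame below, the responses given by all 3 people are "I like to eat" and "You have nice day", so only those rows should be kept. I am not sure how to achieve this in a pandas DataFrame.
Note: I am new to Python, so please explain your code.
Example DataFrame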
import pandas as pd
data = {
    'Person': ['1', '1', '1', '2', '2', '2', '2', '3', '3'],
    'Response': ['I like to eat', 'You have nice day', 'My name is ',
                 'I like to eat', 'You have nice day', 'My name is',
                 'This is it', 'I like to eat', 'You have nice day'],
}
df = pd.DataFrame(data)
print(df)
Output:
  Person           Response
0      1      I like to eat
1      1  You have nice day
2      1        My name is
3      2      I like to eat
4      2  You have nice day
5      2         My name is
6      2         This is it
7      3      I like to eat
8      3  You have nice day
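Answer 0 (score: 1)
IIUC, you can use transform with nunique: for each response, count the distinct people who gave it, and keep the rows where that count equals the total number of distinct people in the DataFrame.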
yourdf=df[df.groupby('Response').Person.transform('nunique')==df.Person.nunique()]
yourdf
Out[463]:
  Person           Response
0      1      I like to eat
1      1  You have nice day
3      2      I like to eat
4      2  You have nice day
7      3      I like to eat
8      3  You have nice day
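Since the question asks for an explanation, here is the same one-liner split into named steps. This is only a sketch of how the expression above can be read; the intermediate names (n_people, people_per_response, mask) are made up for illustration and are not part of the original answer.

```python
import pandas as pd

data = {
    'Person': ['1', '1', '1', '2', '2', '2', '2', '3', '3'],
    'Response': ['I like to eat', 'You have nice day', 'My name is ',
                 'I like to eat', 'You have nice day', 'My name is',
                 'This is it', 'I like to eat', 'You have nice day'],
}
df = pd.DataFrame(data)

# Step 1: how many distinct people exist in the whole DataFrame (3 here).
n_people = df['Person'].nunique()

# Step 2: for every row, how many distinct people gave that row's response.
# transform returns a Series with the same index as df, one value per row,
# so it can be compared row by row and used as a boolean mask.
people_per_response = df.groupby('Response')['Person'].transform('nunique')

# Step 3: keep only the rows whose response was given by every person.
mask = people_per_response == n_people
yourdf = df[mask]
print(yourdf)
```

Note that the sample data contains 'My name is ' (with a trailing space) for person 1 and 'My name is' for person 2, so groupby treats them as two different responses. Neither was given by person 3, so the result is the same either way.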
Method 2: use filter to keep only the response groups that contain every person
df.groupby('Response').filter(lambda x : pd.Series(df['Person'].unique()).isin(x['Person']).all())
Out[467]:
  Person           Response
0      1      I like to eat
1      1  You have nice day
3      2      I like to eat
4      2  You have nice day
7      3      I like to eat
8      3  You have nice day
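The filter version may be easier to follow with the lambda written as a named function. The sketch below swaps the pd.Series(...).isin(...).all() check for a plain Python set comparison, which should select the same rows here; all_people and answered_by_everyone are hypothetical names introduced for this example.

```python
import pandas as pd

data = {
    'Person': ['1', '1', '1', '2', '2', '2', '2', '3', '3'],
    'Response': ['I like to eat', 'You have nice day', 'My name is ',
                 'I like to eat', 'You have nice day', 'My name is',
                 'This is it', 'I like to eat', 'You have nice day'],
}
df = pd.DataFrame(data)

# Every distinct person in the DataFrame: {'1', '2', '3'}.
all_people = set(df['Person'])

def answered_by_everyone(group):
    # group is the sub-DataFrame holding all rows that share one Response.
    # Return True only if every person appears at least once in that group.
    return all_people.issubset(set(group['Person']))

# filter calls the function once per group and keeps the rows of the
# groups for which it returns True, in their original order.
result = df.groupby('Response').filter(answered_by_everyone)
print(result)
```

Unlike transform, which produces one value per row, filter makes one keep-or-drop decision per group, so the function must return a single True or False.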