How do I extract data from a DataFrame based on three columns?

Time: 2019-04-18 02:01:44

Tags: python pandas dataframe

I am trying to find and extract the duplicates from a DataFrame based on three columns.

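I tried converting the three columns to a dictionary, storing their indices, and comparing the heights. Row 4 is not the only row removed.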

df['C']=df[["Color1","Color2","Color3"]].stack().apply(tuple)
df = df.duplicated(subset=["Color1","Color2","Color3"], keep=False)


     Height    Color1    Color2    Color3
0    Short      NaN       Blue      Red
1    High       Red       Blue      NaN
2    Medium     Blue      Red       NaN
3    Short      NaN       NaN       Blue
4    Short      NaN       Red       Blue
5    High       NaN       NaN       NaN

The output of the code should be:

     Height    Color1    Color2    Color3
0    Short      NaN       Blue      Red
1    High       Red       Blue      NaN
2    Medium     Blue       Red      NaN

1 Answer:

Answer 0 (score: 1)

You can use drop_duplicates:

df.drop_duplicates(subset="Height")
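
As a minimal sketch, assuming the sample table from the question is rebuilt by hand (the names `mask` and `result` are illustrative, not from the original post), the call above can be tried as follows. It also shows that `duplicated` returns a boolean Series rather than a filtered DataFrame, which may be why assigning its result back to `df` in the question did not give the expected table.

```python
import numpy as np
import pandas as pd

# Sample data copied by hand from the table in the question.
df = pd.DataFrame({
    "Height": ["Short", "High", "Medium", "Short", "Short", "High"],
    "Color1": [np.nan, "Red", "Blue", np.nan, np.nan, np.nan],
    "Color2": ["Blue", "Blue", "Red", np.nan, "Red", np.nan],
    "Color3": ["Red", np.nan, np.nan, "Blue", "Blue", np.nan],
})

# duplicated() returns a boolean mask with one entry per row.
# No two rows here share the same (Color1, Color2, Color3) values in the
# same positions, so every entry of the mask is False.
mask = df.duplicated(subset=["Color1", "Color2", "Color3"], keep=False)

# drop_duplicates() returns a new DataFrame; by default it keeps the
# first row seen for each distinct value of "Height".
result = df.drop_duplicates(subset="Height")
print(result)
```

With this sample, the first rows for Short, High, and Medium are rows 0, 1, and 2, which matches the expected output in the question. Note that this deduplicates on `Height` only and does not compare the color columns.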