How to remove unique rows from a pandas DataFrame

Time: 2019-02-13 14:19:54

Tags: python pandas

index                                            SUBJECT
1                                                   test
2                                                  Hello
3                                                  Hello
4                               PRC review - phone calls

After removal:

index                                            SUBJECT
2                                                  Hello
3                                                  Hello

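I only want to remove rows based on the "SUBJECT" column, dropping every row whose SUBJECT value appears just once. How can I do this?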

4 answers:

Answer 0: (score: 4)

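Use duplicated with keep=False, which marks every occurrence of a repeated value as True, so indexing with the mask keeps only the repeated rows.

For example: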

import pandas as pd

df = pd.DataFrame({"SUBJECT": ["test", "Hello", "Hello", "PRC review - phone calls"]})
df = df[df.duplicated(subset=["SUBJECT"], keep=False)]
print(df)

Output:

  SUBJECT
1   Hello
2   Hello
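
To see why keep=False matters here, the following sketch compares the three values that duplicated accepts for keep on the same data. The variable names (mask_first, mask_last, mask_all) are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"SUBJECT": ["test", "Hello", "Hello", "PRC review - phone calls"]})

# keep="first" (the default) leaves the first occurrence unmarked
mask_first = df.duplicated(subset=["SUBJECT"], keep="first")

# keep="last" leaves the last occurrence unmarked
mask_last = df.duplicated(subset=["SUBJECT"], keep="last")

# keep=False marks every row whose SUBJECT occurs more than once
mask_all = df.duplicated(subset=["SUBJECT"], keep=False)

print(mask_first.tolist())  # [False, False, True, False]
print(mask_last.tolist())   # [False, True, False, False]
print(mask_all.tolist())    # [False, True, True, False]

# Only the keep=False mask selects both "Hello" rows
print(df[mask_all])
```

With either of the other two settings, one of the "Hello" rows would be dropped along with the unique rows.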

Answer 1: (score: 1)

You can do this:

# get count for each value
s = df.SUBJECT.value_counts()

# get only those that appear more than once
repeated = set(s[s > 1].index.values)

# keep only the rows whose SUBJECT is one of the repeated values
result = df[df.SUBJECT.isin(repeated)]

print(result)

Output:

   index SUBJECT
1      2   Hello
2      3   Hello
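
The same counting idea can also be written as a single boolean mask by mapping each row's SUBJECT to its count. This is only a sketch: it assumes the frame from the question with "index" stored as an ordinary column (as the output above suggests), and the names counts and result are illustrative.

```python
import pandas as pd

# Assumed reconstruction of the question's data, with "index" as an ordinary column
df = pd.DataFrame({
    "index": [1, 2, 3, 4],
    "SUBJECT": ["test", "Hello", "Hello", "PRC review - phone calls"],
})

# Number of occurrences of each SUBJECT value
counts = df["SUBJECT"].value_counts()

# Look up each row's count and keep the rows whose value occurs more than once
result = df[df["SUBJECT"].map(counts) > 1]

print(result)
```

This avoids building the intermediate set, at the cost of one lookup per row.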

Answer 2: (score: 1)

Try this:

df.loc[df.groupby('SUBJECT')['SUBJECT'].transform('count') > 1, :]

Answer 3: (score: 1)

Solution 1:

Use loc ..


Solution 2:

groupby + transform ..
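
Solution 2 names the technique without showing code. One way it could look is sketched here, assuming the single-column frame used in the other answers; group_sizes and result are illustrative names, not the answerer's own code.

```python
import pandas as pd

df = pd.DataFrame({"SUBJECT": ["test", "Hello", "Hello", "PRC review - phone calls"]})

# transform("size") broadcasts each group's row count back onto the original rows
group_sizes = df.groupby("SUBJECT")["SUBJECT"].transform("size")

# Keep only the rows that belong to a group with more than one member
result = df[group_sizes > 1]

print(result)
```

Because the transformed Series is aligned to df's index, the surviving rows keep their original labels.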

Another way of doing it:
>>> df.loc[df.duplicated(keep=False), :]
  SUBJECT
1   Hello
2   Hello