Question

I have a dataset like the one below, and I want to delete the rows in which every value is the same:

[image: screenshot of the dataset]

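I think I could either check the values in each row and delete the row if they are all identical, or specify the row by its timestamp (12:30 in this case), but I don't know how to write the code...

I tried the following to delete just one row, but it failed.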

df.drop ['2020-01-29 12:30']

Can someone point me in the right direction? Thanks in advance!

Answer 1

Hi, let me know if this works for you.

As an example, I created this dataframe:

import pandas as pd

data1={'A':[1,2,3,43],
    'B':[11,22,3,53],
    'C':[21,23,3,433],
    'D':[131,223,3,54]}

df=pd.DataFrame(data1)
df.index.names=['index']
print(df)

DataFrame

        A   B    C    D
index                  
0       1  11   21  131
1       2  22   23  223
2       3   3    3    3
3      43  53  433   54

# Get the index of rows where A + B equals C + D. This sum comparison is only
# a stand-in for "all values in the row are the same"; it also matches rows
# whose values differ but whose sums happen to be equal.
ind = df[df['A'] + df['B'] == df['C'] + df['D']].index

# Drop the matching row (index 2 here) in place.
df.drop(ind, inplace=True)
print(df)

Final output

        A   B    C    D
index                  
0       1  11   21  131
1       2  22   23  223
3      43  53  433   54

Note: the row with index 2 has been removed.
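
The question mentions two ideas: dropping rows whose values are all identical, and dropping a single row by its timestamp. Here is a minimal sketch of both. The data is hypothetical, since the original screenshot is not available, and it assumes the frame has a `DatetimeIndex`. It also shows where the sum comparison above can flag a row whose values are not actually identical.

```python
import pandas as pd

# Hypothetical data standing in for the screenshot in the question:
# a timestamp index and a few numeric columns.
df = pd.DataFrame(
    {"A": [1, 7, 3], "B": [4, 7, 5], "C": [2, 7, 9], "D": [3, 7, 1]},
    index=pd.to_datetime(
        ["2020-01-29 12:00", "2020-01-29 12:30", "2020-01-29 13:00"]
    ),
)

# Option 1: keep only rows that hold more than one distinct value.
# nunique(axis=1) counts distinct values per row (NaN is ignored by default),
# so a count of 1 means every column in that row has the same value.
all_same = df.nunique(axis=1) == 1
by_value = df[~all_same]

# Option 2: drop one row by its index label. drop() is a method, so it needs
# parentheses, and the label has to match the index type (a Timestamp here).
by_label = df.drop(pd.Timestamp("2020-01-29 12:30"))

# For comparison: the sum check also flags the 12:00 row (1 + 4 == 2 + 3),
# even though its four values are all different.
sum_check = df["A"] + df["B"] == df["C"] + df["D"]

print(by_value)
print(by_label)
print(sum_check)
```

Both options return a new frame and leave `df` untouched. Assign the result back (`df = df[~all_same]`) if you want to keep only the filtered rows.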
