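How do I remove duplicate values from a column?
The expected output is attached (in Excel format).
The State column has 4 values of "West Bengal". Only the first one should be displayed.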
Answer 0 (score: 1)
Use loc and shift to detect rows whose value repeats that of the previous row; we can then use that boolean mask to set those rows to blank:
In [52]:
import pandas as pd

# Keys are listed in the order the columns appear in the output below
df = pd.DataFrame({'amount': [14, 25, 36, 47, 58],
                   'state': ['West Bengal', 'West Bengal', 'West Bengal', 'East', 'East']})
df
Out[52]:
   amount        state
0      14  West Bengal
1      25  West Bengal
2      36  West Bengal
3      47         East
4      58         East
In [54]:
# Blank out 'state' wherever it equals the value in the previous row
df.loc[df['state'] == df['state'].shift(), 'state'] = ''
df
Out[54]:
   amount        state
0      14  West Bengal
1      25
2      36
3      47         East
4      58
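
Note that the shift comparison only blanks a value when it repeats the row directly above it. If the same state can reappear further down the frame and should still be blanked, Series.duplicated() is one possible alternative. Here is a minimal sketch contrasting the two. The helper name blank_consecutive_repeats and the six-row sample data are made up for illustration and are not part of the original question:

```python
import pandas as pd


def blank_consecutive_repeats(df, column):
    """Return a copy of df with `column` blanked wherever it repeats the previous row."""
    out = df.copy()
    # shift() moves values down one row, so this is True only for consecutive repeats
    repeats = out[column] == out[column].shift()
    out.loc[repeats, column] = ''
    return out


# Hypothetical data in which 'West Bengal' shows up in two separate runs
df = pd.DataFrame({
    'amount': [14, 25, 36, 47, 58, 69],
    'state': ['West Bengal', 'West Bengal', 'East', 'East',
              'West Bengal', 'West Bengal'],
})

# Blanks only values that repeat the row directly above
consecutive = blank_consecutive_repeats(df, 'state')
print(consecutive['state'].tolist())
# ['West Bengal', '', 'East', '', 'West Bengal', '']

# Blanks every occurrence after the first, wherever it appears
anywhere = df.copy()
anywhere.loc[anywhere['state'].duplicated(), 'state'] = ''
print(anywhere['state'].tolist())
# ['West Bengal', '', 'East', '', '', '']
```

Which one fits depends on whether the data is already sorted or grouped by state. For sorted data the two give the same result.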