How to count duplicate values and drop the duplicates

Time: 2019-01-24 14:37:16

Tags: python-3.x

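I've shared my code below. I want to remove the duplicates in column A, count how many times each value occurred, and add those counts as a new column. The first DataFrame is my input and the second is the result I want. Is this possible?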

df = pd.DataFrame({"A": ["foo", "foo", "foo", "bar"]})


df = pd.DataFrame({"A": ["foo", "bar"], "B": [3, 1]})

1 Answer:

Answer 0 (score: 0)

Although it doesn't use pandas at all, you can do this with Counter from the standard library's collections module:

>>> from collections import Counter
>>> Counter(["foo", "foo", "foo", "bar"])
Counter({'foo': 3, 'bar': 1})
>>> counter = Counter(["foo", "foo", "foo", "bar"])
>>> counter.keys()
dict_keys(['foo', 'bar'])
>>> counter.values()
dict_values([3, 1])

So, for your case:

from collections import Counter
import pandas as pd
counter = Counter(["foo", "foo", "foo", "bar"])
df = pd.DataFrame({"A": list(counter.keys()), "B": list(counter.values())})
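If you would rather stay inside pandas, the same result can probably be had with a groupby. This is only a sketch based on the sample data in the question. The column name "B" comes from the desired output there, and the variable name counts is just an illustration:

```python
import pandas as pd

df = pd.DataFrame({"A": ["foo", "foo", "foo", "bar"]})

# Group identical values in column A and count the rows in each group.
# sort=False keeps the groups in order of first appearance instead of
# sorting them alphabetically.
counts = df.groupby("A", sort=False).size().reset_index(name="B")
print(counts)
```

Here size() returns a Series indexed by the distinct values of A, and reset_index(name="B") turns it back into a two-column DataFrame, so the deduplication and the count happen in one step.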