I have a DataFrame df like this:
Name
Sri
Sri,Ram
Sri,Ram,kumar
Ram
I am trying to get the count of each individual value. When I use df["Name"].value_counts(), I don't get the output I want.
My desired output is:
Sri 3
Ram 3
kumar 1
Answer 0 (score: 4)
split the column, stack it to long format, then count:
df.Name.str.split(',', expand=True).stack().value_counts()
#Sri 3
#Ram 3
#kumar 1
#dtype: int64
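To see what each step of that chain contributes, here is a minimal sketch that runs it one step at a time. The DataFrame construction is an assumption reconstructed from the sample in the question, and the names wide, long_form and counts are only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed: a single string column "Name").
df = pd.DataFrame({"Name": ["Sri", "Sri,Ram", "Sri,Ram,kumar", "Ram"]})

# Step 1: split each string on commas into separate columns.
# Rows with fewer names get missing values in the trailing columns.
wide = df["Name"].str.split(",", expand=True)

# Step 2: stack the columns into one long Series of individual names.
long_form = wide.stack()

# Step 3: count each name; value_counts() ignores missing values by default.
counts = long_form.value_counts()
print(counts)
```

Printing wide and long_form as well makes it easy to check that every name ends up in its own row before counting.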
Or maybe:
df.Name.str.get_dummies(',').sum()
#Ram 3
#Sri 3
#kumar 1
#dtype: int64
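The get_dummies route works through an indicator table, which the following sketch prints before summing. As above, the DataFrame is an assumed reconstruction of the question's sample, and dummies and dummy_counts are illustrative names.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed: a single string column "Name").
df = pd.DataFrame({"Name": ["Sri", "Sri,Ram", "Sri,Ram,kumar", "Ram"]})

# One indicator column per distinct name; a row gets 1 where its string contains that name.
dummies = df["Name"].str.get_dummies(sep=",")
print(dummies)

# Summing each column gives the number of rows that contain the name.
dummy_counts = dummies.sum()
print(dummy_counts)
```

One thing to keep in mind: because the table holds indicators, a name repeated inside a single row (for example "Sri,Sri") is counted once for that row, whereas the split-based approaches count every occurrence.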
Or concatenate before value_counts:
import numpy as np  # pd.np was removed in pandas 2.0, so import numpy directly
pd.value_counts(np.concatenate(df.Name.str.split(',')))
#Sri 3
#Ram 3
#kumar 1
#dtype: int64
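On pandas 0.25 or later there is another way to get the long format that is not part of the options above: Series.explode, which turns each list element into its own row. A short sketch, again assuming the sample data from the question:

```python
import pandas as pd

# Sample data reconstructed from the question (assumed: a single string column "Name").
df = pd.DataFrame({"Name": ["Sri", "Sri,Ram", "Sri,Ram,kumar", "Ram"]})

# Split each string into a list, then give every list element its own row.
exploded = df["Name"].str.split(",").explode()

# Count the individual names.
explode_counts = exploded.value_counts()
print(explode_counts)
```

This avoids building the wide intermediate table that split(expand=True) creates, which may matter when a few rows hold many more names than the rest.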
Timings:
%timeit df.Name.str.split(',', expand=True).stack().value_counts()
#1000 loops, best of 3: 1.02 ms per loop
%timeit df.Name.str.get_dummies(',').sum()
#1000 loops, best of 3: 1.18 ms per loop
%timeit pd.value_counts(np.concatenate(df.Name.str.split(',')))
#1000 loops, best of 3: 573 µs per loop
# option from @Bharathshetty
from collections import Counter
%timeit pd.Series(Counter((df['Name'].str.strip() + ',').sum().rstrip(',').split(',')))
# 1000 loops, best of 3: 498 µs per loop
# option inspired by @Bharathshetty
%timeit pd.value_counts(df.Name.str.cat(sep=',').split(','))
# 1000 loops, best of 3: 483 µs per loop
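The %timeit lines above need IPython. To repeat the comparison in a plain script, a rough sketch with the standard-library timeit module could look like this. The labels, the loop counts and the four-row sample are assumptions, and the numbers it prints will differ from the ones above depending on machine, pandas version and data size.

```python
import timeit

import pandas as pd

# Sample data reconstructed from the question (assumed: a single string column "Name").
df = pd.DataFrame({"Name": ["Sri", "Sri,Ram", "Sri,Ram,kumar", "Ram"]})

# Candidate implementations, keyed by labels chosen here for illustration.
candidates = {
    "split_stack": lambda: df["Name"].str.split(",", expand=True).stack().value_counts(),
    "get_dummies": lambda: df["Name"].str.get_dummies(sep=",").sum(),
    "str_cat": lambda: pd.Series(df["Name"].str.cat(sep=",").split(",")).value_counts(),
}

results = {}
for label, func in candidates.items():
    # Best of 3 repeats of 100 calls each, converted to microseconds per call.
    best = min(timeit.repeat(func, number=100, repeat=3)) / 100
    results[label] = best * 1e6
    print(f"{label}: {results[label]:.0f} µs per call")
```

With only four rows the timings mostly reflect fixed overhead, so it is worth rerunning on data of realistic size before picking an approach.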