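I'm new to Python and pandas, and I have a question that I'll try to lay out logically to help me learn.
I have a DataFrame called party containing the following data: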
(index) name invitees
0 birthday party [mike, peter]
1 Retirement [peter]
2 office opening [simon, mike, peter]
I'd like to create a dictionary that shows each distinct name in the invitees column along with how often it appears, for example:
mike: 2, peter: 3, simon: 1
I tried to find something similar on here, but I'm not sure of the right terminology to search for.
Any help would be appreciated. Many thanks.
Answer 0: (score: 0)
You can solve this with Counter from collections and chain from itertools:
from collections import Counter
from itertools import chain
import pandas as pd
# sample data with a list of names in each row
df2 = pd.DataFrame({
    'name': ["blah", "blah-blah", "waka-waka"],
    'invites': [['mike', 'peter'], ['peter', 'mike'], ['waka', 'peter', 'simon']]
})
# flatten the lists into one stream of names and count them
Counter(chain.from_iterable(df2['invites']))
Counter({'mike': 2, 'peter': 3, 'simon': 1, 'waka': 1})
Answer 1: (score: 0)
Collect the names from the DataFrame, then use Counter:
from collections import Counter
import pandas as pd
# setup test data
data = {'invitees': [['mike', 'peter'], ['peter'], ['simon', 'mike', 'peter']]}
data = pd.DataFrame(data=data)
# select data series
names_lists = data['invitees']
# collect names
all_names = []
for item in names_lists:
    for name in item:
        all_names.append(name)
# count occurrence
summary = Counter(all_names)
Output:
{'peter': 3, 'mike': 2, 'simon': 1}
Answer 2: (score: 0)
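from collections import Counter
import pandas as pd
invitees = [["mike", "peter"], ["peter"], ["simon", "mike", "peter"]]
name = ["birthday party", "Retirement", "office opening"]
new_df = pd.DataFrame(data={"name": name, "invitees": invitees})
# gather every invitee from every row into one flat list
all_invitees = []
for _, row in new_df.iterrows():
    all_invitees.extend(row["invitees"])
# Counter tallies the names; dict() converts it to a plain dictionary
invitees_count = dict(Counter(all_invitees))
# invitees_count is {'mike': 2, 'peter': 3, 'simon': 1}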
Answer 3: (score: 0)
Just for fun, a pure pandas one-liner:
df['invitees'].apply(pd.Series).unstack().reset_index(name='n').drop('level_1', axis=1).dropna().groupby('n').count().to_dict()['level_0']
{'mike': 2, 'peter': 3, 'simon': 1}
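For comparison, a shorter pandas-only route may be worth a look. The sketch below is not from any of the answers above; it assumes pandas 0.25 or later (where `Series.explode` is available) and rebuilds the question's table under the hypothetical name `df`.

```python
import pandas as pd

# Hypothetical reconstruction of the table from the question
df = pd.DataFrame({
    "name": ["birthday party", "Retirement", "office opening"],
    "invitees": [["mike", "peter"], ["peter"], ["simon", "mike", "peter"]],
})

# explode() puts each list element on its own row,
# value_counts() tallies the names, to_dict() returns a plain dict
invitee_counts = df["invitees"].explode().value_counts().to_dict()
print(invitee_counts)
```

The resulting dictionary maps each name to its count (mike 2, peter 3, simon 1); the key order follows `value_counts`, so the most frequent name comes first.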