I have a DataFrame:
category dictionary
moto {'motocycle':10, 'buy':8, 'motocompetition':7}
shopping {'buy':200, 'order':20, 'sale':30}
IT {'iphone':214, 'phone':1053, 'computer':809}
shopping {'zara':23, 'sale':18, 'sell':20}
IT {'lenovo':200, 'iphone':300, 'mac':200}
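I need to group by category, merge the dictionaries within each group, and select the 3 keys with the largest values. The result should be a DataFrame in which the category column holds the unique categories and the data column holds the list of keys.
I know I can use Counter to merge dictionaries, but I don't know how to do that per category.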
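For reference, a minimal sketch of the Counter merge mentioned above, applied to the two shopping dictionaries only. The variable names are illustrative, and note that Counter addition keeps only keys whose summed count is positive.

```python
from collections import Counter

# The two 'shopping' dictionaries from the example data.
first = {'buy': 200, 'order': 20, 'sale': 30}
second = {'zara': 23, 'sale': 18, 'sell': 20}

# Adding Counters sums the values of keys that appear in both dicts.
merged = Counter(first) + Counter(second)

print(merged['sale'])  # 48 (30 + 18)
print(dict(merged))
```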
Desired output:
category data
moto ['motocycle', 'buy', 'motocompetition']
shopping ['buy', 'sale', 'zara']
IT ['phone', 'computer', 'iphone']
Answer 0 (score: 3)
You can use groupby with a custom function, nlargest, and Index.tolist:
import pandas as pd

df = pd.DataFrame({
    'category': ['moto', 'shopping', 'IT', 'shopping', 'IT'],
    'dictionary': [
        {'motocycle': 10, 'buy': 8, 'motocompetition': 7},
        {'buy': 200, 'order': 20, 'sale': 30},
        {'iphone': 214, 'phone': 1053, 'computer': 809},
        {'zara': 23, 'sale': 18, 'sell': 20},
        {'lenovo': 200, 'iphone': 300, 'mac': 200},
    ],
})
print(df)
category dictionary
0 moto {'motocycle': 10, 'buy': 8, 'motocompetition': 7}
1 shopping {'sale': 30, 'buy': 200, 'order': 20}
2 IT {'phone': 1053, 'computer': 809, 'iphone': 214}
3 shopping {'sell': 20, 'zara': 23, 'sale': 18}
4 IT {'lenovo': 200, 'mac': 200, 'iphone': 300}
import collections
import functools
import operator

def f(x):
    # One possible way to sum the values of the dicts in a group:
    # convert each dict to a Counter and add them together.
    # http://stackoverflow.com/a/3491086/2901002
    total = functools.reduce(operator.add, map(collections.Counter, x))
    # Take the 3 largest totals and return their keys as a list.
    return pd.Series(total).nlargest(3).index.tolist()

print(df.groupby('category')['dictionary'].apply(f).reset_index())
category dictionary
0 IT [phone, computer, iphone]
1 moto [motocycle, buy, motocompetition]
2 shopping [buy, sale, zara]
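As a variation on the same idea, here is a minimal sketch that stays with Counter for the ranking step as well, using Counter.most_common instead of building a Series. The helper name top_keys and the output column name data are my own choices (data matches the column name asked for in the question). Ties are broken by insertion order, which does not matter for this sample data.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({
    'category': ['moto', 'shopping', 'IT', 'shopping', 'IT'],
    'dictionary': [
        {'motocycle': 10, 'buy': 8, 'motocompetition': 7},
        {'buy': 200, 'order': 20, 'sale': 30},
        {'iphone': 214, 'phone': 1053, 'computer': 809},
        {'zara': 23, 'sale': 18, 'sell': 20},
        {'lenovo': 200, 'iphone': 300, 'mac': 200},
    ],
})

def top_keys(dicts, n=3):
    """Sum all dicts in one group and return the n keys with the largest totals."""
    total = Counter()
    for d in dicts:
        total.update(d)  # update() adds counts instead of overwriting them
    return [key for key, _ in total.most_common(n)]

# One row per category; the list of keys goes into a column named 'data'.
result = (df.groupby('category')['dictionary']
            .apply(top_keys)
            .reset_index(name='data'))
print(result)
```

Passing name='data' to reset_index renames the aggregated column so the result has the category and data columns from the desired output.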
Answer 1 (score: 1)
df = pd.DataFrame(dict(
    category=['moto', 'shopping', 'IT', 'shopping', 'IT'],
    dictionary=[
        dict(motorcycle=10, buy=8, motocompetition=7),
        dict(buy=200, order=20, sale=30),
        dict(iphone=214, phone=1053, computer=809),
        dict(zara=23, sale=18, sell=20),
        dict(lenovo=200, iphone=300, mac=200),
    ]))

def top3(x):
    return x.dropna().sort_values().tail(3)[::-1].index.tolist()

df.dictionary.apply(pd.Series).groupby(df.category).sum().apply(top3, axis=1)
category
IT [phone, computer, iphone]
moto [motorcycle, buy, motocompetition]
shopping [buy, sale, zara]
dtype: object
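The one-liner above does three things: it expands each dictionary into columns, sums those columns per category, and ranks each row. A minimal sketch of the same steps spelled out, with intermediate names (wide, totals, top3_by_category) that are mine and not from the answer. One caveat worth knowing: after sum(), keys that never occur in a category show up as 0 rather than NaN, so this sketch filters on totals greater than 0 instead of relying on dropna.

```python
import pandas as pd

df = pd.DataFrame(dict(
    category=['moto', 'shopping', 'IT', 'shopping', 'IT'],
    dictionary=[
        dict(motorcycle=10, buy=8, motocompetition=7),
        dict(buy=200, order=20, sale=30),
        dict(iphone=214, phone=1053, computer=809),
        dict(zara=23, sale=18, sell=20),
        dict(lenovo=200, iphone=300, mac=200),
    ]))

# Step 1: one column per key, NaN where a row's dict lacks that key.
wide = pd.DataFrame(df['dictionary'].tolist(), index=df.index)

# Step 2: sum each key per category; NaN is skipped, all-NaN groups become 0.
totals = wide.groupby(df['category']).sum()

# Step 3: for each category, keep keys that actually occurred and take the top 3.
top3_by_category = {
    category: row[row > 0].nlargest(3).index.tolist()
    for category, row in totals.iterrows()
}
print(top3_by_category)
```

The filter in step 3 only matters for a category with fewer than three distinct keys; for the sample data both versions give the same lists.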