I have the data frame below:
user | speed
------------
Anna | 1.0
Bell | 1.2
Anna | 1.3
Chad | 1.5
Bell | 1.4
Anna | 1.1
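I want to use a dictionary to record how many times each user has been encountered, and to update his or her speed as I iterate over the data frame.
For example, the first time we see "Anna", the dictionary is: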
{"Anna": [1, 1.0]}
The second time we see "Anna", it becomes:
{"Anna": [2, 1.3], "Bell": [1, 1.2]}
The final dictionary should be:
{"Anna": [3, 1.1], "Bell": [2, 1.4], "Chad": [1, 1.5]}
The counting part is easy:
>>> import pandas as pd
>>> record = pd.DataFrame({"user": ("Anna", "Bell", "Anna", "Chad", "Bell", "Anna"), "speed": (1.0, 1.2, 1.3, 1.5, 1.4, 1.1)})
>>> record
   speed  user
0    1.0  Anna
1    1.2  Bell
2    1.3  Anna
3    1.5  Chad
4    1.4  Bell
5    1.1  Anna
>>> encounter = {}
>>> for i in record['user']:
...     encounter[i] = encounter.get(i, 0) + 1
...
>>> encounter
{'Anna': 3, 'Bell': 2, 'Chad': 1}
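But what is a good way to create a dictionary of empty lists and update the second value? Thanks!
Answer 0 (score: 6)
I believe this two-line loop is what you want.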
import pandas as pd
record = pd.DataFrame({
    "user": ("Anna", "Bell", "Anna", "Chad", "Bell", "Anna"),
    "speed": (1.0, 1.2, 1.3, 1.5, 1.4, 1.1)
})
encounter = {}
for name, value in zip(record["user"], record["speed"]):
    # Increment the stored count (0 for a new user) and overwrite the speed.
    encounter[name] = [encounter.get(name, [0])[0] + 1, value]
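zip lets you iterate over the names and the speeds at the same time. The get method tries to fetch the user's record if it exists, and otherwise returns the list [0]. Indexing with [0] takes the first element of that list, which is the counter. The incremented counter and the current speed are then stored back in encounter[name].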
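To check that this loop reproduces the intermediate dictionaries from the question, here is a minimal sketch that records a copy of the dictionary after every row. The `snapshots` list is a hypothetical helper added only for inspection; it is not part of the answer.

```python
import pandas as pd

record = pd.DataFrame({
    "user": ("Anna", "Bell", "Anna", "Chad", "Bell", "Anna"),
    "speed": (1.0, 1.2, 1.3, 1.5, 1.4, 1.1),
})

encounter = {}
snapshots = []  # hypothetical helper: one copy of the dictionary per row
for name, value in zip(record["user"], record["speed"]):
    previous_count = encounter.get(name, [0])[0]  # 0 when the user is new
    encounter[name] = [previous_count + 1, value]
    # Copy the inner lists so later updates do not alter earlier snapshots.
    snapshots.append({k: list(v) for k, v in encounter.items()})

print(snapshots[0])  # {'Anna': [1, 1.0]}
print(snapshots[2])  # {'Anna': [2, 1.3], 'Bell': [1, 1.2]}
print(encounter)     # {'Anna': [3, 1.1], 'Bell': [2, 1.4], 'Chad': [1, 1.5]}
```

The first and third snapshots match the states given in the question for the first and second time "Anna" is seen.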
Answer 1 (score: 2)
Use collections.Counter.
For example:
import pandas as pd
from collections import Counter
record = pd.DataFrame({"user": ("Anna", "Bell", "Anna", "Chad", "Bell", "Anna"), "speed": (1.0, 1.2, 1.3, 1.5, 1.4, 1.1)})
encounter = {}
for k, v in Counter(record["user"].tolist()).items():
    # v is the number of encounters; take the last speed recorded for user k.
    last_speed = record.loc[record["user"] == k, "speed"].iloc[-1]
    encounter[k] = [v, round(float(last_speed), 1)]
print(encounter)
Output:
{'Anna': [3, 1.1], 'Chad': [1, 1.5], 'Bell': [2, 1.4]}
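The loop above filters the whole frame once per user. As a possible variation (not part of the original answer), the last speed of every user can be looked up in a single pass with `drop_duplicates(keep="last")` and then combined with the `Counter` result. This is only a sketch under the assumption that the frame has the same `user` and `speed` columns as in the question.

```python
from collections import Counter

import pandas as pd

record = pd.DataFrame({
    "user": ("Anna", "Bell", "Anna", "Chad", "Bell", "Anna"),
    "speed": (1.0, 1.2, 1.3, 1.5, 1.4, 1.1),
})

# Number of rows per user, in order of first appearance.
counts = Counter(record["user"])

# Keep only the last row of each user, then index the speeds by user name.
last_speed = (
    record.drop_duplicates(subset="user", keep="last")
    .set_index("user")["speed"]
)

encounter = {user: [n, float(last_speed[user])] for user, n in counts.items()}
print(encounter)  # {'Anna': [3, 1.1], 'Bell': [2, 1.4], 'Chad': [1, 1.5]}
```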
Answer 2 (score: 1)
Go the pandorable way, with groupby:
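my_dictionary = {}
for k, v in record.groupby('user'):
    # v is the sub-frame for user k: its length is the count, its last row holds the latest speed.
    my_dictionary[k] = [len(v), v.iloc[-1]['speed']]
print(my_dictionary)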
{'Anna': [3, 1.1], 'Bell': [2, 1.4], 'Chad': [1, 1.5]}
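The same groupby idea can probably be written without an explicit loop over the groups. The sketch below assumes the `record` frame from the question and uses the built-in `"count"` and `"last"` aggregations. Note that both of them skip missing values, which makes no difference here because no speed is missing.

```python
import pandas as pd

record = pd.DataFrame({
    "user": ("Anna", "Bell", "Anna", "Chad", "Bell", "Anna"),
    "speed": (1.0, 1.2, 1.3, 1.5, 1.4, 1.1),
})

# One row per user with two columns: number of speeds and the last speed.
summary = record.groupby("user")["speed"].agg(["count", "last"])

# Convert NumPy scalars to plain Python numbers for a clean printout.
my_dictionary = {
    user: [int(n), float(last)]
    for user, n, last in zip(summary.index, summary["count"], summary["last"])
}
print(my_dictionary)  # {'Anna': [3, 1.1], 'Bell': [2, 1.4], 'Chad': [1, 1.5]}
```

Because groupby sorts the group keys by default, the users come out in alphabetical order rather than in order of first appearance.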