I have this DataFrame:
[
{
"name": "4134",
"calls": [
{
"MobileNo": "7013658596"
}
]
}
]
Using pandas, I only want to count the checkout_item actions that occur together with a checkout:
user: 56610 num_of_checkout_item: 3
user: 56611 num_of_checkout_item: 2
Does anyone have an idea?
Answer 0 (score: 1)
I assume that "count checkout_item-s only when they occur together with a checkout" means that a checkout_item must have the same timestamp as a checkout in order to be counted.
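def count_items(group):
    # A (user, timestamp) group without any checkout contributes nothing.
    if "checkout" not in group.action.values:
        return 0
    # Otherwise count the checkout_item rows in this group.
    return (group.action == "checkout_item").sum()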
>>> df.groupby(["user", "timestamp"]).apply(count_items).groupby("user").sum()
user
56610 3
56611 2
dtype: int64
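For anyone who wants to try the idea end to end, here is a minimal self-contained sketch. The sample data is hypothetical (the question's actual rows are not shown); only the column names user, timestamp and action are taken from the code above. It uses a vectorized variant of the same logic (groupby with transform) instead of apply, which is a swapped-in technique and not what the answer itself does.

```python
import pandas as pd

# Hypothetical sample data; the column names user, timestamp and action
# come from the answer's code, the values are made up for illustration.
df = pd.DataFrame(
    {
        "user": [56610, 56610, 56610, 56610, 56610, 56611, 56611, 56611, 56611],
        "timestamp": [1, 1, 1, 1, 2, 5, 5, 5, 6],
        "action": [
            "checkout", "checkout_item", "checkout_item", "checkout_item",
            "checkout_item",  # no checkout at timestamp 2, so not counted
            "checkout", "checkout_item", "checkout_item",
            "checkout_item",  # no checkout at timestamp 6, so not counted
        ],
    }
)

# Flag every row whose (user, timestamp) group contains at least one checkout.
is_checkout = df["action"].eq("checkout")
has_checkout = is_checkout.groupby([df["user"], df["timestamp"]]).transform("any")

# Count the checkout_item rows that fall in a flagged group, per user.
is_item = df["action"].eq("checkout_item")
counts = (is_item & has_checkout).groupby(df["user"]).sum()

for user, n in counts.items():
    print(f"user: {user} num_of_checkout_item: {n}")
```

Because the final groupby runs over all rows, a user with no qualifying checkout_item still appears in the result with a count of 0.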
Answer 1 (score: 0)
How about this?
users = [56610, 56611]
print(len(df[(df.user.isin(users)) & (df.action == "checkout_item")]))
Or per user:
for user in users:
    # Number of checkout_item rows for this user.
    counts = len(df[(df.user == user) & (df.action == "checkout_item")])
    print(f"user: {user} num_of_checkout_item: {counts}")
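The per-user loop above can also be written as a single groupby, which may be handier when there are many users. This is only a sketch on hypothetical sample data; like the loop, it counts every checkout_item row for the selected users and does not check whether a checkout happened at the same time.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration.
df = pd.DataFrame(
    {
        "user": [56610, 56610, 56610, 56611, 56611, 56612],
        "action": [
            "checkout", "checkout_item", "checkout_item",
            "checkout_item", "checkout", "checkout_item",
        ],
    }
)
users = [56610, 56611]

# Keep checkout_item rows for the users of interest, then count rows per user.
mask = df["user"].isin(users) & df["action"].eq("checkout_item")
item_counts = df[mask].groupby("user").size()

for user, n in item_counts.items():
    print(f"user: {user} num_of_checkout_item: {n}")
```

Note that a user with no checkout_item rows at all is simply absent from item_counts, whereas the loop would print a count of 0 for that user.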
Answer 2 (score: 0)
I'm not sure, but if you want to delete specific rows, try this.
df = df.drop(df[(df.user == 56610) & (df.action == "checkout_item")].index)
df = df.drop(df[(df.user == 56611) & (df.action == "checkout_item")].index)