I have a DataFrame like this:
import numpy as np
import pandas as pd
df = pd.DataFrame({
    'carrier': ['c1','c1','c1','c2','c2','c2','c3','c4','c5','c5'],
    'airport': ['a1','a3','a1','a1','a2','a2','a3','a4','a4','a1'],
})
df
carrier airport
0 c1 a1
1 c1 a3
2 c1 a1
3 c2 a1
4 c2 a2
5 c2 a2
6 c3 a3
7 c4 a4
8 c5 a4
9 c5 a1
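I want to find the number of distinct carriers serving each airport, keeping only airports that are served by at least 2 different carriers.
How can I do this?
Required output: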
airport carrier n_carrier
a1 c1 3 # airport a1 is served by 3 distinct carriers
a3 c1 2 # airport a3 is served by 2 distinct carriers
a1 c2 3 # NOTE: here we do not see a2 because it has only
a3 c3 2 # one carrier, so it is excluded
a4 c4 2
a4 c5 2 # airport a4 is served by 2 distinct carriers
a1 c5 3
Answer 0 (score: 1)
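This should give you what you want: group by airport, transform with the number of unique carriers, keep only the rows where that count is greater than 1, then drop duplicates.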
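The steps described above could be sketched in pandas roughly as follows. This is a minimal sketch rather than the answerer's original code: it assumes the grouping key is `airport`, that the count is stored in a new column named `n_carrier` (taken from the required output), and the `result` variable name is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'carrier': ['c1','c1','c1','c2','c2','c2','c3','c4','c5','c5'],
    'airport': ['a1','a3','a1','a1','a2','a2','a3','a4','a4','a1'],
})

# Count distinct carriers per airport and broadcast the count back to every row
# (assumption: the new column is called n_carrier, as in the required output).
df['n_carrier'] = df.groupby('airport')['carrier'].transform('nunique')

result = (
    df[df['n_carrier'] > 1]                 # keep airports with at least 2 distinct carriers
    .drop_duplicates()                      # drop repeated (carrier, airport) rows
    [['airport', 'carrier', 'n_carrier']]   # column order of the required output
    .reset_index(drop=True)
)
print(result)
```

`transform('nunique')` keeps the original row count, which is why the filter and `drop_duplicates` can be applied directly afterwards without a merge.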