Edit: added a row with no matching index to demonstrate the expected behavior.
I have the following two DataFrames:
requests:
                 requests
asn pop country
1   1   us            100
        br             50
    2   br            200
    3   hk            150
    4   uk            100
2   1   us            300
...
traffic:
         total  capacity
asn pop
1   1       53      1000
    2       15      1000
    3      103     10000
2   1      254     10000
...
I want to add a new column to the requests DataFrame whose value equals traffic["total"] / traffic["capacity"], aligned on the two matching index levels (asn and pop).
I tried the following:
>>> requests["network"] = traffic["total"] / traffic["capacity"]
>>> requests
                 requests  network
asn pop country
1   1   us            100      NaN
        br             50      NaN
    2   br            200      NaN
    3   hk            150      NaN
    4   uk            100      NaN
2   1   us            300      NaN
...
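This works for me when all three index levels are available in both frames. In this example, however, traffic only has two of them, so the assignment seems to fail. The result I expect is: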
>>> requests
                 requests  network
asn pop country
1   1   us            100    0.053
        br             50    0.053
    2   br            200    0.015
    3   hk            150   0.0103
    4   uk            100      NaN
2   1   us            300   0.0254
...
Answer 0 (score: 3)
The problem is that the levels of your two MultiIndexes do not match, so you get NaNs. One solution is to add reindex.
# Parentheses are needed so the chained call can continue on the next line.
requests['network'] = (traffic["total"].div(traffic["capacity"])
                                       .reindex(requests.index, method='ffill'))
print(requests)
                 requests  network
asn pop country
1   1   us            100   0.0530
        br             50   0.0530
    2   br            200   0.0150
    3   hk            150   0.0103
2   1   us            300   0.0254
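A variant of the same reindex idea that does not use forward fill is to look the ratio up on the (asn, pop) part of the requests index only. The following is a minimal sketch, not part of the original answer: the sample frames are rebuilt by hand from the question, and the names `ratio` and `keys` are introduced here for illustration. Rows without a matching (asn, pop) pair, such as (1, 4, uk), come out as NaN.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (an assumption, not the real dataset).
requests = pd.DataFrame(
    {"requests": [100, 50, 200, 150, 100, 300]},
    index=pd.MultiIndex.from_tuples(
        [(1, 1, "us"), (1, 1, "br"), (1, 2, "br"),
         (1, 3, "hk"), (1, 4, "uk"), (2, 1, "us")],
        names=["asn", "pop", "country"],
    ),
)
traffic = pd.DataFrame(
    {"total": [53, 15, 103, 254], "capacity": [1000, 1000, 10000, 10000]},
    index=pd.MultiIndex.from_tuples(
        [(1, 1), (1, 2), (1, 3), (2, 1)], names=["asn", "pop"]
    ),
)

# Ratio indexed by (asn, pop) only.
ratio = traffic["total"].div(traffic["capacity"])

# Two-level keys taken from requests; duplicates such as (1, 1) are fine
# because the index being reindexed (ratio.index) is unique.
keys = requests.index.droplevel("country")

# Look up each key, then assign by position so the differing index depth
# of requests does not matter.
requests["network"] = ratio.reindex(keys).to_numpy()
print(requests)
```

Assigning through `to_numpy()` is a deliberate choice here: it sidesteps index alignment entirely, so it relies on `keys` being in the same row order as `requests`.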
Old solution with reset_index + set_index:
requests = requests.reset_index(level=2)
requests['network'] = traffic["total"].div(traffic["capacity"])
requests = requests.set_index('country', append=True)
print(requests)
                 requests  network
asn pop country
1   1   us            100   0.0530
        br             50   0.0530
    2   br            200   0.0150
    3   hk            150   0.0103
2   1   us            300   0.0254
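The reset_index + set_index steps above can be tried end to end with the sketch below. It assumes the sample data from the question, rebuilt by hand and including the unmatched (1, 4, uk) row that was added in the edit; the names `flat` and `result` are introduced here for illustration only.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (an assumption, not the real dataset).
requests = pd.DataFrame(
    {"requests": [100, 50, 200, 150, 100, 300]},
    index=pd.MultiIndex.from_tuples(
        [(1, 1, "us"), (1, 1, "br"), (1, 2, "br"),
         (1, 3, "hk"), (1, 4, "uk"), (2, 1, "us")],
        names=["asn", "pop", "country"],
    ),
)
traffic = pd.DataFrame(
    {"total": [53, 15, 103, 254], "capacity": [1000, 1000, 10000, 10000]},
    index=pd.MultiIndex.from_tuples(
        [(1, 1), (1, 2), (1, 3), (2, 1)], names=["asn", "pop"]
    ),
)

# Move "country" out of the index so both frames are indexed by (asn, pop).
flat = requests.reset_index(level="country")

# Column assignment now aligns on the shared two-level index;
# (1, 4) has no entry in traffic, so it becomes NaN.
flat["network"] = traffic["total"].div(traffic["capacity"])

# Restore the original three-level index.
result = flat.set_index("country", append=True)
print(result)
```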
Answer 1 (score: 1)
You can try this:
requests = requests.reset_index().set_index(['asn', 'pop'])
requests['network'] = traffic["total"] / traffic["capacity"]
requests.reset_index().set_index(['asn', 'pop', 'country'])
Out[140]:
                 requests  network
asn pop country
1   1   us            100   0.0530
        br             50   0.0530
    2   br            200   0.0150
    3   hk            150   0.0103
2   1   us            300   0.0254
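A different technique from the two answers, which leaves the index untouched, is DataFrame.join with on= naming the shared index levels. This is only a sketch under assumptions: it needs a pandas version that accepts index level names in on=, the sample frames are rebuilt by hand from the question, and `network` and `joined` are names introduced here.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (an assumption, not the real dataset).
requests = pd.DataFrame(
    {"requests": [100, 50, 200, 150, 100, 300]},
    index=pd.MultiIndex.from_tuples(
        [(1, 1, "us"), (1, 1, "br"), (1, 2, "br"),
         (1, 3, "hk"), (1, 4, "uk"), (2, 1, "us")],
        names=["asn", "pop", "country"],
    ),
)
traffic = pd.DataFrame(
    {"total": [53, 15, 103, 254], "capacity": [1000, 1000, 10000, 10000]},
    index=pd.MultiIndex.from_tuples(
        [(1, 1), (1, 2), (1, 3), (2, 1)], names=["asn", "pop"]
    ),
)

# One-column frame holding the ratio, still indexed by (asn, pop).
network = traffic["total"].div(traffic["capacity"]).to_frame("network")

# Left join (the default): match the asn/pop levels of requests against
# the full index of network; unmatched rows get NaN.
joined = requests.join(network, on=["asn", "pop"])
print(joined)
```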