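I have a DataFrame with IP addresses in one of its columns, and I want to add a new column called "country", taken from another DataFrame, based on which row's lower and upper IP bounds the IP address falls between.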
import numpy as np
import pandas as pd
df1 = pd.DataFrame({'ip': [0.1,2.5,3.5]})
df2 = pd.DataFrame({'low_ip': [3,2,7,10],
                    'high_ip': [5,3,9,11],
                    'country': ['A','B','A','C']})
print(df1)
    ip
0  0.1
1  2.5
2  3.5
print(df2)
   low_ip  high_ip country
0       3        5       A
1       2        3       B
2       7        9       A
3      10       11       C
Desired output:
ip   country
0.1  NA
2.5  B   because: 2 <= 2.5 <= 3
3.5  A   because: 3 <= 3.5 <= 5
Answer 0 (score: 2)
A quick and dirty way:
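countries = []
for i in range(len(df1)):
    ip = df1.loc[i, 'ip']
    # Countries of all df2 rows whose [low_ip, high_ip] range contains this ip
    country = df2.query("low_ip <= @ip <= high_ip")['country'].to_numpy()
    if len(country) > 0:
        # If several ranges match, keep the first one in df2 order
        countries.append(country[0])
    else:
        countries.append('NA')
df1['country'] = countries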
print(df1)
    ip country
0  0.1      NA
1  2.5       B
2  3.5       A
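
If the row-by-row loop becomes slow on larger frames, one possible alternative is to compare every IP against every range at once with NumPy broadcasting. The following is only a sketch under the same assumptions as the example data (numeric IPs, inclusive bounds, first matching row wins); the names `ips`, `matches`, `first_match` and `has_match` are made up for illustration. Note that it builds a len(df1) x len(df2) boolean array, so it may not be suitable when both frames are very large.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'ip': [0.1, 2.5, 3.5]})
df2 = pd.DataFrame({'low_ip': [3, 2, 7, 10],
                    'high_ip': [5, 3, 9, 11],
                    'country': ['A', 'B', 'A', 'C']})

# Column vector of IPs so that comparisons broadcast against every df2 row
ips = df1['ip'].to_numpy()[:, None]

# matches[i, j] is True when low_ip[j] <= ip[i] <= high_ip[j]
matches = (df2['low_ip'].to_numpy() <= ips) & (ips <= df2['high_ip'].to_numpy())

# Index of the first matching df2 row per IP (argmax returns 0 when nothing matches,
# so has_match is needed to tell the two cases apart)
first_match = matches.argmax(axis=1)
has_match = matches.any(axis=1)

# Take the matched country, or 'NA' where no range contains the IP
df1['country'] = np.where(has_match, df2['country'].to_numpy()[first_match], 'NA')
print(df1)
```

For the example data this should assign the same countries as the loop, since both pick the first matching row of df2.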