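I have a large df1 with columns (Lon, Lat, V1, V2, V3) and a large df2 with columns (V4, V5, Lat, Lon, V6). The coordinates in the two DataFrames do not match exactly, and df2 has a different number of rows. I want to:
1) Find the df1 (Lon, Lat) closest to each df2 (Lon, Lat), subject to (abs(df1.Lon - df2.Lon) <= 0.11) & (abs(df1.Lat - df2.Lat) <= 0.11).
2) Create a new df3 with the columns (df1.Lon, df1.Lat, df1.V1, df2.V6).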
df1:
Lon,Lat,V1,V2,V3
-94.9324,34.9099,5.0,66.9,46.6
-103.524,34.457,6.0,186.7,3.8
-92.5145,38.7823,4.0,188.7,273.5
-92.5143,37.3182,2.0,78.8,218.4
-92.5142,36.6965,5.0,98.5,27.7
-89.2187,36.4448,7.3,79.8,35.8
df2:
V4,V5,Lat,Lon,V6
20190329,10,35.0,-94.9,105.9
20180329,11,34.5,-103.5,305.9
20170329,15,38.7,-92.5,206.0
20160329,14,36.5,-89.22,402.1
20150329,13,36.7,-92.6,316.1
20140329,05,37.4,-92.5,290.0
20130329,05,33.8,-89.2,250.0
df3:
Lon,Lat,V1,V6
-94.9324,34.9099,5.0,105.9
-103.524,34.457,6.0,305.9
-92.5145,38.7823,4.0,206.0
-92.5143,37.3182,2.0,290.0
-92.5142,36.6965,5.0,316.1
-89.2187,36.4448,7.3,402.1
I tried several approaches, none of which worked:
df3 = df1.loc[~((abs(df2.Lat - df1.Lat) <= 0.11) & (abs(df2.Lon - df1.Lon) <= 0.11))]
df3 = df1.where((abs(df1[df1.Lon] - df2[df2.Lon]) <=0.11) & (abs(df1[df1.Lat] -df2[df2.Lat]) <=0.11))
df3 = pd.merge(df1, df2, on=[(abs(df1.Lon-df2.Lon)<=0.11), (abs(df1.Lat-df2.Lat)<=0.11)], how='inner')
Answer 0 (score: 0)
This is possible, but it uses a cross join, so it needs a lot of memory if the DataFrames are large:
import pandas as pd

# Cross join: the constant key A pairs every df1 row with every df2 row.
df = pd.merge(df1.assign(A=1), df2.assign(A=1), on='A', how='outer', suffixes=('','_'))
cols = ['Lon','Lat','V1','V6']
# Take the absolute difference first, then compare it with the tolerance.
df3 = df[((df.Lat_ - df.Lat).abs() <= 0.11) & ((df.Lon_ - df.Lon).abs() <= 0.11)]
# Keep only the first match for each df1 row.
df3 = df3.drop_duplicates(subset=df1.columns)[cols]
print(df3)
         Lon      Lat   V1     V6
0   -94.9324  34.9099  5.0  105.9
8  -103.5240  34.4570  6.0  305.9
16  -92.5145  38.7823  4.0  206.0
26  -92.5143  37.3182  2.0  290.0
32  -92.5142  36.6965  5.0  316.1
38  -89.2187  36.4448  7.3  402.1
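The cross join keeps the first df2 row that falls inside the tolerance, which is not necessarily the closest one, and it builds every df1 × df2 pair at once. As a rough sketch of an alternative (not part of the original answer), the comparison can be done in chunks with NumPy broadcasting, picking the nearest df2 point for each df1 row. The function name match_nearest and the parameters tol and chunk_size are hypothetical, and distance is measured in plain degrees, which is only an approximation of real distance on the globe.

```python
import io

import numpy as np
import pandas as pd

df1 = pd.read_csv(io.StringIO("""Lon,Lat,V1,V2,V3
-94.9324,34.9099,5.0,66.9,46.6
-103.524,34.457,6.0,186.7,3.8
-92.5145,38.7823,4.0,188.7,273.5
-92.5143,37.3182,2.0,78.8,218.4
-92.5142,36.6965,5.0,98.5,27.7
-89.2187,36.4448,7.3,79.8,35.8
"""))

df2 = pd.read_csv(io.StringIO("""V4,V5,Lat,Lon,V6
20190329,10,35.0,-94.9,105.9
20180329,11,34.5,-103.5,305.9
20170329,15,38.7,-92.5,206.0
20160329,14,36.5,-89.22,402.1
20150329,13,36.7,-92.6,316.1
20140329,05,37.4,-92.5,290.0
20130329,05,33.8,-89.2,250.0
"""))


def match_nearest(df1, df2, tol=0.11, chunk_size=1000):
    """Hypothetical helper: for each df1 row, take V6 from the closest df2
    point whose Lon and Lat are both within tol degrees (assumes df2 is
    non-empty). Rows of df1 without any match are dropped."""
    lon2 = df2["Lon"].to_numpy()
    lat2 = df2["Lat"].to_numpy()
    v6 = df2["V6"].to_numpy(dtype=float)
    out = np.full(len(df1), np.nan)

    for start in range(0, len(df1), chunk_size):
        chunk = df1.iloc[start:start + chunk_size]
        rows = np.arange(start, start + len(chunk))
        # Pairwise absolute differences, shape (len(chunk), len(df2)).
        dlon = np.abs(chunk["Lon"].to_numpy()[:, None] - lon2[None, :])
        dlat = np.abs(chunk["Lat"].to_numpy()[:, None] - lat2[None, :])
        # Squared distance in degrees; pairs outside the tolerance become inf.
        inside = (dlon <= tol) & (dlat <= tol)
        dist = np.where(inside, dlon ** 2 + dlat ** 2, np.inf)
        best = dist.argmin(axis=1)
        found = np.isfinite(dist[np.arange(len(chunk)), best])
        out[rows[found]] = v6[best[found]]

    df3 = df1[["Lon", "Lat", "V1"]].copy()
    df3["V6"] = out
    return df3.dropna(subset=["V6"]).reset_index(drop=True)


df3 = match_nearest(df1, df2)
print(df3)
```

Peak memory is roughly chunk_size × len(df2) floats per temporary array, so chunk_size can be lowered when df2 is very large.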