Vectorized look-up of values in a pandas DataFrame: another example

Asked: 2016-05-14 09:56:36

Tags: python pandas lookup vlookup

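I have been looking at the example "Vectorized look-up of values in Pandas dataframe".

Somehow my problem is a bit different and I can't find the right approach, even though it is just a simple vlookup.

I have a DataFrame: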

PoliceStations_raw=pd.DataFrame(
[['BAYVIEW'   ,37.729732,-122.397981],
 ['CENTRAL'   ,37.798732,-122.409919],
 ['INGLESIDE' ,37.724676,-122.446215],
 ['MISSION'   ,37.762849,-122.422005],
 ['NORTHERN'  ,37.780186,-122.432467],
 ['PARK'      ,37.767797,-122.455287],
 ['RICHMOND'  ,37.779928,-122.464467],
 ['SOUTHERN'  ,37.772380,-122.389412],
 ['TARAVAL'   ,37.743733,-122.481500],
 ['TENDERLOIN',37.783674,-122.412899]],columns=['PdDistrict','XX','YY'])

I also defined:

PoliceStations=PoliceStations_raw.transpose()

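Then I have another table, df, with a 'PdDistrict' column holding a categorical variable that can take the values 'BAYVIEW', 'CENTRAL', etc.

I want a column df['XX'] that gives, for each row of df, the corresponding XX entry from PoliceStations_raw.

I can't find the right syntax. Thanks for your help.

If possible, I would prefer a syntax that uses PoliceStations_raw (rather than the transposed table), because that table feels more "natural" to me.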
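A minimal sketch of the lookup described above, using `Series.map` against PoliceStations_raw directly (no transpose). The small `df` and the name `xx_by_district` are hypothetical stand-ins, and only three stations are included to keep it short. It assumes 'PdDistrict' holds plain strings; a categorical column may need converting first.

```python
import pandas as pd

PoliceStations_raw = pd.DataFrame(
    [['BAYVIEW', 37.729732, -122.397981],
     ['CENTRAL', 37.798732, -122.409919],
     ['TARAVAL', 37.743733, -122.481500]],
    columns=['PdDistrict', 'XX', 'YY'])

# Hypothetical stand-in for the df described in the question.
df = pd.DataFrame({'PdDistrict': ['CENTRAL', 'TARAVAL', 'CENTRAL', 'BAYVIEW']})

# Series of XX values keyed by district name (hypothetical helper name).
xx_by_district = PoliceStations_raw.set_index('PdDistrict')['XX']

# map() looks up each row's district in that Series and keeps df's row order.
df['XX'] = df['PdDistrict'].map(xx_by_district)
```

The same pattern with `['YY']` would fill a df['YY'] column. Districts missing from PoliceStations_raw would come back as NaN.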

I tried this, but it does not work:

df_raw['value'] = PoliceStations.lookup('XX',df_raw['PdDistrict'])
  

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
----> 1 df_raw['value'] = PoliceStations.lookup('XX',df_raw['PdDistrict'])

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.pyc in lookup(self, row_labels, col_labels)
   2641         n = len(row_labels)
   2642         if n != len(col_labels):
-> 2643             raise ValueError('Row labels must have same size as column labels')
   2644
   2645         thresh = 1000

ValueError: Row labels must have same size as column labels

even though I don't think I made a mistake with the labels:

df_raw['PdDistrict'].cat.categories
  

Index([u'BAYVIEW', u'CENTRAL', u'INGLESIDE', u'MISSION', u'NORTHERN',
       u'PARK', u'RICHMOND', u'SOUTHERN', u'TARAVAL', u'TENDERLOIN'],
      dtype='object')
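A likely reading of the error, offered as a sketch rather than a confirmed diagnosis: the check in the traceback compares `len(row_labels)` with `len(col_labels)`, and the plain string 'XX' has length 2, while df_raw['PdDistrict'] has one entry per row. The names `stations` and `districts` below are hypothetical stand-ins for the real data. Since `DataFrame.lookup` is, as far as I know, deprecated and then removed in later pandas releases, the label-based alternative uses `.loc` instead.

```python
import pandas as pd

stations = pd.DataFrame(
    [['BAYVIEW', 37.729732, -122.397981],
     ['CENTRAL', 37.798732, -122.409919],
     ['TARAVAL', 37.743733, -122.481500]],
    columns=['PdDistrict', 'XX', 'YY'])

# Hypothetical stand-in for df_raw['PdDistrict'].
districts = pd.Series(['CENTRAL', 'TARAVAL', 'CENTRAL', 'BAYVIEW'])

# The traceback's check compares the lengths of the two arguments.
row_labels = 'XX'        # a plain string, so its length is 2
col_labels = districts   # one label per data row, here 4
sizes_match = len(row_labels) == len(col_labels)  # False for this data

# Label-based alternative: index the stations by district,
# then select one 'XX' value per requested label (duplicates allowed).
by_district = stations.set_index('PdDistrict')
values = by_district.loc[districts.tolist(), 'XX'].to_numpy()
```

Here `values` has one XX entry per element of `districts`, in the same order, so it can be assigned straight to a new column.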

Edit:

I have also been trying the following:

PoliceStations_raw=pd.DataFrame(
[['BAYVIEW'   ,37.729732,-122.397981],
 ['CENTRAL'   ,37.798732,-122.409919],
 ['INGLESIDE' ,37.724676,-122.446215],
 ['MISSION'   ,37.762849,-122.422005],
 ['NORTHERN'  ,37.780186,-122.432467],
 ['PARK'      ,37.767797,-122.455287],
 ['RICHMOND'  ,37.779928,-122.464467],
 ['SOUTHERN'  ,37.772380,-122.389412],
 ['TARAVAL'   ,37.743733,-122.481500],
 ['TENDERLOIN',37.783674,-122.412899]],columns=['PdDistrict','XX','YY'])


df1=pd.DataFrame([[0,'CENTRAL'],[1,'TARAVAL'],[3,'CENTRAL'],[2,'BAYVIEW']])
df1.columns = ['Index','PdDistrict']


   Index PdDistrict
0      0    CENTRAL
1      1    TARAVAL
2      3    CENTRAL
3      2    BAYVIEW

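Even though I pass sort=False, the returned object has the two tables merged but grouped by PdDistrict, as if it were some kind of index, and the row order of the original left DataFrame has changed.

Please help!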

pd.merge(df1,PoliceStations_raw,sort=False)

gives me this:

   Index PdDistrict         XX          YY
0      0    CENTRAL  37.798732 -122.409919
1      3    CENTRAL  37.798732 -122.409919
2      1    TARAVAL  37.743733 -122.481500
3      2    BAYVIEW  37.729732 -122.397981
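One possible way to keep the left table's row order is a left join, which pandas documents as preserving the order of the left keys. This is a sketch under that assumption, not a confirmed fix for the pandas version used above. The `how='left'` and `on='PdDistrict'` arguments are additions not present in the original call, and only three stations are included for brevity.

```python
import pandas as pd

PoliceStations_raw = pd.DataFrame(
    [['BAYVIEW', 37.729732, -122.397981],
     ['CENTRAL', 37.798732, -122.409919],
     ['TARAVAL', 37.743733, -122.481500]],
    columns=['PdDistrict', 'XX', 'YY'])

df1 = pd.DataFrame(
    [[0, 'CENTRAL'], [1, 'TARAVAL'], [3, 'CENTRAL'], [2, 'BAYVIEW']],
    columns=['Index', 'PdDistrict'])

# Left join on the district name: every row of df1 is kept, in df1's order,
# and the matching XX/YY values are attached to it.
merged = pd.merge(df1, PoliceStations_raw, on='PdDistrict', how='left')
```

With a left join, districts in df1 that have no match in PoliceStations_raw would stay in the result with NaN in XX and YY instead of being dropped.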

0 answers:

No answers