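Populating dictionary values based on their relation to pandas column values

Date: 2019-12-19 13:36:22

Tags: python pandas dictionary geojson

Say I have a df with names, lon, lat, and place names. If I have a dictionary whose values I want to update iteratively according to the names in the df['name'] column, what is the most efficient way to do that?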

#Example df:
import pandas as pd

df = pd.DataFrame({'name': ['jeff', 'susan', 'bill', 'emily'],
                   'lon': ['25.0', '43.9', '18.8', '22.4'],
                   'lat': ['19.3', '11.2', '45.3', '28.0'],
                   'place': ['Florida', 'Maine', 'Arizona', 'Colorado']})

Which gives:

    name   lon   lat     place
0   jeff  25.0  19.3   Florida
1  susan  43.9  11.2     Maine
2   bill  18.8  45.3   Arizona
3  emily  22.4  28.0  Colorado

geodict = {
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [df.lon, df.lat]
  },
  "properties": {
    "place_name": df.place
  }
}

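I want to populate the dictionary so that the coordinates and place name come from the df['lon'], df['lat'], and df['place'] columns of the row matching a given name.

I fetch the data based on the entries in df['name'] and whichever name I am looking up at the time.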

names = df['name'].values.tolist()

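for n in range(len(names)):
    # <do some stuff>
    if names[n] in df['name'].values:
        # <not sure what to do after this..., probably some k,v in geodict thing?>
        pass

I would like the dictionary above to be updated from the df data, as shown below. The end goal is to send it out as GeoJSON.

My final output would look like this: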

geodict = {
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": ['25.0', '19.3']
  },
  "properties": {
    "place_name": 'Florida'
  }
}
#and so on for each entry.

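There is some other data I intend to add to the dictionary with .update, but I have tried to keep this short, and it does not depend on the df data or the geometry data in the dictionary.

2 Answers:

Answer 0: (score: 2)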

You can do this with the to_dict() method of DataFrame, and then build geodict from the result:

geodicts = {
    name: {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            "coordinates": [vals.get('lon'), vals.get('lat')]
        },
        "properties": {
            "place_name": vals.get('place')
        }
    } for name, vals in df.set_index('name').T.to_dict().items()
}

Output:

>>> from pprint import pprint
>>> pprint(geodicts)
{'bill': {'geometry': {'coordinates': ['18.8', '45.3'], 'type': 'Point'},
          'properties': {'place_name': 'Arizona'},
          'type': 'Feature'},
 'emily': {'geometry': {'coordinates': ['22.4', '28.0'], 'type': 'Point'},
           'properties': {'place_name': 'Colorado'},
           'type': 'Feature'},
 'jeff': {'geometry': {'coordinates': ['25.0', '19.3'], 'type': 'Point'},
          'properties': {'place_name': 'Florida'},
          'type': 'Feature'},
 'susan': {'geometry': {'coordinates': ['43.9', '11.2'], 'type': 'Point'},
           'properties': {'place_name': 'Maine'},
           'type': 'Feature'}}

If you only need them one at a time:

def get_geodict(name):
    # Index by name so the matching row can be looked up by label
    item = df.set_index('name').loc[name]
    geodict = {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [item.lon, item.lat]
      },
      "properties": {
        "place_name": item.place
      }
    }
    return geodict

Usage:

>>> get_geodict('jeff')
{'geometry': {'coordinates': ['25.0', '19.3'], 'type': 'Point'}, 'properties': {'place_name': 'Florida'}, 'type': 'Feature'}

If you already have a geodict and only want to update it:

def update_geodict(geodict, name):
    # Index by name so the matching row can be looked up by label
    item = df.set_index('name').loc[name]
    # setdefault creates the nested dict if it is missing; update keeps any other keys
    geodict.setdefault('geometry', {}).update({'coordinates': [item.lon, item.lat]})
    geodict.setdefault('properties', {}).update({'place_name': item.place})
    return geodict
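
A usage sketch for the update variant, assuming the example df from the question. Passing the DataFrame in through a frame parameter and the "source" property are hypothetical additions made only to keep the snippet self-contained and to show that existing keys survive the update; neither is part of the answer itself.

```python
import pandas as pd

# Example data from the question (coordinates are stored as strings there)
df = pd.DataFrame({
    'name': ['jeff', 'susan', 'bill', 'emily'],
    'lon': ['25.0', '43.9', '18.8', '22.4'],
    'lat': ['19.3', '11.2', '45.3', '28.0'],
    'place': ['Florida', 'Maine', 'Arizona', 'Colorado'],
})


def update_geodict(geodict, name, frame=df):
    # Look up the row labelled `name`; raises KeyError if no such name exists
    item = frame.set_index('name').loc[name]
    # Mutates geodict in place and leaves unrelated keys untouched
    geodict.setdefault('geometry', {}).update({'coordinates': [item['lon'], item['lat']]})
    geodict.setdefault('properties', {}).update({'place_name': item['place']})
    return geodict


# A pre-existing dictionary with an extra, hypothetical "source" property
geodict = {"type": "Feature", "geometry": {"type": "Point"}, "properties": {"source": "demo"}}
updated = update_geodict(geodict, 'susan')
print(updated)
```

Because the function mutates and returns the same object, `updated is geodict` holds; copy the dictionary first (for example with copy.deepcopy) if each name needs its own independent feature.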

Answer 1: (score: 2)

Directly:

In [306]: input_name = 'bill'                                                                                                

In [307]: row = df[df['name'] == input_name].iloc[0]                                                                         

In [308]: geodict = {"type": "Feature",   
     ...:            "geometry": {"type": "Point", "coordinates": [row.lon, row.lat]},  
     ...:            "properties": {"place_name": row.place}  
     ...: }  
     ...: print(geodict)                                                                                                     
{'type': 'Feature', 'geometry': {'type': 'Point', 'coordinates': ['18.8', '45.3']}, 'properties': {'place_name': 'Arizona'}}

If you need to generate the geodict repeatedly, wrap the approach above in a function.
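
One way such a wrapper might look is sketched below, assuming the example df from the question. The names make_geodict, frame, and collection are hypothetical, and converting the coordinates to float and gathering the features into a FeatureCollection are additions beyond the answer, made because GeoJSON expects numeric coordinates in [longitude, latitude] order.

```python
import json

import pandas as pd

# Example data from the question (coordinates are stored as strings there)
df = pd.DataFrame({
    'name': ['jeff', 'susan', 'bill', 'emily'],
    'lon': ['25.0', '43.9', '18.8', '22.4'],
    'lat': ['19.3', '11.2', '45.3', '28.0'],
    'place': ['Florida', 'Maine', 'Arizona', 'Colorado'],
})


def make_geodict(frame, name):
    # Boolean-mask lookup, as in the answer; fail clearly if nothing matches
    matches = frame[frame['name'] == name]
    if matches.empty:
        raise KeyError(f"no row with name {name!r}")
    row = matches.iloc[0]  # first match if the name is duplicated
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # Assumption: cast the string columns to numbers for GeoJSON
            "coordinates": [float(row['lon']), float(row['lat'])],
        },
        "properties": {"place_name": row['place']},
    }


# Build one feature per name and serialize the collection
features = [make_geodict(df, n) for n in df['name']]
collection = {"type": "FeatureCollection", "features": features}
geojson_text = json.dumps(collection)
print(geojson_text)
```

The mask is re-evaluated on every call, which is fine for a handful of lookups; for many names, indexing the frame once with set_index('name') is likely cheaper.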