Nested dict from for loop adds the same values to all nested keys

Time: 2019-09-29 16:53:59

Tags: python dictionary for-loop

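I have address data and a shapefile containing polygons, and I'm trying to determine the shortest distance (in miles) from each address to each polygon, then build a nested dictionary that holds all of that information, in the format: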

nested_dict = {poly_1: {address1: distance, address2: distance},
               poly_2: {address1: distance, address2: distance}, ...}

The full relevant code I'm using is:

import pandas as pd
from shapely.geometry import mapping, Polygon, LinearRing, Point
import geopandas as gpd
from math import radians, cos, sin, asin, sqrt

address_dict = {k: [] for k in addresses_geo.input_string}
sludge_dtc = {k: [] for k in sf_geo.unique_name}

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 3956 # Radius of earth in miles. Use 6371 for kilometers
    return c * r

# Here's the key loop that isn't working correctly
for unique_name, i in zip(sf_geo.unique_name, sf_geo.index):
    for address, pt in zip(addresses_geo.input_string, addresses_geo.index):
        pol_ext = LinearRing(sf_geo.iloc[i].geometry.exterior.coords)
        d = pol_ext.project(addresses_geo.iloc[pt].geometry)
        p = pol_ext.interpolate(d)
        closest_point_coords = list(p.coords)[0]
        # print(closest_point_coords)
        dist = haversine(addresses_geo.iloc[pt].geometry.x,
                         addresses_geo.iloc[pt].geometry.y,
                         closest_point_coords[0], closest_point_coords[1])
        address_dict[address] = dist
    sludge_dtc[unique_name] = address_dict
# Test results on a single address
addresses_with_sludge_distance = pd.DataFrame(sludge_dtc)
print(addresses_with_sludge_distance.iloc[[1]].T)

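If I break this code apart and compute the distances for a single polygon, it seems to work correctly. However, when I create the DataFrame and check an address, it lists the same distance for every polygon.

So the inner dict key '123 Main Street' has 5.25 miles for every polygon key in the outer dict, and '456 South Street' has 6.13 miles for every polygon key in the outer dict. (Made-up example.)

I realize I must be doing something silly in the way I set up the for loops, but I can't figure it out. I've reversed the order of the for statements and fiddled with the indentation, and the result is always the same.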

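To be clear, what I want to do is:

  • Take one polygon, then
  • For every address in the address data, find the distance to that polygon and add it to the address_dict dictionary, with the address as the key and the distance as the value
  • Once all addresses have been computed, store the whole address dictionary as the value for that polygon's key in sludge_dtc
  • Move on to the next polygon and repeat

Any ideas what I'm missing?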

1 Answer:

Answer 0: (score: 1)

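The problem is quite simple: you are always using the same address_dict instance. The assignment sludge_dtc[unique_name] = address_dict stores a reference to that dict, not a copy, so every polygon key ends up pointing to one and the same dict, which holds the distances from the last polygon processed. You just need to recreate it inside the loop, once per polygon key.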

import pandas as pd
from shapely.geometry import mapping, Polygon, LinearRing, Point
import geopandas as gpd
from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 3956 # Radius of earth in miles. Use 6371 for kilometers
    return c * r

sludge_dtc = {k: [] for k in sf_geo.unique_name}

# The key loop, now creating a fresh address_dict for each polygon
for unique_name, i in zip(sf_geo.unique_name, sf_geo.index):

    address_dict = {k: [] for k in addresses_geo.input_string}

    for address, pt in zip(addresses_geo.input_string, addresses_geo.index):
        pol_ext = LinearRing(sf_geo.iloc[i].geometry.exterior.coords)
        d = pol_ext.project(addresses_geo.iloc[pt].geometry)
        p = pol_ext.interpolate(d)
        closest_point_coords = list(p.coords)[0]
        # print(closest_point_coords)
        dist = haversine(addresses_geo.iloc[pt].geometry.x,
                         addresses_geo.iloc[pt].geometry.y,
                         closest_point_coords[0], closest_point_coords[1])
        address_dict[address] = dist
    sludge_dtc[unique_name] = address_dict
# Test results on a single address
addresses_with_sludge_distance = pd.DataFrame(sludge_dtc)
print(addresses_with_sludge_distance.iloc[[1]].T)
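
The effect can be reproduced without any geodata. The following is a minimal sketch, not the asker's actual data: the polygon names, addresses, and distances in `made_up_distances` are invented for illustration. It suggests why sharing one dict makes every polygon show the same distances, and how creating the dict inside the outer loop avoids that.

```python
# Hypothetical data: names and distances are invented for illustration only
polygons = ["poly_1", "poly_2"]
addresses = ["123 Main Street", "456 South Street"]
made_up_distances = {
    ("poly_1", "123 Main Street"): 1.0,
    ("poly_1", "456 South Street"): 2.0,
    ("poly_2", "123 Main Street"): 5.25,
    ("poly_2", "456 South Street"): 6.13,
}

# Buggy pattern: a single dict is created once and reused for every polygon
shared = {}
buggy = {}
for polygon in polygons:
    for address in addresses:
        shared[address] = made_up_distances[(polygon, address)]
    buggy[polygon] = shared  # stores a reference to the same dict, not a copy

# Fixed pattern: a new dict is created for each polygon
fixed = {}
for polygon in polygons:
    per_polygon = {}
    for address in addresses:
        per_polygon[address] = made_up_distances[(polygon, address)]
    fixed[polygon] = per_polygon

# In the buggy version both keys refer to one object holding the last polygon's values
print(buggy["poly_1"] is buggy["poly_2"])
print(buggy)
print(fixed)
```

In the buggy version, both outer keys refer to the very same inner dict, so whatever the last polygon wrote is what every polygon appears to contain.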

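Another note:

You are creating dictionaries whose values are empty lists, but then you assign values directly, which replaces those empty lists. If you need to collect a list of values, you should append the values to the existing lists instead, for example: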

address_dict[address].append(dist)

sludge_dtc[unique_name].append(address_dict)
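
As a small illustration of that difference, here is a sketch with invented keys and values (they are assumptions, not taken from the question's data). Assignment swaps the list for a single value, while append keeps the list and grows it.

```python
# Hypothetical keys and distances, for illustration only
address_dict = {k: [] for k in ["123 Main Street", "456 South Street"]}

# Direct assignment replaces the empty list with a single float
address_dict["123 Main Street"] = 5.25

# append keeps the existing list and adds each value to it
address_dict["456 South Street"].append(6.13)
address_dict["456 South Street"].append(7.5)

print(address_dict)
```

Which form is appropriate depends on whether each key should hold one value or several; for one distance per address and polygon, plain assignment into a dict created inside the outer loop is enough.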