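For each site and each month, I am trying to compute the sum of the values at nearby sites, using a pandas DataFrame and three loops. The initial DataFrame is df1, shown below.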
| Site name | Lat | Lon | March 2018 | April 2018 | May 2018 |
|-----------|-------|--------|------------|------------|----------|
| A | 10.0 | 15.0 | 1 | 2 | 3 |
| B | 10.1 | 15.0 | 1 | 2 | 3 |
| C | 12.0 | 100.0 | 1 | 2 | 3 |
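Loop 1) Start at the first column that contains data.
Loop 2) Take the coordinates of the first site, to be compared against all other sites.
Loop 3) Compare the first site's coordinates with every site; if the distance between them (computed with the Haversine formula, not listed) is within Rtarget, add that site's value to msum, the running sum of the values at sites within Rtarget of the initial site.
The output goes into an empty DataFrame, df2, with the same index and columns as df1. Each iteration of loop 2 should end with one final msum value, which should be written to df2 at the same position (index and column) as the value being processed in loop 2.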
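Because the Haversine helper is not listed, the distance test in loop 3 can only be sketched. The version below is a minimal stand-in, assuming coordinates in degrees and distances in miles; the Earth radius constant, the `sites` dictionary, and the `neighbours` name are illustrative and not part of the original code.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius, in miles


def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in miles between two points given in degrees."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


# Sample coordinates from the table above, stored as (lat, lon)
sites = {"A": (10.0, 15.0), "B": (10.1, 15.0), "C": (12.0, 100.0)}
Rtarget = 500  # target distance in miles

# For every site, list the sites (itself included) that lie within Rtarget
neighbours = {
    name: [other for other, (lat2, lon2) in sites.items()
           if haversine(lon, lat, lon2, lat2) <= Rtarget]
    for name, (lat, lon) in sites.items()
}
print(neighbours)
```

With the three sample sites, A and B are roughly 7 miles apart while C is thousands of miles from both, so A and B each sum two sites and C sums only itself.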
My code currently produces a df2 that looks like this:
| Site name | Lat | Lon | March 2018 | April 2018 | May 2018 |
|-----------|-----|-----|------------|------------|----------|
| A | 0 | 0 | 0 | 0 | 0 |
| B | 0 | 0 | 0 | 0 | 0 |
| C | 0 | 0 | 3 | 6 | 9 |
whereas df2 should look like this:
| Site name | Lat | Lon | March 2018 | April 2018 | May 2018 |
|-----------|-----|-----|------------|------------|----------|
| A | 0 | 0 | 2 | 4 | 6 |
| B | 0 | 0 | 2 | 4 | 6 |
| C | 0 | 0 | 1 | 2 | 3 |
My code is below (in the code, the input frame is named df, the output frame is named df1, and the coordinate columns are Latitude and Longitude):

    import pandas as pd

    # haversine(lon1, lat1, lon2, lat2) is defined elsewhere (not listed)

    # Create empty data frame with identical columns
    df1 = pd.DataFrame(0, index=df.index, columns=df.columns)
    Rtarget = 500  # Set the target distance to X miles

    # Iterate through each month of data
    for column, row in df.iloc[:, 2:].items():
        # Select the initial site and define the starting coordinates
        for index, row in df.iterrows():
            lat01 = row['Latitude']
            lon01 = row['Longitude']
            msum = 0  # Set monthly sum equal to zero
            # Iterate through each comparison site and define comparison coordinates
            for index, row in df.iterrows():
                lat02 = row['Latitude']
                lon02 = row['Longitude']
                distance = haversine(lon01, lat01, lon02, lat02)
                if distance <= Rtarget:
                    msum += row[column]  # Add the value to the monthly sum
            # Place monthly sum into the corresponding location in the empty dataframe, df1
            df1[column][index] = msum
What am I missing that leaves every row except the last one blank (still 0)?
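One plausible cause, assuming the final assignment sits at the level of the second loop: the innermost loop reuses the names `index` and `row`, so by the time the sum is written, `index` always refers to the last site and only that row is ever filled. The sketch below gives each loop its own variable names and writes with `.loc` instead of chained indexing. The `haversine` function and the sample frame are stand-ins built from the tables in the question, and the names `site`, `site_row`, and `other_row` are illustrative.

```python
from math import asin, cos, radians, sin, sqrt

import pandas as pd


def haversine(lon1, lat1, lon2, lat2, radius=3958.8):
    """Stand-in for the unlisted helper: great-circle distance in miles."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * radius * asin(sqrt(a))


# Sample data from the question, indexed by site name
df = pd.DataFrame(
    {
        "Latitude": [10.0, 10.1, 12.0],
        "Longitude": [15.0, 15.0, 100.0],
        "March 2018": [1, 1, 1],
        "April 2018": [2, 2, 2],
        "May 2018": [3, 3, 3],
    },
    index=pd.Index(["A", "B", "C"], name="Site name"),
)

df1 = pd.DataFrame(0, index=df.index, columns=df.columns)
Rtarget = 500  # target distance in miles

for column in df.columns[2:]:                      # loop 1: each month
    for site, site_row in df.iterrows():           # loop 2: the initial site
        msum = 0
        for _, other_row in df.iterrows():         # loop 3: each comparison site
            distance = haversine(site_row["Longitude"], site_row["Latitude"],
                                 other_row["Longitude"], other_row["Latitude"])
            if distance <= Rtarget:
                msum += other_row[column]
        # `site` still names the loop-2 row, so the sum lands in that row
        df1.loc[site, column] = msum

print(df1)
```

On the sample data this fills the month columns for A and B with 2, 4, 6 and for C with 1, 2, 3, while the coordinate columns of df1 stay at 0.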