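We have two geographic layers: census tracts and a square grid. The grid dataset only contains population counts, while for each census tract we know the total income. What we want to do is allocate this income data from the census tracts to the grid cells.
This is a very common problem in geographic analysis, and there are probably many ways to solve it. We want to take into account not only the spatial overlap between census tracts and grid cells, but also the population of each cell. This is mainly to avoid problems in cases where the residents of a census tract live in only a small part of it.
Below we provide a reproducible example (using R and the sf package), based on a sample extracted from our study area, together with the solution we have found so far. We would like to see whether others have alternative (more efficient) solutions, and to check whether our results are correct.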
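Written out, the allocation that the code below is meant to perform looks roughly like this. The symbols are our own shorthand and do not appear in the code: $P_s$ is the population of square $s$, $A_s$ its area, $A_{st}$ the area of the overlap between square $s$ and tract $t$, and $I_t$ the total income of tract $t$.

```latex
% Estimated population of the piece where square s overlaps tract t
p_{st} = P_s \, \frac{A_{st}}{A_s}

% Income given to that piece: the tract's income, shared out in
% proportion to the estimated population of each of its pieces
I_{st} = I_t \, \frac{p_{st}}{\sum_{s'} p_{s't}}

% Income of a square: the sum over all tracts it overlaps
I_s = \sum_{t} I_{st}
```

Because the weights $p_{st} / \sum_{s'} p_{s't}$ add up to 1 within each tract, the total income should be preserved, as long as every tract overlaps at least one populated square.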
library(sf)
library(dplyr)
library(readr)
# Files
download.file("https://github.com/ipeaGIT/acesso_oport/raw/master/test/shapes.RData", "shapes.RData")
load("shapes.RData")
# Compute the area of each census tract
tract <- tract %>%
  mutate(area_tract = st_area(.))
# Compute the area of each grid square
square <- square %>%
  mutate(area_square = st_area(.))
ui <-
  # Create spatial units for all intersections between the tracts and the squares (we're calling these "pieces")
  st_intersection(square, tract) %>%
  # Calculate the area of each piece
  mutate(area_piece = st_area(.)) %>%
  # Compute the proportion of each tract that falls inside that piece
  mutate(area_prop_tract = area_piece / area_tract) %>%
  # Compute the proportion of each square that falls inside that piece
  mutate(area_prop_square = area_piece / area_square) %>%
  # Based on the square's population, estimate the population that lives in that piece
  mutate(pop_prop_square = square_pop * area_prop_square) %>%
  # Sum the estimated piece populations to get the estimated population of each tract
  group_by(id_tract) %>%
  mutate(sum = sum(pop_prop_square)) %>%
  ungroup() %>%
  # Compute each piece's share of its tract's estimated population
  mutate(pop_prop_square_in_tract = pop_prop_square / sum) %>%
  # Allocate the tract's income to each piece in proportion to that share
  mutate(income_piece = tract_incm * pop_prop_square_in_tract)
# Final aggregation by squares
ui_fim <- ui %>%
  # Group by square (keeping its population) and sum the income of its pieces
  group_by(id_square, square_pop) %>%
  summarise(square_income = sum(income_piece, na.rm = TRUE))
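One way to check whether the results are correct is to run the same logic on a toy case small enough to work out by hand. The sketch below is only an illustration: the geometries, populations and incomes are made up, `rect()` is a hypothetical helper, and the pipeline is a condensed version of the steps above rather than the exact code.

```r
library(sf)
library(dplyr)

# Hypothetical helper: an axis-aligned rectangle as an sf polygon
rect <- function(x0, x1, y0 = 0, y1 = 2) {
  st_polygon(list(rbind(c(x0, y0), c(x1, y0), c(x1, y1), c(x0, y1), c(x0, y0))))
}

# Made-up data: two tracts side by side, three squares offset by one unit
toy_tract <- st_sf(
  id_tract   = c("A", "B"),
  tract_incm = c(300, 400),
  geometry   = st_sfc(rect(0, 2), rect(2, 4))
)
toy_square <- st_sf(
  id_square  = 1:3,
  square_pop = c(10, 20, 60),
  geometry   = st_sfc(rect(-1, 1), rect(1, 3), rect(3, 5))
)

toy_income <- toy_square %>%
  mutate(area_square = as.numeric(st_area(.))) %>%
  # Cut the squares by the tracts to get the pieces
  st_intersection(toy_tract) %>%
  # Estimated population of each piece (area-weighted within its square)
  mutate(pop_piece = square_pop * as.numeric(st_area(.)) / area_square) %>%
  st_drop_geometry() %>%
  # Share each tract's income by the estimated population of its pieces
  group_by(id_tract) %>%
  mutate(income_piece = tract_incm * pop_piece / sum(pop_piece)) %>%
  # Add the pieces back up by square
  group_by(id_square) %>%
  summarise(square_income = sum(income_piece))

toy_income
```

Worked by hand, the three squares should receive 100, 300 and 300, which adds up to the 700 of income in the two tracts. Running the same total-income comparison on the real `ui_fim` and `tract` objects would be a cheap check on the full data.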
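Thanks!
Answer 0 (score: 1)
Depending on the interpolation method you want to use, I may have a solution for you that I helped develop. The areal
package implements areal weighted interpolation, and I use it in my own research to interpolate between U.S. Census geographies and grid squares. You can check out the package's website (and the associated vignettes) for details. Hope this is useful!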
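As a rough sketch of what that looks like, the example below runs `aw_interpolate()` on the sample data that ships with areal (`ar_stl_race` as the source and `ar_stl_wards` as the target), not on the data from the question. Note that this is plain area weighting: it does not use the grid's population counts, so it is a different method from the one in the question.

```r
library(areal)

# Interpolate a count (extensive) variable from the source polygons to the
# target polygons. Both layers must share the same projected CRS.
wards_est <- aw_interpolate(
  ar_stl_wards,          # target features (bundled sample data)
  tid = WARD,            # id column of the target features
  source = ar_stl_race,  # source features (bundled sample data)
  sid = GEOID,           # id column of the source features
  weight = "sum",        # weights sum to 1 over the intersected area of each source
  output = "sf",
  extensive = "TOTAL_E"  # variable to spread across the targets
)

head(wards_est)
```

For the data in the question, the call would presumably use `square` as the target with `tid = id_square`, `tract` as the source with `sid = id_tract`, and `extensive = "tract_incm"`, assuming both layers are first transformed to the same projected CRS.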