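I have been given some customer data in latitude, longitude, and count format. All the data I need to create a ggplot heat map is there, but I don't know how to get it into the format ggplot requires.
I am trying to aggregate the data by total count within 0.01 Lat by 0.01 Lon blocks (a typical heat map), and my first instinct was "tapply". That produces a nice summary by block, as desired, but in the wrong format. I would also really like empty Lat or Lon blocks to be included as zeroes, even where there is no data; otherwise the heat map ends up looking oddly streaky.
Any help would be greatly appreciated.
I have created a subset of my data in the code below for reference: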
# m is the matrix of data provided
m = matrix(c(44.9591051,44.984884,44.984884,44.9811399,
44.9969096,44.990894,44.9797023,44.983334,
-93.3120017,-93.297668,-93.297668,-93.2993524,
-93.2924484,-93.282462,-93.2738911,-93.26667,
69,147,137,22,68,198,35,138), nrow=8, ncol=3)
colnames(m) <- c("Lat", "Lon", "Count")
m <- as.data.frame(m)
s = as.data.frame((tapply(m$Count, list(round(m$Lon,2), round(m$Lat,2)), sum)))
s[is.na(s)] <- 0
# Data frame "s" has all the data, but not exactly in the format desired...
# First, it has a column for each latitude, instead of one column for Lon
# and one for Lat, and second, it needs to have 0 as the entry data for
# Lat / Lon pairs that have no other data. As it is, there are only zeroes
# when one of the other entries has a Lat or Lon that matches... if there
# are no entries for a particular Lat or Lon value, then nothing at all is
# reported.
desired.format = matrix(c(44.96,44.96,44.96,44.96,44.96,
44.97,44.97,44.97,44.97,44.97,44.98,44.98,44.98,
44.98,44.98,44.99,44.99,44.99,44.99,44.99,45,45,
45,45,45,-93.31,-93.3,-93.29,-93.28,-93.27,-93.31,
-93.3,-93.29,-93.28,-93.27,-93.31,-93.3,-93.29,
-93.28,-93.27,-93.31,-93.3,-93.29,-93.28,-93.27,
-93.31,-93.3,-93.29,-93.28,-93.27,69,0,0,0,0,0,0,
0,0,0,0,306,0,0,173,0,0,0,198,0,0,0,68,0,0),
nrow=25, ncol=3)
colnames(desired.format) <- c("Lat", "Lon", "Count")
desired.format <- as.data.frame(desired.format)
library(ggmap)  # provides get_map() and ggmap(); also attaches ggplot2
minneapolis = get_map(location = "minneapolis, mn", zoom = 12)
ggmap(minneapolis) + geom_tile(data = desired.format, aes(x = Lon, y = Lat, alpha = Count), fill="red")
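Answer 0 (score: 3)
Here is a stab at it using geom_hex and stat_density2d. The idea of making bins by truncating coordinates makes me a bit uneasy.
What you have is count data at given lat/longs, which means that ideally you would need a weight parameter, but as far as I know that is not implemented for geom_hex. Instead, we hack it by repeating each row according to its Count variable, similar to the approach here.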
library(ggplot2)  # geom_hex() also needs the hexbin package installed
## hack job to repeat records to full count
m <- as.data.frame(m)
m_long <- with(m, m[rep(1:nrow(m), Count), ])
## stat_density2d
ggplot(m_long, aes(Lat, Lon)) +
  ## geom takes a single name; the contour lines come from geom_density2d below
  stat_density2d(aes(alpha = ..level.., fill = ..level..), size = 2,
                 bins = 10, geom = "polygon") +
  scale_fill_gradient(low = "blue", high = "red") +
  geom_density2d(colour = "black", bins = 10) +
  geom_point(data = m_long)
## geom_hex alternative
bins <- 6
ggplot(m_long, aes(Lat, Lon)) +
  geom_hex(bins = bins) +
  coord_equal(ratio = 1/1) +
  scale_fill_gradient(low = "blue", high = "red") +
  geom_point(data = m_long, position = "jitter") +
  ## label each hexagon with its count (the fixed size overrides any mapped size)
  stat_binhex(aes(label = ..count..), size = 3.5, geom = "text",
              bins = bins, colour = "white")
These produce, respectively, the density plot and the binned version. [Figures: stat_density2d plot; geom_hex plot]
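As a quick sanity check on the row-repetition hack, the sketch below rebuilds the sample data from the question and confirms that the expanded frame has one row per counted customer. It uses only base R; building the data with `data.frame()` and indexing with `seq_len()` are my substitutions for illustration, not part of the original answer.

```r
# Sample data from the question, built directly as a data frame
m <- data.frame(
  Lat = c(44.9591051, 44.984884, 44.984884, 44.9811399,
          44.9969096, 44.990894, 44.9797023, 44.983334),
  Lon = c(-93.3120017, -93.297668, -93.297668, -93.2993524,
          -93.2924484, -93.282462, -93.2738911, -93.26667),
  Count = c(69, 147, 137, 22, 68, 198, 35, 138)
)

# Repeat each row Count times so every customer becomes one observation
m_long <- m[rep(seq_len(nrow(m)), times = m$Count), ]

# The expanded frame should have exactly sum(Count) rows
stopifnot(nrow(m_long) == sum(m$Count))
nrow(m_long)  # 814
```

Note that this trick only makes sense for non-negative whole-number counts; fractional weights would need a different approach.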
Edit:
With a base map:
## assumption: map is the ggmap object built from the question's base map
map <- ggmap(minneapolis)
map +
  stat_density2d(data = m_long,
                 aes(x = Lon, y = Lat, alpha = ..level.., fill = ..level..),
                 size = 2,
                 bins = 10,
                 geom = "polygon",
                 inherit.aes = FALSE) +
  scale_fill_gradient(low = "blue", high = "red") +
  geom_density2d(data = m_long, aes(x = Lon, y = Lat),
                 colour = "black", bins = 10, inherit.aes = FALSE) +
  geom_point(data = m_long, aes(x = Lon, y = Lat), inherit.aes = FALSE)
## and the hexbin map...
map +
  geom_hex(bins = bins, data = m_long, aes(x = Lon, y = Lat), alpha = .5,
           inherit.aes = FALSE) +
  geom_point(data = m_long, aes(x = Lon, y = Lat),
             inherit.aes = FALSE, position = "jitter") +
  scale_fill_gradient(low = "blue", high = "red")
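If the zero-filled tile grid described in the question is still wanted, one possible way to build it in base R is sketched below. This is not part of the answer above: the column names `LatBin` and `LonBin` and the frame name `full` are made up for illustration, and the sketch assumes that rounding to hundredths of a degree is the intended binning. It bins on integer indices so the merge does not have to match floating-point values.

```r
# Sample data from the question
m <- data.frame(
  Lat = c(44.9591051, 44.984884, 44.984884, 44.9811399,
          44.9969096, 44.990894, 44.9797023, 44.983334),
  Lon = c(-93.3120017, -93.297668, -93.297668, -93.2993524,
          -93.2924484, -93.282462, -93.2738911, -93.26667),
  Count = c(69, 147, 137, 22, 68, 198, 35, 138)
)

# Bin to 0.01 degrees using integer indices (hundredths of a degree)
m$LatBin <- as.integer(round(m$Lat * 100))
m$LonBin <- as.integer(round(m$Lon * 100))

# Sum the counts in each occupied cell
agg <- aggregate(Count ~ LatBin + LonBin, data = m, FUN = sum)

# Enumerate every cell in the bounding box, occupied or not
grid <- expand.grid(
  LatBin = seq(min(m$LatBin), max(m$LatBin)),
  LonBin = seq(min(m$LonBin), max(m$LonBin))
)

# Left join the sums onto the full grid and fill empty cells with zero
full <- merge(grid, agg, by = c("LatBin", "LonBin"), all.x = TRUE)
full$Count[is.na(full$Count)] <- 0

# Convert the bin indices back to degrees for plotting
full$Lat <- full$LatBin / 100
full$Lon <- full$LonBin / 100
full <- full[order(full$Lat, full$Lon), c("Lat", "Lon", "Count")]
```

The resulting `full` frame has the same three columns as `desired.format` in the question, so it could be passed to `geom_tile()` in its place.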