我有一个区域内个人的大型数据集。但我想通过用预定义的网格分隔我的场或空间来改变采样设计。这是数据集:
set.seed(1456)
n = 100
x=rnorm(n)
x
y = 1:n
df =data.frame(x = x, y = y, sp = sample(letters[1:5], size = 100,replace = T),stringsAsFactors = TRUE)
plot(y = df$x, x = y, pch =21,
bg = df$sp,
col = df$sp,
cex = .4)
这将创建我正在学习的区域的网格
xytransect <- expand.grid(seq(0, n, 5), seq(min(x), max(x), .6))
这是为了显示&#34;节点&#34;网格
points(xytransect, cex= 0.3, pch = 21,
bg = "pink",
col = "pink")
这只是显示该区域的实际网格。
abline(v = seq(0, n, 5), h = seq(min(x), max(x), .6))
这个想法是将物种分组,看看有多少物种存在于网格的正方形内。
我能够在整个区域根据他们的名字对物种(这里是字母)进行分组。但是如何将它们分组到我创建的网格上?
library(dplyr)
df %>%
group_by(sp) %>%
summarise(n())
是否有可能获得每个方块的中心并根据其内部的物种(字母)数量为方块着色?
答案 0 :(得分:1)
我编辑了响应以使用与Q中相同的bin定义。
ibins <- seq(0, nrow(df)+5, 5)
jbins <- seq(min(df$x)-0.6, max(df$x)+0.6, .6)
xytransect <- expand.grid(seq(0, n, 5), seq(min(x), max(x), .6))
out <- df %>%
mutate(i = min(ibins) + 5*(cut(row_number(), breaks= ibins,labels=FALSE)-1),
j = min(jbins) + 0.6*(cut(x,breaks=jbins,labels=FALSE)-1)) %>%
group_by(i,j) %>%
summarise(count=n()) %>%
ungroup() %>%
mutate(i_center = i+2.5,
j_center = j+0.3)
plot(out$i_center, out$j_center, cex = out$count/max(out$count), pch = 21, col ="orange", bg = "orange")
abline(v = seq(0, n, 5), h = seq(min(x), max(x), .6))
答案 1 :(得分:1)
这是一个答案,使用ggplot2的geom_tile()
绘制每个图块中找到的唯一种类数量填充的图块,这是OP请求的,并且与每个图块的个体数量不同。
library(dplyr)
library(ggplot2)
# Add some excess to the limits to ensure that all points are captured,
# even those on the edges.
xcoords <- seq(min(x)-1, max(x)+1, .6)
ycoords <- seq(-5, n+5, 5)
# Determine cell index and its coordinates for each individual.
df <- df %>%
mutate(x_cell_index = sapply(x, function(z) which(z < xcoords)[1]),
x_cell_min = xcoords[x_cell_index - 1],
x_cell_max = xcoords[x_cell_index],
y_cell_index = sapply(y, function(z) which(z < ycoords)[1]),
y_cell_min = ycoords[y_cell_index - 1],
y_cell_max = ycoords[y_cell_index])
# Summarize the number of unique species found in each cell.
df_cellcounts <- df %>%
group_by(x_cell_min, x_cell_max, y_cell_min, y_cell_max) %>%
summarize(n_spp = length(unique(sp)))
# Plot it.
ggplot(df_cellcounts, aes(x = (x_cell_min+x_cell_max)/2, y = (y_cell_min+y_cell_max)/2, fill = factor(n_spp))) +
geom_tile()