Split a data frame and create a multi-panel scatter plot from the list of data frames

Time: 2018-05-31 02:30:27

Tags: r plot

I have a data frame like this:

library(dplyr)  # provides %>%, group_by(), mutate() and dense_rank()

set.seed(453)

year = as.factor(c(rep("1998", 20), rep("1999", 16)))
lepsp = c(letters[1:20], c('a', 'b', 'c'), letters[8:20])
freq = c(sample(1:15, 20, replace = TRUE), sample(1:18, 16, replace = TRUE))
df <- data.frame(year, lepsp, freq)

# rank the species within each year; rank 1 is the highest freq
df <-
  df %>%
  group_by(year) %>%
  mutate(rank = dense_rank(-freq))

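Within each year, the frequency (freq) of each lepsp is ranked in the rank column. Larger rank values correspond to smaller freq values, and smaller rank values correspond to larger freq values. Some ranks are repeated when levels of lepsp have the same abundance.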
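To illustrate how the ranking behaves when there are ties, here is a minimal sketch. The vector values are made up for illustration and are not taken from the data above:

```r
library(dplyr)

# Made-up frequencies: two species tie for the highest abundance
freq <- c(12, 7, 12, 3)

# Negating freq makes the largest value receive rank 1;
# dense_rank() gives tied values the same rank and leaves no gaps
ranks <- dense_rank(-freq)
ranks
```

The two tied values share rank 1, and the next value gets rank 2 rather than 3.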

I would like to split df into multiple subsets by year, and then plot each subsetted data frame in a multi-panel figure. Essentially, this creates species abundance curves. The x-axis would be rank and the y-axis needs to be freq.

In my real data frame I have 22 years of data. I would like the graphs displayed in 2 columns and 4 rows, for a total of 8 graphs per page. Basically, I would have to repeat the solution offered here 3 times.

I also need to mark the 25%, 50% and 75% quartiles with vertical lines, so that it looks like this (desired result): [image]

It would be great if each graph indicated the year it belongs to, but since all the axes have the same names, I do not want the x and y labels repeated for every graph.

I have tried plotting multiple lines on the same graph, but it gets messy:

year.vec <- unique(df$year)
plot(sort(df$freq[df$year == year.vec[1]], decreasing = TRUE), bg = 1, type = "b",
     ylab = "Abundance", xlab = "Rank", pch = 21, ylim = c(0, max(df$freq)))
for (i in 2:22) {  # 22 years in my real data
    points(sort(df$freq[df$year == year.vec[i]], decreasing = TRUE), bg = i, type = "b", pch = 21)
}
legend("topright", legend = year.vec, pt.bg = 1:22, pch = 21)

I have also tried a loop, but it produces no output and is missing some of the parameters I would like to include:

jpeg('pract.jpg')
par(mfrow = c(6, 4))  # 4 rows and 2 columns
for (i in unique(levels(year))) {
    plot(df$rank,df$freq, type="p", main = i)
}
dev.off()

Update (attempted result): [image]

After posting, I found the following code, which gets me closer but still lacks all of the features I want:

library(reshape2)
library(ggplot2)
library(ggthemes)
x <- ggplot(data = df2, aes(x = rank, y = rabun)) +
  geom_point(aes(fill = "dodgerblue4")) +
  theme_few() +
  ylab("Abundance") + xlab("Rank") +
  theme(axis.title.x = element_text(size = 15),
    axis.title.y = element_text(size = 15),
    axis.text.x = element_text(size = 15),
    axis.text.y = element_text(size = 15),
    plot.title = element_blank(),            # we don't want individual plot titles as the facet "strip" will give us this
    legend.position = "none",                # we don't want a legend either
    # here, I just alter the colour and thickness of the plot outline and tick marks.
    # You generally have to do this when faceting, as well as alter the text sizes
    # (= element_text() in theme also)
    panel.border = element_rect(fill = NA, color = "darkgrey", size = 1.25, linetype = "solid"),
    axis.ticks = element_line(colour = 'darkgrey', size = 1.25, linetype = 'solid'))
x
x <- x + facet_wrap( ~ year, ncol = 4)
x

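I prefer base R for modifying the graph features, and I have been unable to find a base R method that meets all of my criteria above. Any help is appreciated.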

1 answer:

Answer 0 (score: 1)

Here is a ggplot approach. First, I made up some more data to get a 3x2 layout:

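df = ungroup(df) %>% mutate(year = as.numeric(as.character(year)))  # year is a factor, so make it numeric before adding to it
df = bind_rows(df, mutate(df, year = year + 4), mutate(df, year = year + 8))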

Then we do some manipulation to generate the quantiles and labels by group:

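library(dplyr)
library(tidyr)  # for gather()

# one row per year, with the 0%, 25%, 50% and 75% quantiles of rank as columns
df_summ =
    df %>% group_by(year) %>%
    do(as.data.frame(t(quantile(.$rank, probs = c(0, 0.25, 0.5, 0.75)))))
names(df_summ)[2:5] = paste0("q", 0:3)

# long format: one row per year and quantile, joined to its label
df_summ_long = gather(df_summ, key = "q", value = "value", -year) %>%
    inner_join(data.frame(q = paste0("q", 0:3), lab = c("Common", "Rare-75% -->", "Rare-50% -->", "Rare-25% -->"), stringsAsFactors = FALSE), by = "q")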

With the data in good shape, the plotting is quite simple:

library(ggthemes)
library(ggplot2)
ggplot(df, aes(x = rank, y = freq)) +
    geom_point() +
    theme_few() +
    labs(y = "Abundance (% of total)", x = "Rank") +
    geom_vline(data = df_summ_long[df_summ_long$q != "q0", ], aes(xintercept = value), linetype = 4, size = 0.2) + 
    geom_text(data = df_summ_long, aes(x = value, y = Inf, label = lab), size = 3, vjust = 1.2, hjust = 0) +
    facet_wrap(~ year, ncol = 2) 
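Since the question asked for base R with 8 panels per page, the same idea can be sketched with base graphics. This is only a rough sketch under stated assumptions: the 22-year data frame below is made up as a stand-in for the real data, the `dense()` helper and the output file name are hypothetical, and the margins would likely need tuning. It relies on `par(mfrow = c(4, 2))` on a `pdf()` device, which starts a new page after every 8 panels.

```r
set.seed(453)

# Hypothetical stand-in for the real data: 22 years, 20 species each
years <- 1998:2019
df <- data.frame(
  year  = rep(years, each = 20),
  lepsp = rep(letters[1:20], times = length(years)),
  freq  = sample(1:15, 20 * length(years), replace = TRUE)
)

# Base R equivalent of dense_rank(-freq): rank 1 = highest freq, ties share a rank
dense <- function(x) match(-x, sort(unique(-x)))
df$rank <- ave(df$freq, df$year, FUN = dense)

# Split into a list of data frames, one per year
by_year <- split(df, df$year)

out_file <- file.path(tempdir(), "abundance_curves.pdf")  # hypothetical output path
pdf(out_file, width = 7, height = 10)

# 4 rows x 2 columns; small inner margins, outer margins for the shared labels
par(mfrow = c(4, 2), mar = c(2, 2, 2, 1), oma = c(4, 4, 1, 1))

for (i in seq_along(by_year)) {
  d <- by_year[[i]]
  plot(d$rank, d$freq, pch = 21, bg = "grey40",
       main = names(by_year)[i],          # the year as the panel title
       xlab = "", ylab = "",              # no per-panel axis labels
       ylim = c(0, max(df$freq)))
  # Vertical lines at the 25%, 50% and 75% quantiles of rank
  abline(v = quantile(d$rank, probs = c(0.25, 0.5, 0.75)), lty = 4)
  # On the first panel of each page, add the shared axis labels once
  if (i %% 8 == 1) {
    mtext("Rank", side = 1, outer = TRUE, line = 2)
    mtext("Abundance", side = 2, outer = TRUE, line = 2)
  }
}
invisible(dev.off())
```

With 22 years this yields three pages (8, 8 and 6 panels), so nothing needs to be repeated by hand.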


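There is still some work to do, mainly the overlapping "Rare" labels. That may not be as much of a problem with your real data, but if it is, you could pull the maximum y values into df_summ_long and stagger the labels a little, using actual y coordinates instead of just Inf to put them at the top as I did.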
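The staggering idea could look roughly like the sketch below. It is an assumption-laden illustration, not the code behind the plot above: the data are made up, the summary table is rebuilt with `summarise()` and `pivot_longer()` rather than `do()` and `gather()`, and the `step` column and the 8% offset are arbitrary choices.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(453)

# Made-up stand-in data: 6 years, 20 species each, ranked within year
df <- data.frame(year = rep(c(1998, 1999, 2002, 2003, 2006, 2007), each = 20),
                 freq = sample(1:15, 120, replace = TRUE)) %>%
  group_by(year) %>%
  mutate(rank = dense_rank(-freq)) %>%
  ungroup()

# Labels as in the answer, plus a hypothetical 'step' that controls the drop
labs_df <- data.frame(q = paste0("q", 0:3),
                      lab = c("Common", "Rare-75% -->", "Rare-50% -->", "Rare-25% -->"),
                      step = 0:3,
                      stringsAsFactors = FALSE)

df_summ_long <- df %>%
  group_by(year) %>%
  summarise(q0 = quantile(rank, 0,    names = FALSE),
            q1 = quantile(rank, 0.25, names = FALSE),
            q2 = quantile(rank, 0.5,  names = FALSE),
            q3 = quantile(rank, 0.75, names = FALSE),
            ymax = max(freq),            # maximum y value per panel
            .groups = "drop") %>%
  pivot_longer(q0:q3, names_to = "q", values_to = "value") %>%
  inner_join(labs_df, by = "q") %>%
  mutate(y = ymax * (1 - 0.08 * step))   # each label sits 8% lower than the previous one

p <- ggplot(df, aes(x = rank, y = freq)) +
  geom_point() +
  geom_vline(data = df_summ_long[df_summ_long$q != "q0", ],
             aes(xintercept = value), linetype = 4) +
  # real y coordinates instead of Inf, so overlapping labels are offset vertically
  geom_text(data = df_summ_long, aes(x = value, y = y, label = lab),
            size = 3, vjust = 1, hjust = 0) +
  facet_wrap(~ year, ncol = 2)
p
```

Whether 8% is enough separation depends on the panel height and text size, so that factor would need adjusting by eye.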