Creating a stacked bar chart from frequency tables

Time: 2017-03-30 13:04:08

Tags: r ggplot2 merge melt

I am working with two frequency tables named identified_modification_table and unidentified_modifications_table.

The tables are structured like this:

identified_modification_table

Modifications   | Frequency
MOD:42123       | 12
MOD:1234        | 7
MOD:7618        | 36
MOD:411232      | 51

unidentified_modifications_table

Modifications   | Frequency
MOD:42123       | 12  
MOD:12          | 20
MOD:7618        | 36
MOD:411232      | 51

I want to merge these tables into the output below, so that I can create a stacked bar chart like the one in this example.

Modifications   | Frequency.1 | Frequency.2 
MOD:42123       | 12          | 12
MOD:1234        | 7           | NA
MOD:12          | NA          | 20
MOD:7618        | 36          | 36
MOD:411232      | 51          | 51


I tried to merge the tables with this code, filling in NA wherever a value is missing:

df_final <- cbind.data.frame(df1, df2[match(df1$modifications, df2$modifications), ]);

But this does not work properly, and I don't know why.
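A likely reason, sketched here on the two tables shown above (rebuilt as `df1` and `df2`, with a lowercase `modifications` column to match the attempt): `match()` returns one position in `df2` for each row of `df1`, so the combined result can never have more rows than `df1`, and a modification that occurs only in `df2` (MOD:12 here) is silently dropped. A full outer join with base R's `merge()` keeps both sides. The `.1`/`.2` suffixes are chosen only to mirror the desired output.

```r
# Small tables mirroring the ones shown in the question
df1 <- data.frame(modifications = c("MOD:42123", "MOD:1234", "MOD:7618", "MOD:411232"),
                  Frequency = c(12, 7, 36, 51))
df2 <- data.frame(modifications = c("MOD:42123", "MOD:12", "MOD:7618", "MOD:411232"),
                  Frequency = c(12, 20, 36, 51))

# The original attempt: one looked-up df2 row per df1 row, nothing more
matched <- cbind.data.frame(df1, df2[match(df1$modifications, df2$modifications), ])
nrow(matched)                        # same as nrow(df1)
"MOD:12" %in% matched$modifications  # the df2-only key is missing

# Full outer join: every key from either table, NA where a side has no value
merged <- merge(df1, df2, by = "modifications", all = TRUE,
                suffixes = c(".1", ".2"))
merged
```

Note that `merge()` sorts the result by the key column, so the row order differs from the hand-written table.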

After that, I think I should just use melt and ggplot2 to draw the stacked bars:

df_barplot <- melt(df,measure.vars = names(df))

ggplot((df_barplot), aes(x = value, fill = variable)) + 
    geom_bar(stat = "count", position = "dodge") + 
    theme(axis.text.x = element_text(angle = 20, hjust = 0.5, vjust = -0.1)) + 
    guides(fill=FALSE)+
    labs("Barplot") + 
    xlab("Values")+
    ylab("Frequency")+
    theme(text = element_text(size=18), axis.text.x = element_text(angle = 90, hjust = 1, size = 15), axis.text.y=element_text(size = 15))

Does anyone know how I can achieve this?

Here is a reproducible example:

df1 <- data.frame(modifications = c("MOD:214", "MOD:3", "MOD:24", "MOD:44", "MOD:123", "MOD:123", "MOD:212"),
                  Frequency = c(1, 41, 616, 727, 828, 8993, 383))

df2 <- data.frame(modifications = c("MOD:214", "MOD:3", "MOD:24", "MOD:445", "MOD:12", "MOD:123", "MOD:212"),
                  Frequency = c(1, 43, 64, 77, 88, 893, 38))

Thanks

2 answers:

Answer 0: (score: 2)

Here is the tidyverse way:

library(tidyverse)
merged_df <- full_join(df1, df2, by = "modifications")
merged_df <- gather(merged_df, key = Category, value = Frequency, -modifications)

The plot:

ggplot(merged_df, aes(x = modifications, y = Frequency, fill = Category)) + 
geom_col(position = "dodge")
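The plot above places the two categories side by side. If stacked bars are wanted, as the question title suggests, a minimal variation is sketched below. It rests on several assumptions: it uses the two tables from the question rather than the reproducible example (which repeats MOD:123 in `df1` and would be double-counted when stacked), the suffixes `.identified` and `.unidentified` are illustrative names, and `pivot_longer()` (available in tidyr 1.0.0 and later) stands in for `gather()`.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Tables as shown in the question (no repeated keys)
df1 <- data.frame(modifications = c("MOD:42123", "MOD:1234", "MOD:7618", "MOD:411232"),
                  Frequency = c(12, 7, 36, 51))
df2 <- data.frame(modifications = c("MOD:42123", "MOD:12", "MOD:7618", "MOD:411232"),
                  Frequency = c(12, 20, 36, 51))

# Full join keeps modifications from both tables; the suffixes are illustrative
merged_df <- full_join(df1, df2, by = "modifications",
                       suffix = c(".identified", ".unidentified"))

# Long format: one row per modification and category
long_df <- pivot_longer(merged_df, cols = -modifications,
                        names_to = "Category", values_to = "Frequency")

# position = "stack" piles the two categories on top of each other;
# na.rm = TRUE drops the NA frequencies without a warning
p <- ggplot(long_df, aes(x = modifications, y = Frequency, fill = Category)) +
  geom_col(position = "stack", na.rm = TRUE)
print(p)
```

Stacking is only meaningful when each modification has at most one row per category, which is why duplicated keys should be resolved before plotting.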


Answer 1: (score: 2)

I think this does what you want:

df3<-merge(df1,df2, by = "modifications",all = T)

library(reshape2)
df3<- melt(df3)
df3$variable<-factor(df3$variable,labels = c("modifications1","modifications2"))

library(ggplot2)
ggplot(df3, aes(x = modifications, y = value, fill = variable)) + 
  geom_bar(stat = "identity",position = "dodge")

Edit: added all = T to keep every modification that appears in either table.
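One thing to watch with the example data: `df1` lists MOD:123 twice, so the merged table contains two MOD:123 rows, each paired with the single `df2` value. The sketch below collapses repeated keys before merging. Summing the duplicates is an assumption, since the question does not say how they should be treated, and `df1_sum`, `df2_sum`, and `raw` are names introduced here for illustration.

```r
df1 <- data.frame(modifications = c("MOD:214", "MOD:3", "MOD:24", "MOD:44", "MOD:123", "MOD:123", "MOD:212"),
                  Frequency = c(1, 41, 616, 727, 828, 8993, 383))
df2 <- data.frame(modifications = c("MOD:214", "MOD:3", "MOD:24", "MOD:445", "MOD:12", "MOD:123", "MOD:212"),
                  Frequency = c(1, 43, 64, 77, 88, 893, 38))

# Merging as-is: the repeated key yields two MOD:123 rows
raw <- merge(df1, df2, by = "modifications", all = TRUE)
sum(raw$modifications == "MOD:123")

# Assumption: repeated keys are summed into one frequency per modification
df1_sum <- aggregate(Frequency ~ modifications, data = df1, FUN = sum)
df2_sum <- aggregate(Frequency ~ modifications, data = df2, FUN = sum)

# Full outer join on the de-duplicated tables: one row per modification
df3 <- merge(df1_sum, df2_sum, by = "modifications", all = TRUE)
df3
```

From here, `df3` can be melted and plotted exactly as above.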
