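I am trying to plot a data frame with two variables in descending order. Both variables are factors. I want the plot to take the frequency of both variables into account, much like a pivot table in Excel.
I tried the tidyverse approach of grouping, counting, and sorting the variables in descending order.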
library(tidyverse)
# Create a data frame that simulates the data that needs to be modeled
#Create data frame that will hold data for simulation
df1 = as.data.frame(replicate(2,
                              sample(c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J"),
                                     50,
                                     replace = TRUE)))
#Replace V2 column with System Nomenclature (Simulated)
df1$V2 <- sample(1:4, replace = TRUE, nrow(df1))
#Make V2 into a Factor
df1$V2 = as.factor(df1$V2)
#Create frequency table
df2 <- df1 %>%
  group_by(V1, V2) %>%
  summarise(counts = n()) %>%
  ungroup() %>%
  arrange(desc(counts))
#Plot the 2 variable data
ggplot(df2,
       aes(reorder(x = V1, -counts),
           y = counts,
           fill = V2)) +
  geom_bar(stat = "identity")
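I want the plot to order the bars by the frequency of V1, in descending order, with each bar filled by V2, just like the pivot table feature in Excel. I also want to show only the top 5 by V1 frequency, filled by V2.
Answer 0 (score: 2)
You can use fct_reorder and fct_rev to get the ordering you want.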
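To make the behaviour of these two functions concrete, here is a minimal sketch on a small hand-made data frame. The name `toy` and its values are hypothetical and only serve as an illustration. It assumes the forcats package is installed and relies on the `.fun` argument of `fct_reorder` to rank each level by its summed count.

```r
library(forcats)

# Hypothetical toy data: one row per (V1, V2) combination with its count
toy <- data.frame(
  V1     = c("A", "A", "B", "C", "C"),
  V2     = factor(c(1, 2, 1, 1, 2)),
  counts = c(1, 2, 5, 3, 4)
)

# Order the levels of V1 by the total count per level (ascending).
# Totals here are A = 3, B = 5, C = 7.
ascending <- fct_reorder(toy$V1, toy$counts, .fun = sum)
levels(ascending)   # "A" "B" "C"

# Reverse the level order so the largest total comes first
descending <- fct_rev(ascending)
levels(descending)  # "C" "B" "A"
```

With the default summary function (the median) the same data would be ordered A, C, B, which is why the summary function matters when a level appears in several rows.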
library(tidyverse)
#Create data frame that will hold data for simulation
df1 = as.data.frame(replicate(2, sample(c("A", "B", "C", "D", "E","F","G","H","I","J"), 50, rep=TRUE)))
#Replace V2 column with System Nomenclature (Simulated)
df1$V2 <- sample(1:4, replace = TRUE, nrow(df1))
#Make V2 into a Factor
df1$V2 = as.factor(df1$V2)
#Create frequency table
df2 <- df1 %>%
  group_by(V1, V2) %>%
  summarise(counts = n()) %>%
  ungroup() %>%
  arrange(desc(counts))
#Plot the 2 variable data.
##fct_reorder orders the V1 levels by their total count (.fun = sum), and fct_rev reverses that order so it is descending
ggplot(df2, aes(x = fct_rev(fct_reorder(V1, counts, .fun = sum)), y = counts, fill = V2)) +
  geom_bar(stat = "identity")
##Keep only the V1 levels whose total count exceeds 5 (a threshold on the total, not a strict top 5)
df2 %>%
  group_by(V1) %>%
  filter(sum(counts) > 5) %>%
  ggplot(aes(x = fct_rev(fct_reorder(V1, counts, .fun = sum)),
             y = counts, fill = V2)) +
  geom_bar(stat = "identity")
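Note that `filter(sum(counts) > 5)` keeps every V1 level whose total exceeds 5, which may be more or fewer than five levels. If a strict top five is wanted, one possible sketch is to compute the total per level first and keep the five largest. This assumes dplyr 1.0.0 or later for `slice_max`. The fixed seed, the `top5` and `p` names, and the rebuilt `df1`/`df2` are illustrative choices so that the block runs on its own.

```r
library(dplyr)
library(forcats)
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Rebuild a frequency table shaped like df2 (columns V1, V2, counts)
df1 <- data.frame(
  V1 = sample(LETTERS[1:10], 50, replace = TRUE),
  V2 = factor(sample(1:4, 50, replace = TRUE))
)
df2 <- count(df1, V1, V2, name = "counts")

# Total count per V1 level, keeping exactly the five largest totals
top5 <- df2 %>%
  group_by(V1) %>%
  summarise(total = sum(counts)) %>%
  slice_max(total, n = 5, with_ties = FALSE)

# Keep only the rows for those levels, then plot in descending order of total
p <- df2 %>%
  filter(V1 %in% top5$V1) %>%
  ggplot(aes(x = fct_rev(fct_reorder(V1, counts, .fun = sum)),
             y = counts, fill = V2)) +
  geom_bar(stat = "identity") +
  labs(x = "V1")

print(p)
```

Setting `with_ties = FALSE` forces exactly five levels even when totals are tied. Leaving it at its default of `TRUE` would keep all tied levels instead, which may be preferable if dropping one of several equal levels arbitrarily is a concern.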