Showing counts and percentages together on a stacked bar chart in R

Time: 2015-03-06 16:20:50

Tags: r graph ggplot2 stat

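I am trying to create a stacked bar chart that shows both counts and percentages in the same plot. I started from "Showing data values on stacked bar chart in ggplot2", added the group totals, and got the labelled plot shown in this image.

The code I used: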

### Plot a stacked bar chart with the group total on top
### and the frequency of each category inside the bars

library(ggplot2)
library(plyr)   # loaded before dplyr so that dplyr's summarise() is the one used
library(dplyr)

Year      <- rep(c("2006-07", "2007-08", "2008-09", "2009-10"), each = 4)
Category  <- rep(c("A", "B", "C", "D"), times = 4)
Frequency <- c(168, 259, 226, 340, 216, 431, 319, 368, 423, 645, 234, 685, 166, 467, 274, 251)
Data      <- data.frame(Year, Category, Frequency)


# total per year, used as the label above each bar
sum_count <-
  Data %>%
  group_by(Year) %>%
  summarise(max_pos = sum(Frequency))

sum_count


# vertical midpoint of each segment within its year
Data <- ddply(Data, .(Year), transform,
              pos = cumsum(Frequency) - (0.5 * Frequency))

Data



# plot bars and add text
p <- ggplot(Data, aes(x = Year, y = Frequency)) +
     geom_bar(aes(fill = Category), stat = "identity") +
     geom_text(aes(label = Frequency, y = pos), size = 3) +
     geom_text(data = sum_count,
               aes(y = max_pos, label = max_pos), size = 4,
               vjust = -0.5)

print(p)
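As a quick check of what the pos column holds: within each year it is the cumulative sum of Frequency minus half of the current value, which is the vertical midpoint of each segment if the segments are stacked in the order A, B, C, D from the bottom. Here is a minimal base R sketch for the 2006-07 values only (the names freq and top are just for illustration and are not part of the code above):

```r
# Frequencies for 2006-07, taken from the data above
freq <- c(A = 168, B = 259, C = 226, D = 340)

# Top edge of each segment when stacked from the bottom in this order
top <- cumsum(freq)

# Midpoint of each segment: top edge minus half the segment height
pos <- top - 0.5 * freq

print(pos)
```

The last top edge is the year total, which is the value the plot writes above the bar.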

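Now I want to overlay each group's percentage on top of the counts. This is my approach: merge the data with the year totals so that the percentage of each group can be calculated.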

    # attach each year's total (max_pos) to every row
    MergeData <- merge(Data, sum_count, by = "Year")

    # each category's share of its year total, as a rounded number
    # (Frequency / max_pos; dividing pos by max_pos would give the wrong share)
    MergeData <- transform(MergeData,
                           per_cent = round((Frequency / max_pos) * 100, 0))
    MergeData <- ddply(MergeData, .(Year), transform,
                       per_pos = cumsum(per_cent) - (0.5 * per_cent))

    # turn the percentage into a label with a % sign attached
    MergeData <- transform(MergeData,
                           per_cent = paste0(per_cent, "%"))

    # data with percentages only
    Percent_Data <- subset(MergeData,
                           select = c("Year", "Category", "per_cent", "per_pos"))
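For comparison, the same percentages can be computed without merge() or ddply(). This is only a sketch: it assumes the percentage wanted is each category's share of its year total (Frequency divided by the year total), and it uses base R's ave() to repeat the group total on every row. The column name per_label is made up for this example.

```r
# Rebuild the example data from the question
Year      <- rep(c("2006-07", "2007-08", "2008-09", "2009-10"), each = 4)
Category  <- rep(c("A", "B", "C", "D"), times = 4)
Frequency <- c(168, 259, 226, 340, 216, 431, 319, 368,
               423, 645, 234, 685, 166, 467, 274, 251)
Data      <- data.frame(Year, Category, Frequency)

# Year total, repeated on every row of that year
Data$max_pos <- ave(Data$Frequency, Data$Year, FUN = sum)

# Share of the year total, rounded to a whole percent
Data$per_cent <- round(100 * Data$Frequency / Data$max_pos)

# Text label with the % sign attached (hypothetical column name)
Data$per_label <- paste0(Data$per_cent, "%")

print(head(Data, 4))
```

Keeping the numeric per_cent separate from the text label means the number stays available for further calculations.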

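I would like to know whether the percentage data can be overlaid on the plot I created with the earlier code, so that the counts and the percentages are shown together.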

3 Answers:

Answer 0: (score: 0)

I think you are almost there. Use MergeData as the source data frame and add one more call to geom_text:

p <- ggplot(MergeData, aes(x = Year, y = Frequency, group = Category)) +
 geom_bar(aes(fill = Category), stat="identity") +
 geom_text(aes(label=Frequency,y = pos), size = 3, vjust = 1) +  
 geom_text(
        aes(y = max_pos, label = max_pos), size = 4,
        vjust = -.5) + 
 geom_text(aes(x = Year, y = pos, label = per_cent), vjust = -1, size = 4)

  print(p);

You may need to fiddle with hjust and vjust to get the text placed the way you like.

Answer 1: (score: 0)

Thanks for your reply. I think this works well.

p <- ggplot(MergeData, aes(x = Year, y = Frequency, group = Category)) +
     geom_bar(aes(fill = Category), stat="identity") +
     geom_text(aes(label=Frequency,y = pos),  vjust = 1,size = 2,hjust = 0.5) +  
     geom_text(aes(y = max_pos, label = max_pos), size = 3,vjust = -.1) + 
     geom_text(aes(x = Year, y = pos, label = per_cent), vjust = -.4, size = 2)+
     xlab("Year") + ylab(" Number of People") +            # Set axis labels
     ggtitle("Distribution by Category over Year") +  # Set title
     theme(panel.background = 
     element_rect(fill = 'white', colour = 'white'),
     legend.position = "bottom" ,
     legend.title = element_text(color="black",
     size=7),
     legend.key.width = unit(1,"inch") );

 print(p);

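Now my percentage sits above the count. In other words, the label reads "17%" and then "168", but I want "168" and then "17%". I tried switching the order of the geom_text() calls, but it did not work. I wonder if you know how to fix it.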
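One possible way to control the order, sketched here under some assumptions, is to put the count and the percentage into a single label separated by a newline, so the count is always on the first line and no vjust tuning between two text layers is needed. The sketch assumes ggplot2 2.2.0 or later, where geom_col() and position_stack(vjust = 0.5) are available; the thread does not say which version is in use. In those versions the default stacking order also differs from older releases, so a hand-computed pos column may not line up with the segments, whereas position_stack() follows whatever order the bars use. The names Label and Totals are made up for this example.

```r
library(ggplot2)

# Rebuild the example data from the question
Year      <- rep(c("2006-07", "2007-08", "2008-09", "2009-10"), each = 4)
Category  <- rep(c("A", "B", "C", "D"), times = 4)
Frequency <- c(168, 259, 226, 340, 216, 431, 319, 368,
               423, 645, 234, 685, 166, 467, 274, 251)
Data      <- data.frame(Year, Category, Frequency)

# Share of each year's total, then one two-line label: count first, percent second
year_total <- ave(Data$Frequency, Data$Year, FUN = sum)
Data$Label <- paste0(Data$Frequency, "\n",
                     round(100 * Data$Frequency / year_total), "%")

# Year totals for the labels above the bars (column is still called Frequency)
Totals <- aggregate(Frequency ~ Year, data = Data, FUN = sum)

p <- ggplot(Data, aes(x = Year, y = Frequency, fill = Category)) +
  geom_col() +
  # position_stack(vjust = 0.5) centres each label inside its own segment
  geom_text(aes(label = Label),
            position = position_stack(vjust = 0.5),
            size = 2, lineheight = 0.9) +
  # totals come from a separate data frame, so do not inherit the fill mapping
  geom_text(data = Totals,
            aes(x = Year, y = Frequency, label = Frequency),
            inherit.aes = FALSE, vjust = -0.5, size = 3)

print(p)
```

Because both values live in one label, swapping their order only means swapping the two parts of the paste0() call.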

Answer 2: (score: 0)

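Yes, that helped. I adjusted the labels so that they sit at the center of each stack segment, which meant modifying my code as shown below. Thank you very much for your help.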

p <- ggplot(MergeData, aes(x = Year, y = Frequency, group = Category)) +
     geom_bar(aes(fill = Category), stat="identity") +
     geom_text(aes(label=Frequency,y = pos),  vjust = 1,
     size = 2,hjust = 0.5) +  
     geom_text(aes(y = max_pos, label = max_pos), size = 3,vjust = -.1) + 
     geom_text(aes(x = Year, y = pos, label = per_cent), vjust = 1.95, 
     size = 2,hjust=0.3)+
     xlab("Year") + ylab(" Number of People") +            # Set axis labels
     ggtitle("Distribution by Category over Year") +       # Set title;
      theme(panel.background = 
    element_rect(fill = 'white', colour = 'white'),
    legend.position = "bottom" ,
    legend.title = element_text(color="black",
    size=7) );
 print(p);