How to calculate the percentage of one variable within groups of another variable in R

Date: 2016-03-01 19:37:41

Tags: r

I have a dataset like the one shown below, and I want to calculate each frequency as a percentage within its state.

Data:

#    State     Ideology Freq
#1    CO Conservative   33
#2    CO  Independent   17
#3    CO      Liberal   50
#4    DC Conservative   33
#5    DC  Independent   33
#6    DC      Liberal   33

Expected output:

 #    State     Ideology Freq percentage
 #1    CO Conservative   33   33%
 #2    CO  Independent   17   17%
 #3    CO      Liberal   50   50%
 #4    DC Conservative   33   33.33%
 #5    DC  Independent   33   33.33%
 #6    DC      Liberal   33   33.33%

Attempt:

data$percentage = data$Freq/sum(data$Freq)  
percent <- function(x, digits = 2, format = "f", ...) {  
 paste0(formatC(100 * x, format = format, digits = digits, ...), "%")  
}  
data$percentage = percent(data$percentage)

This computes the percentage against the overall total, but I want to compute it as Freq / sum(Freq within the same state).

2 answers:

Answer 0 (score: 0)

You can use the dplyr package:

library(dplyr)
data <- group_by(data, State) %>%
        mutate(percentage = paste0(round(Freq/sum(Freq) * 100, 2), "%"))
data
## Source: local data frame [6 x 4]
## Groups: State [2]
## 
##    State     Ideology  Freq percentage
##   (fctr)       (fctr) (int)      (chr)
## 1     CO Conservative    33        33%
## 2     CO  Independent    17        17%
## 3     CO      Liberal    50        50%
## 4     DC Conservative    33     33.33%
## 5     DC  Independent    33     33.33%
## 6     DC      Liberal    33     33.33%

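The first line groups the data by State. Every operation inside the following mutate() is then evaluated separately for each group, so sum(Freq) adds up the Freq values within each state rather than across the whole data frame.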
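For comparison, here is a minimal base R sketch of the same per-group computation that does not need dplyr. It assumes the data frame from the question is rebuilt by hand with the values shown there. The helper variable state_total is a name introduced only for this illustration.

```r
# Rebuild the question's data (values copied from the post; an assumption
# about how the original data frame was constructed).
data <- data.frame(
  State    = rep(c("CO", "DC"), each = 3),
  Ideology = rep(c("Conservative", "Independent", "Liberal"), times = 2),
  Freq     = c(33, 17, 50, 33, 33, 33)
)

# ave() returns, for every row, the sum of Freq within that row's State.
state_total <- ave(data$Freq, data$State, FUN = sum)

# Share of each row within its state, formatted like the expected output.
data$percentage <- paste0(round(data$Freq / state_total * 100, 2), "%")
data
```

Because ave() returns a vector of the same length as its input, the division lines up row by row without any merge step.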

Answer 1 (score: 0)

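library(dplyr)
# Group the rows by state (`data` is your data frame).
groups <- group_by(data, State)
# Compute each state's total frequency.
summary <- summarize(groups, SUM.OF.STATE = sum(Freq, na.rm = TRUE))
# Attach each state's total to the original rows.
DF.YOU.WANT <- merge(data, summary, by = "State")
# Now divide the Freq column by the per-state total from the summary.
DF.YOU.WANT$percentage <- DF.YOU.WANT$Freq / DF.YOU.WANT$SUM.OF.STATE * 100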
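The summarize-then-merge idea can also be sketched in base R, which may make each step easier to inspect. This is only an illustration under the assumption that the data match the question. The names totals, merged and StateTotal are hypothetical and not part of the original answer.

```r
# Rebuild the question's data (values copied from the post).
data <- data.frame(
  State    = rep(c("CO", "DC"), each = 3),
  Ideology = rep(c("Conservative", "Independent", "Liberal"), times = 2),
  Freq     = c(33, 17, 50, 33, 33, 33)
)

# Step 1: one row per state holding that state's total frequency.
totals <- aggregate(Freq ~ State, data = data, FUN = sum)
names(totals)[names(totals) == "Freq"] <- "StateTotal"  # hypothetical column name

# Step 2: attach the state total to every original row.
merged <- merge(data, totals, by = "State")

# Step 3: divide each frequency by its state's total.
merged$percentage <- merged$Freq / merged$StateTotal * 100

# Sanity check: the shares within each state should add up to 100.
tapply(merged$percentage, merged$State, sum)
```

Here the percentage column stays numeric, so it can still be used in further calculations. Formatting it as a string with a "%" suffix can be left to the final display step.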