I have a dataset like the one below, and I want to compute each frequency as a percentage of its state's total.
Data:
# State Ideology Freq
#1 CO Conservative 33
#2 CO Independent 17
#3 CO Liberal 50
#4 DC Conservative 33
#5 DC Independent 33
#6 DC Liberal 33
Expected output:
# State Ideology Freq percentage
#1 CO Conservative 33 33%
#2 CO Independent 17 17%
#3 CO Liberal 50 50%
#4 DC Conservative 33 33.33%
#5 DC Independent 33 33.33%
#6 DC Liberal 33 33.33%
Attempt:
data$percentage = data$Freq/sum(data$Freq)
percent <- function(x, digits = 2, format = "f", ...) {
paste0(formatC(100 * x, format = format, digits = digits, ...), "%")
}
data$percentage = percent(data$percentage)
I can compute the percentage relative to the overall total, but I want to compute it as Freq / sum(Freq within that state).
Answer 0: (score: 0)
You can use the dplyr package:
library(dplyr)
data <- group_by(data, State) %>%
mutate(percentage = paste0(round(Freq/sum(Freq) * 100, 2), "%"))
data
## Source: local data frame [6 x 4]
## Groups: State [2]
##
## State Ideology Freq percentage
## (fctr) (fctr) (int) (chr)
## 1 CO Conservative 33 33%
## 2 CO Independent 17 17%
## 3 CO Liberal 50 50%
## 4 DC Conservative 33 33.33%
## 5 DC Independent 33 33.33%
## 6 DC Liberal 33 33.33%
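The first line groups the data by State. Every operation inside the following mutate() is then evaluated separately for each group, so sum(Freq) adds up the Freq values within each state rather than over the whole data frame.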
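If you would rather not load dplyr, the same per-group division can be sketched in base R. This is only a minimal sketch: it rebuilds the example data by hand, and `state_total` is a name introduced here for illustration. `ave()` returns the group-wise sum repeated for every row of that group, so it lines up with Freq directly.

```r
# Rebuild the example data from the question (values copied by hand)
data <- data.frame(
  State    = c("CO", "CO", "CO", "DC", "DC", "DC"),
  Ideology = rep(c("Conservative", "Independent", "Liberal"), times = 2),
  Freq     = c(33, 17, 50, 33, 33, 33)
)

# Per-state total, repeated on every row belonging to that state
state_total <- ave(data$Freq, data$State, FUN = sum)

# Same formatting as the dplyr answer: round to 2 digits and append "%"
data$percentage <- paste0(round(100 * data$Freq / state_total, 2), "%")
data
```

Because `ave()` keeps the original row order, no merge or re-sorting step is needed afterwards.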
Answer 1: (score: 0)
library(dplyr)
# `data` is your data frame.
groups <- group_by(data, State)
# Sum Freq (not State) within each state
summary <- summarize(groups, SUM.OF.STATE = sum(Freq, na.rm = TRUE))
DF.YOU.WANT <- merge(data, summary, by = "State")
# Now divide the Freq column by the per-state sum that came from the summary data frame.
DF.YOU.WANT$percentage <- 100 * DF.YOU.WANT$Freq / DF.YOU.WANT$SUM.OF.STATE
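The summarize-then-merge idea in this answer can also be written without dplyr. The following is a rough sketch under the assumption that the data looks like the example in the question; `totals`, `merged`, and `state_total` are illustrative names, not part of the original answer.

```r
# Example data from the question (values copied by hand)
data <- data.frame(
  State    = c("CO", "CO", "CO", "DC", "DC", "DC"),
  Ideology = rep(c("Conservative", "Independent", "Liberal"), times = 2),
  Freq     = c(33, 17, 50, 33, 33, 33)
)

# One row per state holding the sum of Freq for that state
totals <- aggregate(Freq ~ State, data = data, FUN = sum)
names(totals)[names(totals) == "Freq"] <- "state_total"

# Attach each state's total to its rows, then divide
merged <- merge(data, totals, by = "State")
merged$percentage <- paste0(round(100 * merged$Freq / merged$state_total, 2), "%")
merged
```

Note that merge() may reorder rows, so if the original row order matters, the grouped mutate() from the first answer is the simpler route.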