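I have several columns, and I want to find what percentage each value in one column represents among the rows that share the same values in the other columns. For example: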
ST cd variable
1 1 23432
1 1 2345
1 2 908890
1 2 350435
1 2 2343432
2 1 9999
2 1 23432
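So what I want to do is:
If ST and cd are the same, find each row's percentage of variable among the rows with that same ST and cd. In the end it should look like this: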
ST cd variable percentage
1 1 23432 90.90%
1 1 2345 9.10%
1 2 908890 25.23%
1 2 350435 9.73%
1 2 2343432 65.05%
2 1 9999 29.91%
2 1 23432 70.09%
How can I do this in R?
Thanks for your help.
Answer 0 (score: 2)
You can create a function that formats proportions as percentages:
prop_format <- function(x, digits = 4) {
  x <- round(x / sum(x), digits) * 100
  paste0(x, '%')
}
Then apply it within each ST/cd group using ave:
ave(dt$variable,list(dt$ST,dt$cd),FUN=prop_format)
[1] "90.9%" "9.1%" "25.23%" "9.73%" "65.05%" "29.91%" "70.09%"
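Note that the result above is a character vector, so it cannot be used in further arithmetic. As a minimal sketch of an alternative, the share can be kept numeric and formatted separately with sprintf, which also pads every value to two decimals. The data frame below is rebuilt from the question's sample data, and the column name share is an assumption introduced only for illustration.

```r
# Sample data from the question; dt matches the name used in this answer
dt <- data.frame(
  ST = c(1, 1, 1, 1, 1, 2, 2),
  cd = c(1, 1, 2, 2, 2, 1, 1),
  variable = c(23432, 2345, 908890, 350435, 2343432, 9999, 23432)
)

# Numeric share of each row within its ST/cd group ("share" is a hypothetical name)
dt$share <- ave(dt$variable, dt$ST, dt$cd, FUN = function(x) x / sum(x))

# Display label with exactly two decimals; "%%" prints a literal percent sign
dt$percentage <- sprintf("%.2f%%", 100 * dt$share)
dt
```

Keeping the numeric column alongside the label means the values can still be sorted, summed, or plotted later.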
Answer 1 (score: 1)
library(data.table)
DT <- data.table(read.table(text = "ST cd variable
1 1 23432
1 1 2345
1 2 908890
1 2 350435
1 2 2343432
2 1 9999
2 1 23432 ", header = TRUE))
DT[, percentage := variable / sum(variable) , by = list(ST, cd)]
## ST cd variable percentage
## 1: 1 1 23432 0.90902743
## 2: 1 1 2345 0.09097257
## 3: 1 2 908890 0.25227624
## 4: 1 2 350435 0.09726856
## 5: 1 2 2343432 0.65045519
## 6: 2 1 9999 0.29909366
## 7: 2 1 23432 0.70090634
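One detail worth knowing: := modifies DT by reference and returns its result invisibly, so the table is only displayed once DT is printed explicitly. The sketch below, which assumes the data.table package is installed and rebuilds the question's sample data, stores the value as a rounded percentage rather than a proportion.

```r
library(data.table)

# Sample data from the question
DT <- data.table(
  ST = c(1, 1, 1, 1, 1, 2, 2),
  cd = c(1, 1, 2, 2, 2, 1, 1),
  variable = c(23432, 2345, 908890, 350435, 2343432, 9999, 23432)
)

# Add the column by reference: percentage of the group total, two decimals
DT[, percentage := round(100 * variable / sum(variable), 2), by = .(ST, cd)]

# := returns invisibly, so print the table explicitly
print(DT)
```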
Answer 2 (score: 1)
Using dplyr:
require(dplyr)
df %>% group_by(ST, cd) %>% mutate(percentage = variable/sum(variable))
# ST cd variable percentage
#1 1 1 23432 0.90902743
#2 1 1 2345 0.09097257
#3 1 2 908890 0.25227624
#4 1 2 350435 0.09726856
#5 1 2 2343432 0.65045519
#6 2 1 9999 0.29909366
#7 2 1 23432 0.70090634
You can adapt this as needed, for example to get rounded percentages:
df %>% group_by(ST, cd) %>% mutate(percentage = round(variable/sum(variable)*100, 2))
# ST cd variable percentage
#1 1 1 23432 90.90
#2 1 1 2345 9.10
#3 1 2 908890 25.23
#4 1 2 350435 9.73
#5 1 2 2343432 65.05
#6 2 1 9999 29.91
#7 2 1 23432 70.09
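A quick way to sanity-check the result is to confirm that the percentages in each group add up to about 100. The following is only a sketch: it assumes dplyr 1.0.0 or later (for the .groups argument), rebuilds the question's sample data, and uses the hypothetical names result and totals.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  ST = c(1, 1, 1, 1, 1, 2, 2),
  cd = c(1, 1, 2, 2, 2, 1, 1),
  variable = c(23432, 2345, 908890, 350435, 2343432, 9999, 23432)
)

result <- df %>%
  group_by(ST, cd) %>%
  mutate(percentage = round(variable / sum(variable) * 100, 2)) %>%
  ungroup()  # drop the grouping so later steps act on the whole table

# One row per ST/cd group with the sum of its percentages
totals <- result %>%
  group_by(ST, cd) %>%
  summarise(total = sum(percentage), .groups = "drop")
totals
```

Because each value is rounded independently, a group total can differ from 100 by a hundredth or so; rounding only at display time avoids that drift.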