Here is what the original data frame looks like:
PLACEMENT SIZE COST
1 placement1 LARGE 1838128.00
58 placement1 MEDIUM 10962048.00
117 placement1 SMALL 2622851.00
175 placement1 UNKNOWN 443.00
2 placement2 LARGE 598.00
59 placement2 MEDIUM 24358.00
118 placement2 SMALL 571802.00
176 placement2 UNKNOWN 1706.00
3 placement3 LARGE 8.00
60 placement3 MEDIUM 22.00
119 placement3 SMALL 502388.00
177 placement3 UNKNOWN 762.00
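How can I create a column that shows each SIZE's share of COST, as a percentage, within its PLACEMENT?
I would like it to end up looking like this: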
PLACEMENT SIZE COST PERCENTAGE
1 placement1 LARGE 1838128.00 11.9
58 placement1 MEDIUM 10962048.00 71.1
117 placement1 SMALL 2622851.00 17.0
175 placement1 UNKNOWN 443.00 0.0
2 placement2 LARGE 598.00 0.1
59 placement2 MEDIUM 24358.00 4.07
118 placement2 SMALL 571802.00 95.54
176 placement2 UNKNOWN 1706.00 0.29
3 placement3 LARGE 8.00 0.0
60 placement3 MEDIUM 22.00 0.0
119 placement3 SMALL 502388.00 99.84
177 placement3 UNKNOWN 762.00 0.16
Any help would be great, thanks! I can't figure out prop.table, even though I have a feeling I should be using it.
Answer 0: (score: 2)
You can do this quickly with dplyr:
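library(dplyr)
# Divide each COST by the total COST of its PLACEMENT group (R is case-sensitive: sum, not SUM)
df <- df %>% group_by(PLACEMENT) %>% mutate(PERCENTAGE = COST / sum(COST))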
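It looks like your desired result is also rounded; you can do that with the round() function if you like.
Edit: If you want the percentages on a 0 to 100 scale rather than 0 to 1, write 100 * COST / sum(COST) instead.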
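Putting the rounding and the 0 to 100 scaling together, a minimal sketch might look like the following. The data frame is rebuilt inline from the question's values so the snippet runs on its own; the choice of two decimal places and the trailing ungroup() are assumptions for illustration, not part of the original answer.

```r
library(dplyr)

# Rebuild the question's data (the original row names are not preserved)
df <- data.frame(
  PLACEMENT = rep(c("placement1", "placement2", "placement3"), each = 4),
  SIZE      = rep(c("LARGE", "MEDIUM", "SMALL", "UNKNOWN"), times = 3),
  COST      = c(1838128, 10962048, 2622851, 443,
                598, 24358, 571802, 1706,
                8, 22, 502388, 762)
)

result <- df %>%
  group_by(PLACEMENT) %>%
  # Share of each row's COST within its PLACEMENT, as a percentage rounded to 2 decimals
  mutate(PERCENTAGE = round(100 * COST / sum(COST), 2)) %>%
  # Drop the grouping so later operations act on the whole table
  ungroup()

print(result)
```

Because each value is rounded separately, the percentages within a placement may add up to slightly more or less than 100.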
Answer 1: (score: 1)
Assuming your data frame is input as DF, you can do this. No packages are needed.
transform(DF, PC = 100 * ave(COST, PLACEMENT, FUN = prop.table))
giving:
PLACEMENT SIZE COST PC
1 placement1 LARGE 1838128 11.917733169
58 placement1 MEDIUM 10962048 71.073811535
117 placement1 SMALL 2622851 17.005583050
175 placement1 UNKNOWN 443 0.002872246
2 placement2 LARGE 598 0.099922468
59 placement2 MEDIUM 24358 4.070086087
118 placement2 SMALL 571802 95.544928350
176 placement2 UNKNOWN 1706 0.285063095
3 placement3 LARGE 8 0.001589888
60 placement3 MEDIUM 22 0.004372193
119 placement3 SMALL 502388 99.842601057
177 placement3 UNKNOWN 762 0.151436862
Note: The input in reproducible form is:
Lines <- "PLACEMENT SIZE COST
1 placement1 LARGE 1838128.00
58 placement1 MEDIUM 10962048.00
117 placement1 SMALL 2622851.00
175 placement1 UNKNOWN 443.00
2 placement2 LARGE 598.00
59 placement2 MEDIUM 24358.00
118 placement2 SMALL 571802.00
176 placement2 UNKNOWN 1706.00
3 placement3 LARGE 8.00
60 placement3 MEDIUM 22.00
119 placement3 SMALL 502388.00
177 placement3 UNKNOWN 762.00"
DF <- read.table(text = Lines, header = TRUE)
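For readers unsure what the ave() and prop.table() combination does, here is a small base R sketch that spells it out. On a plain numeric vector, prop.table(x) is x / sum(x), and ave() applies a function within each group while returning a vector aligned with the original rows. The names pc_ave, pc_manual, and group_totals are made up for this illustration, and the data frame is rebuilt inline without the original row names.

```r
# Rebuild the question's data (the original row names are not preserved)
DF <- data.frame(
  PLACEMENT = rep(c("placement1", "placement2", "placement3"), each = 4),
  SIZE      = rep(c("LARGE", "MEDIUM", "SMALL", "UNKNOWN"), times = 3),
  COST      = c(1838128, 10962048, 2622851, 443,
                598, 24358, 571802, 1706,
                8, 22, 502388, 762)
)

# The one-liner's approach: proportions within each PLACEMENT, scaled to percent
pc_ave <- 100 * ave(DF$COST, DF$PLACEMENT, FUN = prop.table)

# The same computation written out: each COST divided by its group's total
pc_manual <- 100 * DF$COST / ave(DF$COST, DF$PLACEMENT, FUN = sum)

# Both routes agree, and the unrounded percentages sum to 100 per PLACEMENT
stopifnot(isTRUE(all.equal(pc_ave, pc_manual)))
group_totals <- tapply(pc_ave, DF$PLACEMENT, sum)

# Optional: round for display, as in the desired output
DF$PC <- round(pc_ave, 2)
print(DF)
```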
Answer 2: (score: 0)
Here is an option using data.table:
library(data.table)
# setDT() converts df to a data.table by reference; := then adds the column in place
setDT(df)[, PERCENTAGE := COST / sum(COST), by = PLACEMENT]
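One thing to be aware of is that setDT() and := both work by reference, so the data.table line above changes df itself. If you would rather leave the original data frame alone, a sketch along these lines may help. It assumes you are happy to work on a separate data.table (called dt here, a name chosen for illustration) and that you want a rounded 0 to 100 percentage.

```r
library(data.table)

# Rebuild the question's data (the original row names are not preserved)
df <- data.frame(
  PLACEMENT = rep(c("placement1", "placement2", "placement3"), each = 4),
  SIZE      = rep(c("LARGE", "MEDIUM", "SMALL", "UNKNOWN"), times = 3),
  COST      = c(1838128, 10962048, 2622851, 443,
                598, 24358, 571802, 1706,
                8, 22, 502388, 762)
)

# Work on a separate data.table so df does not gain the new column
dt <- as.data.table(df)

# Add the percentage column by reference on dt, computed within each PLACEMENT
dt[, PERCENTAGE := round(100 * COST / sum(COST), 2), by = PLACEMENT]

print(dt)
```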