My data frame looks like this:
Numerator Denominator Proportion StudyQuarter NewPatGroup Measure
      120         320       0.37            1 A&B/A&B&C   ExposedDays/PatientDays
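There are many such combinations of entries in the columns 'PatGroup' and 'Variable'.
I want a function that lets me pick a combination of entries from the column 'PatGroup' and a combination of entries from the column 'Variable' to get the desired output. For example, I want to calculate a proportion whose numerator is the sum of the values for PatGroups A and B with the variable ExposedDays, and whose denominator is the sum of the values for PatGroups A, B and C with the variables ExposedDays and PatientDays.
The output would look like this: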
{{1}}
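Can someone help me with this?
Answer 0 (score: 1)
To be honest, I'm not sure what the point is of aggregating the data the way you propose, but you can do it like this: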
library(tidyverse)

df %>%
    group_by(StudyQuarter) %>%
    summarise(
        # Numerator: ExposedDays summed over PatGroups A and B
        Numerator = sum(Value[Variable == "ExposedDays" & PatGroup %in% c("A", "B")]),
        # Denominator: ExposedDays and PatientDays summed over PatGroups A, B and C
        Denominator = sum(Value[Variable %in% c("ExposedDays", "PatientDays") & PatGroup %in% c("A", "B", "C")]),
        Proportion = Numerator / Denominator,
        NewPatGroup = "A&B/A&B&C",
        Measure = "ExposedDays/PatientDays")
## A tibble: 2 x 6
#  StudyQuarter Numerator Denominator Proportion NewPatGroup Measure
#         <int>     <int>       <int>      <dbl> <chr>       <chr>
#1            1       120         320      0.375 A&B/A&B&C   ExposedDays/Patien…
#2            2        90         110      0.818 A&B/A&B&C   ExposedDays/Patien…
# Sample data
df <- read.table(text =
    "PatGroup Variable Value StudyQuarter
A PatientDays 100 1
B ExposedDays 80 1
A ExposedDays 40 1
A Patients 40 1
C ExposedDays 10 1
C PatientDays 90 1
A PatientDays 20 2
B ExposedDays 90 2", header = TRUE)
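
Since the question asks for a function that takes the PatGroup and Variable selections as arguments, the same logic could be wrapped up roughly as follows. This is only a sketch: calc_proportion and its argument names are made up for illustration, it uses base R instead of dplyr, and it assumes the columns are named PatGroup, Variable, Value and StudyQuarter as in the sample data. The labels are built from the arguments, so Measure comes out as "ExposedDays/ExposedDays&PatientDays" rather than the hand-written "ExposedDays/PatientDays" above.

```r
# Hypothetical helper (not part of the original answer): for each study
# quarter, divide the summed Value of one PatGroup/Variable selection by
# the summed Value of another selection.
calc_proportion <- function(data, num_groups, num_vars, den_groups, den_vars) {
  quarters <- sort(unique(data$StudyQuarter))

  # Sum Value per quarter over the rows matching a selection.
  # A quarter with no matching rows gets a sum of 0.
  sum_by_quarter <- function(groups, vars) {
    keep <- data$PatGroup %in% groups & data$Variable %in% vars
    vapply(quarters,
           function(q) sum(as.numeric(data$Value[keep & data$StudyQuarter == q])),
           numeric(1))
  }

  numerator <- sum_by_quarter(num_groups, num_vars)
  denominator <- sum_by_quarter(den_groups, den_vars)

  data.frame(
    StudyQuarter = quarters,
    Numerator = numerator,
    Denominator = denominator,
    Proportion = numerator / denominator,
    # Labels are derived from the selections, e.g. "A&B/A&B&C"
    NewPatGroup = paste0(paste(num_groups, collapse = "&"), "/",
                         paste(den_groups, collapse = "&")),
    Measure = paste0(paste(num_vars, collapse = "&"), "/",
                     paste(den_vars, collapse = "&"))
  )
}

# Same sample data as above, repeated so the snippet runs on its own
df <- read.table(text =
    "PatGroup Variable Value StudyQuarter
A PatientDays 100 1
B ExposedDays 80 1
A ExposedDays 40 1
A Patients 40 1
C ExposedDays 10 1
C PatientDays 90 1
A PatientDays 20 2
B ExposedDays 90 2", header = TRUE)

result <- calc_proportion(df,
                          num_groups = c("A", "B"),
                          num_vars = "ExposedDays",
                          den_groups = c("A", "B", "C"),
                          den_vars = c("ExposedDays", "PatientDays"))
print(result)
```

With the sample data this should reproduce the Numerator, Denominator and Proportion columns of the summarise() result, and other combinations only need different arguments.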