Creating a new variable from an existing one

Time: 2021-05-18 11:32:55

Tags: r count sum mutate

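Hi, I'm learning how to use R. I have a dataset (df) with 3 categorical variables (Session, ID, Assessed). I'm trying to create a new variable (Total Assessed) that holds, for each session, the count of "Y" values in the Assessed column. I tried the count, sum, filter, and mutate commands to add the new variable "Total Assessed" to my data frame, but none of them gave me the result I want. This image shows the result I'm after (it is also what the data looks like now, just without the last variable). Can you help me? Below are the commands I tried, which for one reason or another did not work. I feel I'm close to the answer, but I'm either using the commands in the wrong order or missing a step.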

> df %>% group_by(Session) %>% filter(Assessed == "Y") # it didn't give me the count of "Y"
> df <- df %>% group_by(Session, Assessed) %>% filter(Assessed == "Y") # it didn't give me the count of "Y"

> df <- df %>% group_by(Session, Assessed) %>% filter(Assessed == "Y") %>% tally() # This was close, because it counted the "Y" per session. However, it completely ignored sessions that only had "N". I need these sessions to appear as "0" in "Total Assessed".

1 answer:

Answer 0: (score: 0)

Found it!

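The trick is to temporarily convert the Assessed variable to a numeric variable, so that summing it within each session counts the "Y" values. Then use the group_by, mutate, and sum commands (group_by and mutate come from the dplyr package; sum is base R). After that, Assessed can be converted back to character.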

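> library(dplyr) # provides %>%, group_by, mutate and ungroup
> df$Assessed <- as.numeric(df$Assessed == "Y") # "Y" becomes 1, anything else becomes 0
> df <- df %>% group_by(Session) %>% mutate(`Total Assessed` = sum(Assessed)) %>% ungroup() # backticks are needed because the name contains a space
> df$Assessed <- ifelse(df$Assessed == 1, "Y", "N") # restore the original "Y"/"N" labels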
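
The round trip through a numeric column can probably be skipped, since sum() applied to a logical vector counts its TRUE values. Below is a minimal sketch of that idea. The sample data is made up for illustration (the real df is only shown in the image), and it assumes Assessed holds just "Y" and "N" with no missing values.

```r
library(dplyr)

# Hypothetical sample data; only the column names come from the question.
df <- data.frame(
  Session  = c(1, 1, 1, 2, 2, 3),
  ID       = c("a", "b", "c", "a", "b", "a"),
  Assessed = c("Y", "N", "Y", "N", "N", "Y"),
  stringsAsFactors = FALSE
)

df <- df %>%
  group_by(Session) %>%
  # Assessed == "Y" is TRUE/FALSE per row; sum() counts the TRUEs per session
  mutate(`Total Assessed` = sum(Assessed == "Y")) %>%
  ungroup()

print(df)
```

Because no rows are filtered out, a session that contains only "N" (Session 2 in this sample) keeps its rows and gets 0 in Total Assessed, and the Assessed column itself is never modified. If Assessed could contain NA, sum(Assessed == "Y", na.rm = TRUE) would likely be needed.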