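I'm trying to create a new column in R based on a grouping variable.
I have a data frame of test results in which each student's end-of-year exam results are on one row.
The students are split into two groups, A and B. Group A took Chemistry in the first semester and English in the second; group B took them the other way round. All exam results come from the end of the year.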
ID Group English Chemistry
1 A 9 4
2 B 7 3
3 B 7 6
4 A 3 10
etc
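I want to find out whether the order of teaching leads to a difference in exam results, so I need a column called Sem1 containing the Chemistry result for group A and the English result for group B, and another column called Sem2 containing the English result for group A and the Chemistry result for group B.
So it would look like this: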
ID Group English Chemistry Sem1 Sem2
1 A 9 4 4 9
2 B 7 3 3 7
3 B 7 6 6 6
4 A 3 10 10 3
etc
Then I can run the statistics by semester. I suspect this isn't hard, but I am simple. All help appreciated!
Answer 0 (score: 2)
You can use ifelse and mutate.
require(tidyverse)
#Sample data
df <- data.frame(ID = c(1:4),
Group = c("A", "B", "B", "A"),
English = c(9, 7, 7, 3),
Chemistry = c(4, 3, 6, 10))
df %>%
mutate(Sem1 = ifelse(Group == "A", Chemistry, English),
Sem2 = ifelse(Group == "A", English, Chemistry))
Result:
ID Group English Chemistry Sem1 Sem2
1 1 A 9 4 4 9
2 2 B 7 3 7 3
3 3 B 7 6 7 6
4 4 A 3 10 10 3
case_when, and a benchmark of ifelse, case_when and transform
Using the same sample data, you can also use dplyr::case_when().
df %>%
mutate(Sem1 = case_when(Group == "A" ~ Chemistry,
Group == "B" ~ English),
Sem2 = case_when(Group == "A" ~ English,
Group == "B" ~ Chemistry))
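One practical difference between the two approaches shows up when a row belongs to neither group. The sketch below is only an illustration: the data frame df_extra and its group label "C" are hypothetical and not part of the question. With ifelse, anything that is not "A" falls into the English branch, whereas case_when without a catch-all condition yields NA for the unmatched row.

```r
library(dplyr)

# Hypothetical data (not from the question): row 3 has an unexpected group "C"
df_extra <- data.frame(ID = 1:3,
                       Group = c("A", "B", "C"),
                       English = c(9, 7, 5),
                       Chemistry = c(4, 3, 8))

res <- df_extra %>%
  mutate(
    # ifelse: every non-"A" row, including "C", takes the English score
    Sem1_ifelse = ifelse(Group == "A", Chemistry, English),
    # case_when: rows matching no condition become NA
    Sem1_case = case_when(Group == "A" ~ Chemistry,
                          Group == "B" ~ English))

print(res)
```

If unexpected group labels are possible, the NA from case_when makes them easier to spot than the silent fall-through of ifelse.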
However, with @Maurits Evers's answer using base R transform included, I wondered which approach is the fastest.
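New sample data
# Note: Group has length 200 (a 2-element sample repeated 100 times),
# so data.frame() recycles the other columns and df ends up with 200 rows.
df <- data.frame(ID = c(1:100),
                 Group = rep(sample(c("A", "B"), replace = TRUE), 100),
                 English = rnorm(100, mean = 85, sd = 10),
                 Chemistry = rnorm(100, mean = 85, sd = 10))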
Benchmark:
require(rbenchmark)
benchmark("ifelse" = {df %>%
mutate(Sem1 = ifelse(Group == "A", Chemistry, English),
Sem2 = ifelse(Group == "A", English, Chemistry))
},
"case_when" = {
df %>%
mutate(Sem1 = case_when(Group == "A" ~ Chemistry,
Group == "B" ~ English),
Sem2 = case_when(Group == "A" ~ English,
Group == "B" ~ Chemistry))
},
"transform" = {
transform(
df,
Sem1 = ifelse(Group == "A", Chemistry, English),
Sem2 = ifelse(Group == "A", English, Chemistry))
},
replications = 1000,
columns = c("test", "replications", "elapsed",
"relative", "user.self", "sys.self"))
Results:
test replications elapsed relative user.self sys.self
2 case_when 1000 2.18 4.449 2.11 0.01
1 ifelse 1000 1.58 3.224 1.57 0.00
3 transform 1000 0.49 1.000 0.48 0.00
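A timing comparison is only meaningful if the three variants produce the same result, so it can be worth checking that first. The following is a minimal sketch under my own assumptions: the seed and the way Group is sampled (one label per row) are choices made for this illustration and differ slightly from the sample data above.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only to make the check reproducible
df <- data.frame(ID = 1:100,
                 Group = sample(c("A", "B"), 100, replace = TRUE),
                 English = rnorm(100, mean = 85, sd = 10),
                 Chemistry = rnorm(100, mean = 85, sd = 10))

# Variant 1: dplyr::mutate with base ifelse
res_ifelse <- df %>%
  mutate(Sem1 = ifelse(Group == "A", Chemistry, English),
         Sem2 = ifelse(Group == "A", English, Chemistry))

# Variant 2: dplyr::mutate with case_when
res_case <- df %>%
  mutate(Sem1 = case_when(Group == "A" ~ Chemistry,
                          Group == "B" ~ English),
         Sem2 = case_when(Group == "A" ~ English,
                          Group == "B" ~ Chemistry))

# Variant 3: base R transform
res_transform <- transform(df,
                           Sem1 = ifelse(Group == "A", Chemistry, English),
                           Sem2 = ifelse(Group == "A", English, Chemistry))

# Compare the variants against each other
same_case <- isTRUE(all.equal(res_ifelse, res_case))
same_transform <- isTRUE(all.equal(res_ifelse, res_transform))
print(c(case_when = same_case, transform = same_transform))
```

Since Group only contains "A" and "B" here, both comparisons should come out TRUE; the absolute timings above will of course vary by machine and package version.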
Answer 1 (score: 0)
Assuming you made a mistake in row 3 of your expected output, here is a base R solution using transform:
transform(
    df,
    Sem1 = ifelse(Group == "A", Chemistry, English),
    Sem2 = ifelse(Group == "A", English, Chemistry))
#  ID Group English Chemistry Sem1 Sem2
#1  1     A       9         4    4    9
#2  2     B       7         3    7    3
#3  3     B       7         6    7    6
#4  4     A       3        10   10    3
Sample data:
df <- read.table(text =
    "ID Group English Chemistry
1 A 9 4
2 B 7 3
3 B 7 6
4 A 3 10", header = TRUE)
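Once Sem1 and Sem2 exist, per-semester summaries become straightforward. As a rough sketch using only the four sample rows (the names out and sem_means are introduced here for illustration), the mean score per semester can be computed like this:

```r
# Sample data from the question
df <- read.table(text = "ID Group English Chemistry
1 A 9 4
2 B 7 3
3 B 7 6
4 A 3 10", header = TRUE)

# Add the semester columns with base R transform
out <- transform(df,
                 Sem1 = ifelse(Group == "A", Chemistry, English),
                 Sem2 = ifelse(Group == "A", English, Chemistry))

# Mean score per semester across all students
sem_means <- colMeans(out[c("Sem1", "Sem2")])
print(sem_means)
```

This only summarises the columns; which statistical test is appropriate for the teaching-order question depends on the study design and is not settled by this snippet.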