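Using a grouping variable

Time: 2018-05-14 12:03:42

Tags: r matrix grouping

I am trying to generate a new column in R based on a grouping variable.

I have a data frame of test results, with each student's end-of-year exam results on one row.

The students are split into two groups, A and B. Group A had Chemistry in the first semester and English in the second; group B had them the other way round. All exam results are from the end of the year.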

ID Group English Chemistry
1    A      9     4
2    B      7     3
3    B      7     6
4    A      3     10
etc

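I want to find out whether the order of teaching leads to a difference in exam scores, so I need a column called Sem1 that holds the Chemistry result for group A and the English result for group B, and another column called Sem2 that holds the English result for group A and the Chemistry result for group B.

So it would look like this: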

ID Group English Chemistry   Sem1  Sem2
1    A      9     4          4       9
2    B      7     3          3       7
3    B      7     6          6       6
4    A      3     10         10      3
etc
Then I can run statistics by semester. I suspect this isn't difficult, but I'm fairly new to this. All help appreciated!

2 answers:

Answer 0: (score: 2)

You can use ifelse together with mutate:

require(tidyverse)

#Sample data
df <- data.frame(ID = c(1:4), 
                 Group = c("A", "B", "B", "A"), 
                 English = c(9, 7, 7, 3), 
                 Chemistry = c(4, 3, 6, 10))

df %>% 
  mutate(Sem1 = ifelse(Group == "A", Chemistry, English), 
         Sem2 = ifelse(Group == "A", English, Chemistry))

Result:

  ID Group English Chemistry Sem1 Sem2
1  1     A       9         4    4    9
2  2     B       7         3    7    3
3  3     B       7         6    7    6
4  4     A       3        10   10    3
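The reason this works is that ifelse() is vectorised: it evaluates the test once per row and picks the matching element from either the "yes" or the "no" vector. A minimal base R sketch of that behaviour, using plain vectors (the names group, english and chemistry are illustrative, with values copied from the question's four students):

```r
# Toy vectors copied from the question's four students
group     <- c("A", "B", "B", "A")
english   <- c(9, 7, 7, 3)
chemistry <- c(4, 3, 6, 10)

# For each position, ifelse() takes the value from `yes` where the
# test is TRUE and from `no` where it is FALSE
sem1 <- ifelse(group == "A", chemistry, english)
sem2 <- ifelse(group == "A", english, chemistry)

print(sem1)  # 4 7 7 10
print(sem2)  # 9 3 6 3
```
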

Edit - suggesting case_when and benchmarking ifelse, case_when and transform

Using the same sample data, you can also use dplyr::case_when():

df %>% 
        mutate(Sem1 = case_when(Group == "A" ~ Chemistry, 
                                Group == "B" ~ English),
               Sem2 = case_when(Group == "A" ~ English,
                                Group == "B" ~ Chemistry))
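One difference from ifelse worth keeping in mind: with the two explicit conditions above, any row whose Group is neither "A" nor "B" gets NA, whereas ifelse(Group == "A", ...) would silently treat it like group B. The sketch below illustrates this with a hypothetical third group "C" (not part of the original data) and an explicit fallback branch:

```r
library(dplyr)

# Hypothetical data: group "C" is an assumption added for illustration
df_c <- data.frame(ID = 1:3,
                   Group = c("A", "B", "C"),
                   English = c(9, 7, 5),
                   Chemistry = c(4, 3, 8))

out <- df_c %>%
  mutate(Sem1 = case_when(Group == "A" ~ Chemistry,
                          Group == "B" ~ English,
                          TRUE ~ NA_real_),   # explicit fallback for other groups
         Sem2 = case_when(Group == "A" ~ English,
                          Group == "B" ~ Chemistry,
                          TRUE ~ NA_real_))

print(out)  # the row with Group "C" has NA in Sem1 and Sem2
```
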

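However, counting @Maurits Evers's answer, which uses base R transform, I wondered which one is fastest.

New sample data

# Note: rep(sample(...), 100) yields 200 values, so the 100-element
# columns are recycled and df ends up with 200 rows
df <- data.frame(ID = c(1:100), 
                 Group = rep(sample(c("A", "B"), replace = TRUE), 100), 
                 English = rnorm(100, mean = 85, sd = 10), 
                 Chemistry = rnorm(100, mean = 85, sd = 10))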

Benchmark:

require(rbenchmark) 

benchmark("ifelse" = {df %>% 
    mutate(Sem1 = ifelse(Group == "A", Chemistry, English), 
           Sem2 = ifelse(Group == "A", English, Chemistry))
},
"case_when" = {
  df %>% 
    mutate(Sem1 = case_when(Group == "A" ~ Chemistry, 
                            Group == "B" ~ English),
           Sem2 = case_when(Group == "A" ~ English,
                            Group == "B" ~ Chemistry))
},
"transform" = {
  transform(
    df, 
    Sem1 = ifelse(Group == "A", Chemistry, English), 
    Sem2 = ifelse(Group == "A", English, Chemistry))
},
replications = 1000,
columns = c("test", "replications", "elapsed",
            "relative", "user.self", "sys.self")) 

Result:

       test replications elapsed relative user.self sys.self
2 case_when         1000    2.18    4.449      2.11     0.01
1     ifelse         1000    1.58    3.224      1.57     0.00
3 transform         1000    0.49    1.000      0.48     0.00
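The benchmark only compares speed, so it can be worth confirming that the three approaches return the same columns before choosing between them. A minimal sketch of such a check, assuming dplyr is installed; the seed, the sample size and the res_* names are arbitrary choices made for this illustration:

```r
library(dplyr)

set.seed(1)  # arbitrary seed so the check is reproducible
df_chk <- data.frame(ID = 1:100,
                     Group = sample(c("A", "B"), 100, replace = TRUE),
                     English = rnorm(100, mean = 85, sd = 10),
                     Chemistry = rnorm(100, mean = 85, sd = 10))

res_ifelse <- df_chk %>%
  mutate(Sem1 = ifelse(Group == "A", Chemistry, English),
         Sem2 = ifelse(Group == "A", English, Chemistry))

res_case_when <- df_chk %>%
  mutate(Sem1 = case_when(Group == "A" ~ Chemistry,
                          Group == "B" ~ English),
         Sem2 = case_when(Group == "A" ~ English,
                          Group == "B" ~ Chemistry))

res_transform <- transform(df_chk,
                           Sem1 = ifelse(Group == "A", Chemistry, English),
                           Sem2 = ifelse(Group == "A", English, Chemistry))

# Compare the new columns across the three approaches
same_sem1 <- isTRUE(all.equal(res_ifelse$Sem1, res_case_when$Sem1)) &&
             isTRUE(all.equal(res_ifelse$Sem1, res_transform$Sem1))
same_sem2 <- isTRUE(all.equal(res_ifelse$Sem2, res_case_when$Sem2)) &&
             isTRUE(all.equal(res_ifelse$Sem2, res_transform$Sem2))
print(c(same_sem1, same_sem2))
```
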

Answer 1: (score: 0)

Assuming you made a mistake in row 3 of your expected output, here is a base R solution using transform:

transform(
    df,
    Sem1 = ifelse(Group == "A", Chemistry, English),
    Sem2 = ifelse(Group == "A", English, Chemistry))
#  ID Group English Chemistry Sem1 Sem2
#1  1     A       9         4    4    9
#2  2     B       7         3    7    3
#3  3     B       7         6    7    6
#4  4     A       3        10   10    3

Sample data

df <- read.table(text =
    "ID Group English Chemistry
1    A      9     4
2    B      7     3
3    B      7     6
4    A      3     10", header = TRUE)
