Combine two data frames based on cell values in R

Time: 2015-07-28 15:08:23

Tags: r merge

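I have two data frames. One holds baseline data for different test types, and the other holds my experimental data. I want to combine the two, but it is not a simple merge or rbind. I'm sure any R expert can help me with this. Thanks.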

Here are examples of the two data frames:

experiment data:
experiment_num   timepoint   type     value
50               10          7a,b4    90
50               20          7a,b4    89
50               20          10a,b4   93
50               10          7a,b6    85
50               20          7a,b6    87
50               20          10a,b6   88

baseline data:
experiment_num   timepoint   type     value
50               0           0,b4     85
50               0           0,b6     90

1 answer:

Answer 0 (score: 1)

This should do the job. You first need to install a couple of packages:

install.packages("dplyr")
install.packages("tidyr")

* Data *

# Experiment data
ed <- data.frame(experiment_num = rep(50, 6),
                 timepoint = rep(c(10, 20, 20), 2),
                 type = c("7a,b4", "7a,b4", "10a,b4", "7a,b6", "7a,b6", "10a,b6"),
                 value = c(90, 89, 93, 85, 87, 88))

# Baseline data
db <- data.frame(experiment_num = rep(50, 2),
                 timepoint = rep(0, 2),
                 type = c("0,b4", "0,b6"),
                 value = c(85, 90))

* Code *

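library(tidyr)
library(dplyr)

# Split the experiment type into its two parts, e.g. "7a,b4" -> typea = "7a", typeb = "b4".
ed_split <- separate(ed, type, into = c("typea", "typeb"), sep = ",")

# Build one baseline row per distinct experiment type by matching on typeb.
baseline <- ed_split %>%
  distinct(typea, typeb) %>%
  left_join(separate(db, type, into = c("zero", "typeb"), sep = ","), by = "typeb") %>%
  select(experiment_num, timepoint, typea, typeb, value)

# Stack both sets, sort, and rebuild the combined type column.
final <- rbind(ed_split, baseline) %>%
  arrange(typeb, typea, timepoint) %>%
  mutate(type = paste(typea, typeb, sep = ",")) %>%
  select(experiment_num, timepoint, type, value)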

The logic is as follows. Separate type into two columns, typea and typeb. Then "create" the missing typea for the baseline data by joining it to the distinct experiment types on typeb. Finally, append those baseline rows to the experimental data.

final is the data set you are looking for.
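
For readers who would rather not install extra packages, the same logic can probably be sketched in base R. This is only an illustration under stated assumptions, not part of the original answer: the helper suffix() and the names types, base, baseline and combined are made up for this example, and it assumes every type contains exactly one comma.

```r
# Same sample data as in the answer above.
ed <- data.frame(
  experiment_num = rep(50, 6),
  timepoint = rep(c(10, 20, 20), 2),
  type = c("7a,b4", "7a,b4", "10a,b4", "7a,b6", "7a,b6", "10a,b6"),
  value = c(90, 89, 93, 85, 87, 88),
  stringsAsFactors = FALSE
)
db <- data.frame(
  experiment_num = rep(50, 2),
  timepoint = rep(0, 2),
  type = c("0,b4", "0,b6"),
  value = c(85, 90),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the part of type after the comma ("7a,b4" -> "b4").
suffix <- function(x) sub("^[^,]*,", "", x)

# One row per distinct experiment type, keyed by its suffix.
types <- data.frame(type = unique(ed$type), stringsAsFactors = FALSE)
types$key <- suffix(types$type)

# Baseline rows keyed the same way; the "0,..." type label is dropped.
base <- db[, c("experiment_num", "timepoint", "value")]
base$key <- suffix(db$type)

# Copy each baseline row onto every experiment type that shares its suffix.
baseline <- merge(types, base, by = "key")[, names(ed)]

# Stack and sort so each type starts with its timepoint-0 baseline row.
combined <- rbind(ed, baseline)
combined <- combined[order(suffix(combined$type), combined$type, combined$timepoint), ]
rownames(combined) <- NULL
print(combined)
```

Here merge() plays the role of left_join(), and the baseline value for each suffix (b4 or b6) is repeated once for every experiment type that ends in that suffix.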