Based on R

Time: 2017-08-20 21:20:16

Tags: r dataframe

I am performing a calculation that involves two data frames. I have created a reproducible example of each:

> df1
      Day1 Day2 Day3 Day4 Day5 Day6 Day7 Day8 Day9 Day10
Time1 0.03 0.43 0.39 0.41 0.94 0.70 0.18 0.65 0.72  0.72
Time2 0.42 0.63 0.93 0.53 0.19 0.55 0.22 0.16 0.56  0.04

> df2
   Day Time        X3        X4        X5
1    1    1  9.252042 19.512621 11.601671
2    1    2  5.021522 17.712484  5.044728
3    2    1  9.603795 19.404302 17.206771
4    2    2 19.686793 18.791541 12.655874
5    3    1  7.546551 18.810526 19.865979
6    3    2 18.233872 19.596584 11.653980
7    4    1 17.499680 14.014276 15.553013
8    4    2  8.115352 17.898786 12.841630
9    5    1 10.719540  8.518823 19.126440
10   5    2 12.853401  6.026599 14.041490
11   6    1 19.984946 10.693528  6.890835
12   6    2 16.360035 15.778092 18.087471
13   7    1 15.498714 15.039444  5.259257
14   7    2 13.179111 17.533358  7.382507
15   8    1  5.124188 15.507194 12.547365
16   8    2  8.008336 10.463382  6.934014
17   9    1 11.246527  6.975527 14.464758
18   9    2 17.914083 18.039384 19.324091
19  10    1  9.876625 19.216317  8.787550
20  10    2 11.851955 15.729080  5.741095

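The columns of df1 represent the days on which values were recorded, and the rows represent the hour or time (Time1 or Time2). In df2, the first two columns give the day and the time, and the remaining columns are the locations where the data were recorded.

What I want to do in R is create another data frame, the same size as df2, in which the values in df2[,3:5] are divided by the corresponding df1 value, that is, the df1 value selected by the Day and Time columns of df2. For example, for the first value of df2$X3, the new data frame would contain 9.252042 divided by 0.03. For the third value of df2$X3, it would contain 9.603795 divided by 0.43.

Thanks in advance for your help!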

3 Answers:

Answer 0: (score: 1)

I assume your data (df1 and df2) are set up as follows:

# One row of df1 per day: the column named time holds the day number (1 to 10),
# and time1 / time2 hold that day's values for Time 1 and Time 2
df1 = data.frame(time=c(1:10),time1=c(0.03,0.43,0.39,0.41,.94,.70,.18,0.65,0.72,0.72),time2 = c(.42,.63,.93,.53,.19,.55,.22,.16,.56,.04))
# df2 uses values abbreviated from the question
df2 = data.frame(Day = rep(c(1:10),each=2),Time = rep(c(1,2),10),X3=c(9.2,5.02,9.6,19.6,7.5,18.2,17.4,8.1,10.7,12.8,19.9,16.3,15.4,13.1,5.1,8,11.2,17.9,9.8,11.8),X4=c(19.5,17.7,19.4,18.8,18,19.5,14.01,17.8,8.5,6,10.6,15.7,15,17.5,15,10,6,18,19,15),X5=c(11.6,5,17,12,19,11,15,12,19,14,6,18,5,7,12,6,14,19,8,5))

Then the code to create the new data frame df3 would be:

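# For each row of df2, take the divisor from df1's time1 or time2 column (chosen by Time) at the row for that Day
divisor = ifelse(df2$Time == 1, df1$time1[df2$Day], df1$time2[df2$Day])
df3 = data.frame(Day = df2$Day,Time = df2$Time,newx3 = df2$X3 / divisor,newx4 = df2$X4 / divisor,newx5 = df2$X5 / divisor)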
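
As an alternative sketch of the same lookup, assuming df1 keeps the layout shown in the question (rows Time1/Time2, columns Day1, Day2, ...) and that the Day and Time values in df2 are the column and row positions in df1, base R matrix indexing can pick the divisor for each row. Only the first three days and two of the location columns are used here to keep the example short; the names `divisor` and `value_cols` are illustrative.

```r
# First three days of df1 from the question: rows are times, columns are days.
df1 <- data.frame(Day1 = c(0.03, 0.42), Day2 = c(0.43, 0.63), Day3 = c(0.39, 0.93),
                  row.names = c("Time1", "Time2"))

# Matching rows of df2 from the question (X3 and X4 only).
df2 <- data.frame(Day  = c(1, 1, 2, 2, 3, 3),
                  Time = c(1, 2, 1, 2, 1, 2),
                  X3 = c(9.252042, 5.021522, 9.603795, 19.686793, 7.546551, 18.233872),
                  X4 = c(19.512621, 17.712484, 19.404302, 18.791541, 18.810526, 19.596584))

# A two-column index matrix selects one cell per row: (row = Time, column = Day).
divisor <- as.matrix(df1)[cbind(df2$Time, df2$Day)]
# divisor is 0.03 0.42 0.43 0.63 0.39 0.93

# Divide every measurement column by the per-row divisor; Day and Time are kept.
df3 <- df2
value_cols <- setdiff(names(df2), c("Day", "Time"))
df3[value_cols] <- df2[value_cols] / divisor
```

Because the divisor is looked up from the Day and Time values themselves, this version does not depend on the row order of df2.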

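Answer 1: (score: 0)

My suggestion is to follow tidy data principles.

Here is an example with the same structure as your data frames, but simplified to cover days 1-3 only: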

library(dplyr)
library(tidyr)

untidy = tibble(day1 = c(0.03,0.42), day2 = c(0.43,0.63), day3 = c(0.39,0.93))

tidy = tibble(day = c(1,1,2,2,3,3), time = c(1,2,1,2,1,2), val1 = c(9.252042,5.012522,9.603795,19.686793,7.546551,18.233872))

untidy_to_tidy = untidy %>% 
  gather(day,val2) %>% 
  mutate(day = as.double(gsub("day","",day)),
    time = rep(c(1,2), (ncol(untidy) * nrow(untidy))/2)) %>% 
  select(day,time,val2)

tidy %>% 
  left_join(untidy_to_tidy, by = c("day","time")) %>% 
  mutate(division = val1 / val2)
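
In current tidyr releases gather() is superseded by pivot_longer(). A minimal sketch of the same reshape-then-join idea with pivot_longer() might look like the following; it assumes tidyr 1.0.0 or later, uses the first three days from the question, and the names `divisors` and `result` are illustrative.

```r
library(dplyr)
library(tidyr)

# Wide table: one column per day, one row per time (row 1 = Time1, row 2 = Time2).
untidy <- tibble(day1 = c(0.03, 0.42), day2 = c(0.43, 0.63), day3 = c(0.39, 0.93))

# Long table: one row per (day, time) pair, values taken from df2$X3 in the question.
tidy <- tibble(day = c(1, 1, 2, 2, 3, 3), time = c(1, 2, 1, 2, 1, 2),
               val1 = c(9.252042, 5.021522, 9.603795, 19.686793, 7.546551, 18.233872))

divisors <- untidy %>%
  mutate(time = row_number()) %>%            # record which row (time) each value came from
  pivot_longer(-time, names_to = "day",      # one row per (time, day) pair
               names_prefix = "day",         # strip the "day" prefix, leaving "1", "2", "3"
               values_to = "val2") %>%
  mutate(day = as.numeric(day), time = as.numeric(time))

result <- tidy %>%
  left_join(divisors, by = c("day", "time")) %>%
  mutate(division = val1 / val2)
```

Storing the time explicitly before reshaping avoids having to rebuild it with rep() afterwards, so the sketch also works when the wide table has more than two rows.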

If you are new to R, keep it simple and do this:

  1. Read your CSV/TSV/etc. files with read_csv("YOUR_FILE.CSV") from readr (load it first with library(readr)).

  2. In my example, replace

    untidy = tibble(day1 = c(0.03,0.42),day2 = c(0.43,0.63),day3 = c(0.39,0.93))

    with

    untidy = read_csv("YOUR_FILE.CSV")

  3. Likewise, replace

    tidy = tibble(day = c(1,1,2,2,3,3), time = c(1,2,1,2,1,2), val1 = c(9.252042,5.012522,9.603795,19.686793,7.546551,18.233872))

    with

    tidy = read_csv("YOUR_OTHER_FILE.CSV")

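Answer 2: (score: 0)

The one thing you need to be careful about is that your two data frames are in the same order. unlist(df1) flattens df1 column by column (Day1 Time1, Day1 Time2, Day2 Time1, ...), which matches the row order of df2, and the resulting vector is applied down each column of df2[3:5]. The code is: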

 df2[3:5]/unlist(df1)

            X3         X4        X5
 1  308.401400 650.420700 386.72237
 2   11.956005  42.172581  12.01126
 3   22.334407  45.126284  40.01575
 4   31.248878  29.827843  20.08869
 5   19.350131  48.232118  50.93841 
 6   19.606314  21.071596  12.53116
 :        :         :          :
 :        :         :          :
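
Since this one-liner depends on row order, it may be worth making that assumption explicit. The sketch below uses the first three days of the question's data with the rows of df2 deliberately shuffled, sorts df2 by Day and then Time, and checks the keys against the order in which unlist(df1) flattens df1 before dividing. The names `df2_sorted`, `expected` and `result` are illustrative.

```r
# First three days of df1 from the question: rows are times, columns are days.
df1 <- data.frame(Day1 = c(0.03, 0.42), Day2 = c(0.43, 0.63), Day3 = c(0.39, 0.93),
                  row.names = c("Time1", "Time2"))

# The matching df2 rows, shuffled on purpose to show the reordering step.
df2 <- data.frame(Day  = c(2, 1, 3, 1, 2, 3),
                  Time = c(1, 2, 2, 1, 2, 1),
                  X3 = c(9.603795, 5.021522, 18.233872, 9.252042, 19.686793, 7.546551))

# Sort by Day, then Time, so the rows line up with unlist(df1).
df2_sorted <- df2[order(df2$Day, df2$Time), ]

# unlist(df1) runs down each column in turn, so Time varies fastest within each Day.
expected <- expand.grid(Time = seq_len(nrow(df1)), Day = seq_len(ncol(df1)))
stopifnot(nrow(df2_sorted) == nrow(expected),
          all(df2_sorted$Day == expected$Day),
          all(df2_sorted$Time == expected$Time))

# With the order verified, the element-wise division is safe.
result <- df2_sorted["X3"] / unlist(df1)
```

If df2 is missing a Day/Time combination or contains duplicates, the stopifnot() check fails instead of silently dividing by the wrong value.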