Sampling from distributions based on a condition

Time: 2019-07-15 07:46:49

Tags: r if-statement sampling

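I am building a data set for students to practise hypothesis testing. The data should contain fictitious processing times for the production of construction equipment vehicles. The vehicles come in different types with different options, which may affect the processing time. Based on the processing times and the machine specifications, the students will investigate which factors have a significant effect on processing time and predict how long it takes to produce a particular machine with a particular configuration.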

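The final goal of the data set is to generate a total processing time for each machine. In essence, the (total) processing time should be the sum of base time + option 1 time + option 2 time + option 3 time, and so on. Each option time should be sampled randomly from a distribution so that its effect is not too obvious. Only the total time is given to the students, but I need the option times to build the total.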

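I know how to sample randomly with rnorm() and other distributions. What I do not know is how to generate data conditionally, based only on the contents of a column.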

The data set looks like this.

Machine                  <-   c(1,2,3,4,5,6,7,8,9,10)
Pump.Option              <-   c("30 Liter", "40 Liter", "30 Liter", "30 Liter", "30 Liter", "30 Liter", "50 Liter", "30 Liter", "30 Liter", "40 Liter")
Piping.Option            <-   c("No special piping", "No special piping", "special piping", "No special piping", "special piping", "No special piping", "No special piping", "special piping", "special piping", "No special piping")
Lights.Option            <-   c("Std light", "Std & Additional", "Std & Additional","Std & Additional", "Std & Additional", "Std & Additional", "Std light", "Std & Additional", "Std & Additional", "Std & Additional")
Valve.Option             <-   c("Safety valve", "Safety valve", "Normal valve", "Normal valve", "Safety valve", "Normal valve", "Safety valve", "Safety valve", "Normal valve", "Safety valve")
Pump.Time                <-   NA       
Piping.Time              <-   NA
Lights.Time              <-   NA
Valve.Time               <-   NA
Total.Time               <-   NA


DF.Sample                <- data.frame(Machine, Pump.Option, Piping.Option, Lights.Option, Valve.Option, Pump.Time, Piping.Time, Lights.Time, Valve.Time, Total.Time)

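Based on the contents of the columns Pump.Option, Piping.Option and Lights.Option, the times that need to be generated are Pump.Time, Piping.Time and Lights.Time. These times are then used to calculate the total time for that machine.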

The times for the options are as follows.

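  • Pump time
    • 30 Liter (no extra time)
    • 40 Liter (mean 10 minutes, standard deviation 4 minutes)
    • 50 Liter (mean 20 minutes, standard deviation 10 minutes)
  • Piping time
    • No special piping (no extra time)
    • Special piping (mean 10 minutes, standard deviation 4 minutes)
  • Lights time (Lights.Option)
    • Std light (no extra time)
    • Std & Additional (mean 10 minutes, standard deviation 4 minutes)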

1 Answer:

Answer 0: (score: 0)

You can use dplyr's case_when for this. Compared with a set of nested ifelse statements, it offers a relatively clean syntax:

library(dplyr)

DF.Sample %>%
    mutate(Pump.Time = case_when(
            Pump.Option == "30 Liter" ~ 0,        
            Pump.Option == "40 Liter" ~ rnorm(n(), mean = 10, sd = 4),
            Pump.Option == "50 Liter" ~ rnorm(n(), mean = 20, sd = 10)
        ), 
        Piping.Time = case_when(
           Piping.Option == "No special piping" ~ 0, 
           Piping.Option == "special piping" ~ rnorm(n(), mean = 10, sd = 4)
        ),
        Lights.Time = case_when(
           Lights.Option == "Std light" ~ 0,
           Lights.Option == "Std & Additional" ~ rnorm(n(), mean = 10, sd = 4)
        )
    )
#>    Machine Pump.Option     Piping.Option    Lights.Option Valve.Option
#> 1        1    30 Liter No special piping        Std light Safety valve
#> 2        2    40 Liter No special piping Std & Additional Safety valve
#> 3        3    30 Liter    special piping Std & Additional Normal valve
#> 4        4    30 Liter No special piping Std & Additional Normal valve
#> 5        5    30 Liter    special piping Std & Additional Safety valve
#> 6        6    30 Liter No special piping Std & Additional Normal valve
#> 7        7    50 Liter No special piping        Std light Safety valve
#> 8        8    30 Liter    special piping Std & Additional Safety valve
#> 9        9    30 Liter    special piping Std & Additional Normal valve
#> 10      10    40 Liter No special piping Std & Additional Safety valve
#>    Pump.Time Piping.Time Lights.Time
#> 1   0.000000    0.000000    0.000000
#> 2   4.956528    0.000000   17.716970
#> 3   0.000000   11.051394   10.142101
#> 4   0.000000    0.000000   11.886158
#> 5   0.000000   15.291671    6.745524
#> 6   0.000000    0.000000    5.228694
#> 7  21.520437    0.000000    0.000000
#> 8   0.000000    9.777887    9.222347
#> 9   0.000000   11.219067   14.726647
#> 10 12.761031    0.000000    6.111458
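The question also asks for a total time per machine, which the pipeline above stops short of. A minimal sketch of that last step follows, using a three-row stand-in for DF.Sample to keep it short. The Base.Time column and its parameters (mean 120, sd 15) are hypothetical placeholders, since the question does not state a distribution for the base time; set.seed() is added only to make the draws reproducible.

```r
library(dplyr)

set.seed(42)  # fix the seed so the generated data set is reproducible

# Small stand-in for DF.Sample (three machines only)
df <- data.frame(
    Machine       = 1:3,
    Pump.Option   = c("30 Liter", "40 Liter", "50 Liter"),
    Piping.Option = c("No special piping", "special piping", "No special piping"),
    Lights.Option = c("Std light", "Std & Additional", "Std & Additional")
)

result <- df %>%
    mutate(
        # ASSUMPTION: base time parameters are placeholders, not from the question
        Base.Time = rnorm(n(), mean = 120, sd = 15),
        Pump.Time = case_when(
            Pump.Option == "30 Liter" ~ 0,
            Pump.Option == "40 Liter" ~ rnorm(n(), mean = 10, sd = 4),
            Pump.Option == "50 Liter" ~ rnorm(n(), mean = 20, sd = 10)
        ),
        Piping.Time = case_when(
            Piping.Option == "No special piping" ~ 0,
            Piping.Option == "special piping" ~ rnorm(n(), mean = 10, sd = 4)
        ),
        Lights.Time = case_when(
            Lights.Option == "Std light" ~ 0,
            Lights.Option == "Std & Additional" ~ rnorm(n(), mean = 10, sd = 4)
        ),
        # Total is the sum of the base time and all option times
        Total.Time = Base.Time + Pump.Time + Piping.Time + Lights.Time
    )

# Keep only the option columns and the total for the students
student_data <- result %>% select(Machine, ends_with(".Option"), Total.Time)
```

Because a normal distribution is unbounded, a draw such as the 50 Liter time (mean 20, sd 10) can occasionally come out negative. If that matters for the exercise, wrapping the draw in pmax(..., 0) is one way to floor it at zero, at the cost of slightly distorting the distribution.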

Data

DF.Sample <- data.frame(
    Machine = c(1,2,3,4,5,6,7,8,9,10), 
    Pump.Option = c("30 Liter", "40 Liter", "30 Liter", "30 Liter", "30 Liter", "30 Liter", "50 Liter", "30 Liter", "30 Liter", "40 Liter"),
    Piping.Option = c("No special piping", "No special piping", "special piping", "No special piping", "special piping", "No special piping", "No special piping", "special piping", "special piping", "No special piping"),
    Lights.Option = c("Std light", "Std & Additional", "Std & Additional","Std & Additional", "Std & Additional", "Std & Additional", "Std light", "Std & Additional", "Std & Additional", "Std & Additional"),
    Valve.Option = c("Safety valve", "Safety valve", "Normal valve", "Normal valve", "Safety valve", "Normal valve", "Safety valve", "Safety valve", "Normal valve", "Safety valve")
)
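As an alternative to case_when, the option parameters can be kept in a lookup table and passed straight to rnorm(), which is vectorised over mean and sd. The sketch below is in base R and is not part of the answer above; sample_time() is a hypothetical helper name, and only the pump option is shown. It relies on rnorm() returning the mean exactly when sd is 0, so the "no extra time" options need no special case.

```r
set.seed(1)  # reproducible draws

# Mean and standard deviation (minutes) of the extra time per pump option,
# taken from the question; sd = 0 makes rnorm() return the mean exactly
pump_mean <- c("30 Liter" = 0, "40 Liter" = 10, "50 Liter" = 20)
pump_sd   <- c("30 Liter" = 0, "40 Liter" = 4,  "50 Liter" = 10)

# Hypothetical helper: draw one time per row from the distribution
# that belongs to that row's option
sample_time <- function(option, means, sds) {
    option <- as.character(option)             # guard against factor columns
    stopifnot(all(option %in% names(means)))   # fail loudly on unknown labels
    rnorm(length(option),
          mean = unname(means[option]),
          sd   = unname(sds[option]))
}

Pump.Option <- c("30 Liter", "40 Liter", "30 Liter", "50 Liter")
Pump.Time   <- sample_time(Pump.Option, pump_mean, pump_sd)
```

One design difference is worth noting: case_when returns NA for any value that matches none of its conditions, so a misspelled option label passes silently, whereas the stopifnot() check here stops with an error.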