Question

I have a question related to another post from yesterday: R finding the first value in a data frame that falls within a given threshold.

As in the earlier post, my data frame contains optical density (OD) measured over time:

time    OD
446     0.0368
446.5   0.0353
447     0.0334
447.5   0.032
448     0.0305
448.5   0.0294
449     0.0281
449.5   0.0264
450     0.0255
450.5   0.0246
451     0.0238
451.5   0.0225
452     0.0211
452.5   0.0199
453     0.0189
453.5   0.0175

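I have upper and lower thresholds for OD, and I need to find the time at which the data cross these values.

This code finds when the upper or lower threshold is crossed; for example, here I am looking for the time at which the lower threshold is crossed: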

library(dplyr)

find_time = function(df, threshold){
  return_value = df %>%
    arrange(time) %>%
    filter(OD < threshold) %>%
    slice(1)
  return(return_value)
}

find_time(data, threshold)

It returns the time and OD at which the threshold is crossed:

  time     OD
  <dbl>  <dbl>
   446 0.0368

However, I need to know when both the upper threshold (0.5033239) and the lower threshold (-0.3695971) are reached, so I modified the code to:

find_time = function(df, threshold_1, threshold_2){
  return_value_1 = df %>%
    arrange(time) %>%
    filter(OD > threshold_1) %>%
    slice_(1)

  return_value_2 = df %>%
        arrange(time) %>%
        filter(OD < threshold_2) %>%
        slice_(1)

  return(data.frame(return_value_1, return_value_2))
}

When I run the code, I get one of two results. Either this error:

Error in data.frame(return_value, return_value_2) : 
  **arguments imply differing number of rows: 1, 0**
Called from: data.frame(return_value, return_value_2)

or an empty result:

[1] time   OD     time.1 OD.1  
<0 rows> (or 0-length row.names)

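These problems seem to arise because, for some study subjects, the OD data never reach the defined upper or lower threshold.

I need an if statement in the function so that it returns "null" when one of the thresholds is never reached, but still gives me the value for the other (i.e., if the upper threshold is not reached, return null for it, while still providing the time and OD for the lower threshold).

I tried this, but I am clearly doing something wrong: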

find_time = function(df, threshold_1, threshold_2){
  return_value_1 = df %>%
    arrange(time) %>%
    filter(OD > threshold_1) %>%
    slice_(1)

  **if(OD > threshold_1){
    print(return_value_1)
  } else {
    print("NULL")
  }**


  return_value_2 = df %>%
    arrange(time) %>%
    filter(OD < threshold_2) %>%
    slice_(1)

  **if(OD < threshold_2){
    print(return_value_2)
  } else {
    print("NULL")
  }**

  return(data.frame(return_value_1, return_value_2))
}

I also tried:

find_time = function(df, threshold_1, threshold_2, OD){
  return_value_1 = df %>%
    arrange(time) %>%
   {(if (OD > threshold_1)
    else filter(OD < threshold_2)} %>%
    slice_(1))


  return(data.frame(return_value_1))
}

But I get:

Error in UseMethod("filter_") : 
  no applicable method for 'filter_' applied to an object of class "logical"
In addition: Warning message:
In if (OD == "") filter(OD > threshold_1) %>% slice_(1) else filter_(OD <  :
  the condition has length > 1 and only the first element will be used

Answer 1

I think some basic background on logical (Boolean) operators will help with this problem. There are three points to cover:

"TRUE" and "FALSE"
"and", "or", "not"
"if" and "else"

For example, suppose you have a vector of numbers:

toy_list = c(1,3,5,66,100)

1. TRUE and FALSE

First question: is the number 1 in the list?

> 1 %in% toy_list
[1] TRUE

Second question: is the number 10000 in the list?

> 10000 %in% toy_list
[1] FALSE

2. AND, OR, NOT

First question: are the numbers 1 and 1000 both in the list?

> 1 %in% toy_list & 1000 %in% toy_list
[1] FALSE

Second question: is either 1 or 1000 in the list?

> 1 %in% toy_list | 1000 %in% toy_list
[1] TRUE

Third question: is the number 1 absent from the list?

> !(1 %in% toy_list)
[1] FALSE
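
The same operators also work element by element on whole vectors, which is what happens when a column such as OD is compared with a threshold. A minimal sketch using the toy vector above (the cut-off values 4 and 70 and the variable names are chosen only for illustration):

```r
toy_list <- c(1, 3, 5, 66, 100)

# Comparison operators are vectorized: they return one TRUE/FALSE per element
above_4  <- toy_list > 4
below_70 <- toy_list < 70

# & combines the two logical vectors element by element
in_range <- above_4 & below_70
print(in_range)
# [1] FALSE FALSE  TRUE  TRUE FALSE

# A logical vector can be used to keep only the matching elements
print(toy_list[in_range])
# [1]  5 66
```

This is the same mechanism that filter() relies on: the condition is evaluated once per row, and only rows where it is TRUE are kept.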

3. if, else

First question: if 1 is in the list, print "TRUE"; otherwise print "FALSE".

> if (1 %in% toy_list){ print("TRUE")} else {print("FALSE")}
[1] "TRUE"

Second question: if 100000 is in the list, print "TRUE"; otherwise print "FALSE".

> if (100000 %in% toy_list){ print("TRUE")} else {print("FALSE")}
[1] "FALSE"

Third question: if both 1 and 100000 are in the list, print "TRUE"; otherwise print "FALSE".

> if (1 %in% toy_list & 100000 %in% toy_list  ){ print("TRUE")} else {print("FALSE")}
[1] "FALSE"

Fourth question: if either 1 or 100000 is in the list, print "TRUE"; otherwise print "FALSE".

> if (1 %in% toy_list | 100000 %in% toy_list  ){ print("TRUE")} else {print("FALSE")}
[1] "TRUE"
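
Every condition above evaluates to a single TRUE or FALSE, which is what if() expects. Comparing a whole vector (or a data frame column) with a number yields one value per element instead, which is the source of the "condition has length > 1" warning in the question; recent R versions treat this as an error. A minimal sketch of reducing such a vector to a single value with any() and all(), using an illustrative cut-off of 50:

```r
toy_list <- c(1, 3, 5, 66, 100)

# toy_list > 50 has five elements, so it should not go into if() directly.
# any() collapses it to a single TRUE/FALSE: is at least one element above 50?
if (any(toy_list > 50)) {
  verdict <- "at least one value above 50"
} else {
  verdict <- "no value above 50"
}
print(verdict)
# [1] "at least one value above 50"

# all() asks whether every element passes the test
all_positive <- all(toy_list > 0)
print(all_positive)
# [1] TRUE
```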

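4. Back to your question (finally)

If you want to keep the values that are above the low threshold and below the high threshold, what you need is: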

filter(OD > threshold_low & OD < threshold_high)
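
For the function in the question, one possible approach is to check whether each filtered result is empty and substitute a placeholder row before combining the two. The sketch below is only an illustration under some assumptions: the argument names threshold_upper and threshold_lower and the output column names are made up here, NA is used instead of NULL because a data frame cell cannot hold NULL, and the lower threshold of 0.03 is chosen so that the sample data actually cross it.

```r
library(dplyr)

# OD data from the question
data <- data.frame(
  time = seq(446, 453.5, by = 0.5),
  OD   = c(0.0368, 0.0353, 0.0334, 0.0320, 0.0305, 0.0294, 0.0281, 0.0264,
           0.0255, 0.0246, 0.0238, 0.0225, 0.0211, 0.0199, 0.0189, 0.0175)
)

find_time <- function(df, threshold_upper, threshold_lower) {
  sorted <- df %>% arrange(time)

  # First row (in time order) beyond each threshold; zero rows if never reached
  upper <- sorted %>% filter(OD > threshold_upper) %>% slice(1)
  lower <- sorted %>% filter(OD < threshold_lower) %>% slice(1)

  # An empty result means the threshold was never crossed: use an NA row instead
  if (nrow(upper) == 0) upper <- data.frame(time = NA_real_, OD = NA_real_)
  if (nrow(lower) == 0) lower <- data.frame(time = NA_real_, OD = NA_real_)

  # Both pieces now have exactly one row, so they can be combined safely
  data.frame(
    time_upper = upper$time, OD_upper = upper$OD,
    time_lower = lower$time, OD_lower = lower$OD
  )
}

# Upper threshold from the question; lower threshold is illustrative
result <- find_time(data, threshold_upper = 0.5033239, threshold_lower = 0.03)
print(result)
#   time_upper OD_upper time_lower OD_lower
# 1         NA       NA      448.5   0.0294
```

Here nrow() returns a single number, so each if() receives a length-one condition, and the upper threshold that is never reached shows up as NA while the lower crossing is still reported.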
