Check whether decimal values fall within a range in R

Time: 2017-01-26 18:08:50

Tags: r

I need to recategorise codes representing various diseases so that they form appropriate groups for later analysis.

Many of the groupings include ranges that look like this:

1.0 to 1.5, 1.8 to 2.5, 3.0

Others might be a single value, such as 37.0.

Initially I thought something like this might work:

x <- c(0:.9, 1.9:2.9, 7.9:8.9, 4.0:4.9, 3:3.9, 5:5.9, 6:6.9, 11:11.9, 9:9.9, 10:10.9, 12.9, 13:13.9, 14, 14.2, 14.8)
df$disease_cat[df$site_code %in% x] <- "disease a"

The problem is that values such as 0.1, 0.2 and so on are not recognised as falling within the range.

I now understand that the colon operator in R counts up in steps of 1, so 0:0.9 (for example) is actually just 0.

What is a better way to code these intervals so that decimals are recognised as lying within them? (Bear in mind that there will be lots of "mini" ranges, and the idea of coding them all explicitly is not particularly appealing.)

3 answers:

Answer 0: (score: 1)

I think you want this:

c(1,2,3,4.5) >= 1.1 & c(1,2,3,4.5) <= 4
[1] FALSE  TRUE  TRUE FALSE

Check the output of 1.1:4:

1.1:4
[1] 1.1 2.1 3.1

You are actually testing whether the elements of the vector are exactly equal to 1.1, 2.1 or 3.1.
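To make the difference concrete, here is a small sketch comparing exact membership with a bounds check. The vector `codes` is made up for illustration and is not from the question's data.

```r
# The colon operator starts at the left endpoint and steps by 1,
# so 0:0.9 contains a single element (0) and no decimals in between.
steps <- 0:0.9

# Hypothetical codes, for illustration only
codes <- c(0, 0.1, 0.5, 0.9)

# %in% tests for exact membership, so only the value 0 matches
in_set <- codes %in% steps

# Comparing against the bounds treats 0 to 0.9 as a real interval
in_range <- codes >= 0 & codes <= 0.9

print(in_set)
print(in_range)
```

With these inputs, `in_set` is TRUE only for the first element, while `in_range` is TRUE for all four.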

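Answer 1: (score: 1)

You can find the answer by printing the contents of c(1.1:4). The result is [1] 1.1 2.1 3.1. What you need is the findInterval function. Have a look at this solution: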

findInterval(c(1,2,3,4.5), c(1.1,4)) == 1

If you want the right boundary to be inclusive, i.e. the interval [1.1, 4], you can use the rightmost.closed argument:

findInterval(c(1,2,3,4.5), c(1.1,4), rightmost.closed = TRUE) == 1
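The two calls only differ for a value sitting exactly on the right boundary. A short sketch, using a test vector (`v`, my own choice) that includes 1.1 and 4 so both edges are exercised:

```r
# Test vector chosen for illustration; it includes both boundary values
v <- c(1, 1.1, 2, 4, 4.5)
breaks <- c(1.1, 4)

# Default: intervals are closed on the left and open on the right,
# so 4 falls past the last break and gets index 2
idx_open <- findInterval(v, breaks)

# rightmost.closed = TRUE pulls a value equal to the last break
# back into the final interval, so 4 gets index 1
idx_closed <- findInterval(v, breaks, rightmost.closed = TRUE)

print(idx_open == 1)
print(idx_closed == 1)
```

In both cases 1 is below the first break (index 0) and 4.5 is above the last one (index 2), so neither counts as inside.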

Edit:

Here is a solution to the more general problem you describe:

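# Lookup table: one row per disease, with the inclusive bounds of its range.
# stringsAsFactors = FALSE keeps the disease names as character strings.
d = data.frame(disease = c('d1', 'd2', 'd3'), minValue = c(0.3, 1.2, 2.2), maxValue = c(0.6, 1.9, 2.5), stringsAsFactors = FALSE)
measurements = c(0.1, 0.5, 2.2, 0.3, 2.7)

# Return the disease whose range contains the measurement, or NA if none does
findDiagnosis <- function(data, measurement) {
  diagnosis = data[data$minValue <= measurement & measurement <= data$maxValue,]
  if (nrow(diagnosis) == 0) {
    return(NA)
  } else {
    return(diagnosis$disease)
  }
}

sapply(measurements, findDiagnosis, data = d)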
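If the ranges do not overlap, the same lookup can also be done by looping over the table rows instead of over the measurements. This is only a sketch of an alternative, under that non-overlap assumption; the table and measurements are repeated so the snippet runs on its own.

```r
# Same lookup table and measurements as above
d <- data.frame(disease = c("d1", "d2", "d3"),
                minValue = c(0.3, 1.2, 2.2),
                maxValue = c(0.6, 1.9, 2.5),
                stringsAsFactors = FALSE)
measurements <- c(0.1, 0.5, 2.2, 0.3, 2.7)

# Start with NA everywhere, then fill in one disease at a time.
# Assumes ranges do not overlap; if they did, later rows would win.
diagnosis <- rep(NA_character_, length(measurements))
for (i in seq_len(nrow(d))) {
  hit <- measurements >= d$minValue[i] & measurements <= d$maxValue[i]
  diagnosis[hit] <- d$disease[i]
}

print(diagnosis)
```

Each pass is a vectorised comparison over all measurements, which tends to scale better when there are many measurements and comparatively few ranges.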

Answer 2: (score: 1)

# This is the list of ranges that you want to check
ranges = list(c(0,.9), c(1.9,2.9), c(7.9,8.9), c(4.0,4.9), c(3,3.9), c(5,5.9), c(6,6.9), c(11,11.9), c(9,9.9), c(10,10.9), c(12.9), c(13,13.9), c(14),c(14.2), c(14.8))

# These are the values that you want to check against each range in ranges
values = c(1,2,3,4.5)

# Check each value against each range. The bounds are inclusive so that
# single-value entries such as c(12.9) can match as well.
output = data.frame(t(sapply(ranges, function(x) (min(x)<=values & max(x)>=values))))

# Set the column names to the values so it is clear what is being checked.
# Columns are the values, rows are the indexes of the ranges.
colnames(output) = values
output$ranges = sapply(ranges, function(x) paste(x,collapse = "-"))
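To get from a list of ranges to the category assignment the question is after, the per-range checks can be collapsed into a single logical vector. A minimal sketch, assuming inclusive bounds; the data frame `df` and its `site_code` values are hypothetical stand-ins for the question's data, and only a few of the ranges are used to keep it short.

```r
# A shortened list of ranges; single values are ranges of length one
ranges <- list(c(0, 0.9), c(1.9, 2.9), c(12.9), c(14), c(14.2))

# Hypothetical data frame mirroring the question's df and site_code column
df <- data.frame(site_code = c(0.5, 1.0, 2.9, 12.9, 14.1))

# One logical vector per range, combined with OR:
# TRUE where the code falls inside at least one range
in_any <- Reduce(`|`, lapply(ranges, function(r) {
  df$site_code >= min(r) & df$site_code <= max(r)
}))

# Assign the category only to the matching rows
df$disease_cat <- NA_character_
df$disease_cat[in_any] <- "disease a"

print(df)
```

Repeating this with a different list of ranges for each category avoids having to spell out every decimal code explicitly.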