The dataset is this:
badData <- list(c(296,310), c(330,335), c(350,565))
df <- data.frame(wavelength = seq(300,360,5.008667),
reflectance = seq(-1,-61,-5.008667))
df
wavelength reflectance
300.0000 -1.000000
305.0087 -6.008667
310.0173 -11.017334
315.0260 -16.026001
320.0347 -21.034668
325.0433 -26.043335
330.0520 -31.052002
335.0607 -36.060669
340.0693 -41.069336
345.0780 -46.078003
350.0867 -51.086670
355.0953 -56.095337
The original question was how to determine whether wavelength falls within any of the ranges given in badData.
The solution provided there was this:
https://stackoverflow.com/a/52070363/1012249
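For readers who cannot follow the link, here is a minimal base R sketch of such a range check. It is an illustration, not necessarily the approach taken in the linked answer; it assumes both endpoints of each range are inclusive, and `in_bad` and the `bad` column are names introduced here.

```r
# Ranges as in the original question: a list of c(start, end) pairs
badData <- list(c(296, 310), c(330, 335), c(350, 565))
df <- data.frame(wavelength = seq(300, 360, 5.008667),
                 reflectance = seq(-1, -61, -5.008667))

# TRUE when a wavelength lies inside at least one range
# (endpoints treated as inclusive, which is an assumption)
in_bad <- vapply(df$wavelength, function(w) {
  any(vapply(badData, function(r) w >= r[1] && w <= r[2], logical(1)))
}, logical(1))

df$bad <- in_bad
```

This only says whether a row is bad, not which range it hit, which is the gap the question below is about.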
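My question is, using similar syntax, how can I identify which badData bin each wavelength belongs to?
Suppose badData is structured like this, and the bins do not overlap.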
badData <- data.frame(bin=c('a','b','c'),start= c(296,330,350),end=c(310.01,335,565))
Answer 0 (score: 2)
Here is an example using a fuzzy join:
library(dplyr) # for the %>% pipe
library(fuzzyjoin)
df %>%
  fuzzy_left_join(badData, # join badData to df
                  by = c("wavelength" = "start", # variables to join by
                         "wavelength" = "end"),
                  match_fun = list(`>=`, `<=`)) # one function per pair of variables: wavelength >= start and wavelength <= end
#output
wavelength reflectance bin start end
1 300.0000 -1.000000 a 296 310.01
2 305.0087 -6.008667 a 296 310.01
3 310.0173 -11.017334 <NA> NA NA
4 315.0260 -16.026001 <NA> NA NA
5 320.0347 -21.034668 <NA> NA NA
6 325.0433 -26.043335 <NA> NA NA
7 330.0520 -31.052002 b 330 335.00
8 335.0607 -36.060669 <NA> NA NA
9 340.0693 -41.069336 <NA> NA NA
10 345.0780 -46.078003 <NA> NA NA
11 350.0867 -51.086670 c 350 565.00
12 355.0953 -56.095337 c 350 565.00
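If adding a package is not an option, the same inclusive matching logic (start <= wavelength <= end) can be sketched in base R. This is a hedged illustration rather than part of the answer above; `idx` and `hit` are names introduced here, and taking the first match relies on the stated assumption that the bins do not overlap.

```r
badData <- data.frame(bin = c("a", "b", "c"),
                      start = c(296, 330, 350),
                      end = c(310.01, 335, 565))
df <- data.frame(wavelength = seq(300, 360, 5.008667),
                 reflectance = seq(-1, -61, -5.008667))

# For each wavelength, the row index of the first bin that contains it, or NA
idx <- vapply(df$wavelength, function(w) {
  hit <- which(w >= badData$start & w <= badData$end)
  if (length(hit) > 0) hit[1] else NA_integer_
}, integer(1))

# Look up the bin label; rows outside every bin stay NA
df$bin <- as.character(badData$bin)[idx]
```

Unlike the join, this keeps only the bin label and does not carry over the start and end columns.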
Answer 1 (score: 1)
You don't need a loop. You can simply use cut:
badData <- data.frame(bin=c('a','b','c'),start= c(296,330,350),end=c(310.01,335,565))
df <- data.frame(wavelength = seq(300,360,5.008667),
reflectance = seq(-1,-61,-5.008667))
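# Interleave start/end into one increasing vector of breaks;
# the gaps between consecutive bins are labelled "good"
df$bins <- cut(df$wavelength, c(t(badData[, c("start", "end")])),
               labels = head(c(t(cbind(as.character(badData$bin), "good"))), -1))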
# wavelength reflectance bins
#1 300.0000 -1.000000 a
#2 305.0087 -6.008667 a
#3 310.0173 -11.017334 good
#4 315.0260 -16.026001 good
#5 320.0347 -21.034668 good
#6 325.0433 -26.043335 good
#7 330.0520 -31.052002 b
#8 335.0607 -36.060669 good
#9 340.0693 -41.069336 good
#10 345.0780 -46.078003 good
#11 350.0867 -51.086670 c
#12 355.0953 -56.095337 c
You haven't said which side of each interval should be open or closed, but that can be adjusted.
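One way to make that adjustment is the `right` argument of `cut`. The sketch below writes out the same breaks and labels by hand and uses two hypothetical test values, 330 and 335, that sit exactly on the boundaries of bin b; `closed_right` and `closed_left` are names introduced here.

```r
breaks <- c(296, 310.01, 330, 335, 350, 565)
labels <- c("a", "good", "b", "good", "c")
x <- c(330, 335) # values lying exactly on the boundaries of bin b

# Default right = TRUE: intervals are (start, end], so 330 is excluded from b
closed_right <- as.character(cut(x, breaks, labels = labels))

# right = FALSE: intervals are [start, end), so 330 is included and 335 is not
closed_left <- as.character(cut(x, breaks, labels = labels, right = FALSE))
```

With these inputs `closed_right` comes out as "good", "b" and `closed_left` as "b", "good". Note that neither setting closes both ends at once, which is what the fuzzy join in the other answer does with `>=` and `<=`.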