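For a given data table (see the example below), I want to keep, within each Unique_ID, only the Difference values greater than 2, without deleting the NA rows.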
My_data_table

Unique_ID  Distance(km)  Difference
3AA        2             NA
3AA        4             2
5AA        2             NA
5AA        4             2
5AA        7             3
The result I am looking for keeps each Unique_ID's NA rows together with its qualifying Difference rows.
Answer 0 (score: 3)
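Convert to a 'data.table' (setDT(df1)), group by 'Unique_ID', and if the sum of the logical vector (Difference >= 2) is greater than 0, take the subset of the data.table (.SD) where 'Difference' is NA or (|) greater than or equal to 2.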
library(data.table)
setDT(df1)[, if (sum(Difference >= 2, na.rm = TRUE) > 0)
               .SD[is.na(Difference) | Difference >= 2],
           by = Unique_ID]
#    Unique_ID Distance.km. Difference
# 1:       3AA            2         NA
# 2:       3AA            4          2
# 3:       5AA            2         NA
# 4:       5AA            4          2
# 5:       5AA            7          3
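For anyone who wants to check the logic without data.table, here is a minimal base R sketch of the same two-step filter. The data frame is a hypothetical extension of the question's example: the "7AA" group is my own addition (not from the original post), included only so that something actually gets dropped. The helper names has_big and keep are also just illustrative.

```r
# Hypothetical data: the first five rows mirror the question's table;
# the two "7AA" rows are an assumption added to show a group being removed.
df1 <- data.frame(
  Unique_ID    = c("3AA", "3AA", "5AA", "5AA", "5AA", "7AA", "7AA"),
  Distance.km. = c(2, 4, 2, 4, 7, 1, 2),
  Difference   = c(NA, 2, NA, 2, 3, NA, 1)
)

# Step 1: flag every row whose group has at least one Difference >= 2.
# ave() recycles the per-group TRUE/FALSE across the group as 1/0.
has_big <- ave(df1$Difference, df1$Unique_ID,
               FUN = function(d) any(d >= 2, na.rm = TRUE)) == 1

# Step 2: within those groups, keep rows where Difference is NA or >= 2.
keep <- has_big & (is.na(df1$Difference) | df1$Difference >= 2)

result <- df1[keep, ]
print(result)
```

With this made-up input, the whole "7AA" group should disappear because none of its differences reach 2, while the NA rows of the other two groups stay.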
Answer 1 (score: 0)
A dplyr solution:
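library(dplyr)
# Keeps every row of any group that has at least one Difference >= 2
# (NA rows included, since the test is per group rather than per row).
df %>%
  group_by(Unique_ID) %>%
  filter(any(Difference >= 2 & !is.na(Difference)))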
# # A tibble: 5 x 3
# # Groups:   Unique_ID [2]
#   Unique_ID Distance.km. Difference
#   <fctr>           <dbl>      <dbl>
# 1 3AA                  2         NA
# 2 3AA                  4          2
# 3 5AA                  2         NA
# 4 5AA                  4          2
# 5 5AA                  7          3