Question

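For a given data table (see the example below), I want to keep, by Unique_ID, only the Difference values that are greater than 2, without removing the NA rows.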

      My_data_table
Unique_ID Distance(km)  Difference            
3AA        2             NA          
3AA        4             2          
5AA        2             NA          
5AA        4             2          
5AA        7             3

Below is the result I am looking for:


Answer 1

Convert to a 'data.table' (setDT(df1)) and group by 'Unique_ID'. If the sum of the logical vector (Difference >= 2) is greater than 0, take the subset of the data.table (.SD) where 'Difference' is NA or (|) greater than or equal to 2.

library(data.table)
setDT(df1)[,  if(sum(Difference >=2, na.rm = TRUE)>0) 
                .SD[is.na(Difference)|Difference>=2], by = Unique_ID]
#     Unique_ID Distance.km. Difference
#1:       3AA            2         NA
#2:       3AA            4          2
#3:       5AA            2         NA
#4:       5AA            4          2
#5:       5AA            7          3
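
To try this on a reproducible input, here is a minimal sketch that rebuilds the example table. The 7AA group is a hypothetical addition (it is not in the question's data), included only to show that a group whose differences all fall below 2 is dropped entirely.

```r
library(data.table)

# Example table from the question, plus a hypothetical 7AA group
# (not in the original data) whose only difference is below 2.
df1 <- data.frame(
  Unique_ID    = c("3AA", "3AA", "5AA", "5AA", "5AA", "7AA", "7AA"),
  Distance.km. = c(2, 4, 2, 4, 7, 1, 2),
  Difference   = c(NA, 2, NA, 2, 3, NA, 1)
)

# Keep a group only if at least one Difference is >= 2; within a kept
# group, retain the NA rows and the rows with Difference >= 2.
res <- setDT(df1)[, if (sum(Difference >= 2, na.rm = TRUE) > 0)
                      .SD[is.na(Difference) | Difference >= 2],
                  by = Unique_ID]
print(res)
```

The `na.rm = TRUE` argument matters here: without it, the leading NA in each group would make the sum NA and the `if` condition would fail with an error.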

Answer 2

A dplyr solution:

library(dplyr)

df %>%
  group_by(Unique_ID) %>%
  filter(any(Difference >= 2 & !is.na(Difference)))
# # A tibble: 5 x 3
# # Groups:   Unique_ID [2]
#   Unique_ID Distance.km. Difference
#      <fctr>        <dbl>      <dbl>
# 1       3AA            2         NA
# 2       3AA            4          2
# 3       5AA            2         NA
# 4       5AA            4          2
# 5       5AA            7          3
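
Note that the filter above tests whole groups, so every row of a qualifying group is kept. If rows with a Difference below 2 should also be removed inside a kept group, a second row-level filter can be added. The following is a sketch under that assumption; the 7AA group is hypothetical and not part of the question's data.

```r
library(dplyr)

# Example table from the question, plus a hypothetical 7AA group
# (not in the original data) whose only difference is below 2.
df <- data.frame(
  Unique_ID    = c("3AA", "3AA", "5AA", "5AA", "5AA", "7AA", "7AA"),
  Distance.km. = c(2, 4, 2, 4, 7, 1, 2),
  Difference   = c(NA, 2, NA, 2, 3, NA, 1)
)

res <- df %>%
  group_by(Unique_ID) %>%
  # Group-level test: keep groups with at least one Difference >= 2
  filter(any(Difference >= 2, na.rm = TRUE)) %>%
  # Row-level test: within kept groups, keep NA rows and rows >= 2
  filter(is.na(Difference) | Difference >= 2) %>%
  ungroup()

print(res)
```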

Subsetting lagged values in R
