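Remove rows with duplicate records from an RFID dataset

Time: 2019-03-31 14:36:47

Tags: r

I have a sequence of RFID data. Each row contains a timestamp and an ID. I want to remove duplicate records. The data looks like this: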

Row  ID  Date  Time
1    A   1-13  12:03:11
2    B   1-13  12:03:12
3    A   1-13  12:06:06
4    B   1-13  12:16:25
5    A   1-13  12:16:52
6    A   1-13  12:16:53
7    A   1-13  12:16:54
8    B   1-13  12:39:46
9    B   1-13  12:41:20
10   B   1-13  12:41:20
11   B   1-13  12:41:21
12   B   1-13  12:42:20
13   B   1-13  12:42:24
14   A   1-13  12:51:37
15   A   1-13  12:51:38

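I want to remove every row whose record falls in the same second as, or one second after, the record in the previous row. So in this case I want to remove rows 2, 6, 7, 10, 11 and 15.

Could someone help me with code that does this automatically across the whole dataset?

3 answers:

Answer 0 (score: 1)

You can convert the date column with as.POSIXct and then apply diff to get the time differences, e.g.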

v <- c("1-13 12:03:11", "1-13 12:03:12", "1-13 12:06:06", "1-13 12:16:25", 
       "1-13 12:16:52", "1-13 12:16:53", "1-13 12:16:54", "1-13 12:39:46", 
       "1-13 12:41:20", "1-13 12:41:20", "1-13 12:41:21", "1-13 12:42:20", 
       "1-13 12:42:24", "1-13 12:51:37", "1-13 12:51:38")
ind <- diff(as.POSIXct(v, format = "%m-%d %T")) <= 1
ind
# [1]  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE  TRUE

You can then remove the rows by subsetting:

# suppose your data frame is labelled df
df[!c(FALSE, ind),]    # first row should be kept
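
One thing to watch: diff() on POSIXct values returns a difftime whose unit is picked automatically, so `<= 1` means "one second" only when the smallest gap in the data is under a minute. Below is a minimal sketch that pins the unit to seconds by converting to numeric first. It assumes the data frame is called `df` with the `Date` and `Time` columns from the question. The year (2019, guessed from the post date) and the UTC time zone are assumptions, since neither appears in the data.

```r
# Sample data from the question, rebuilt so the sketch is self-contained
df <- data.frame(
  Row  = 1:15,
  ID   = c("A", "B", "A", "B", "A", "A", "A", "B",
           "B", "B", "B", "B", "B", "A", "A"),
  Date = "1-13",
  Time = c("12:03:11", "12:03:12", "12:06:06", "12:16:25", "12:16:52",
           "12:16:53", "12:16:54", "12:39:46", "12:41:20", "12:41:20",
           "12:41:21", "12:42:20", "12:42:24", "12:51:37", "12:51:38"),
  stringsAsFactors = FALSE
)

# Parse Date and Time together; the year and time zone are assumptions
stamp <- as.POSIXct(paste("2019", df$Date, df$Time),
                    format = "%Y %m-%d %H:%M:%S", tz = "UTC")

# as.numeric() gives seconds since the epoch, so gaps are always in seconds
gap <- diff(as.numeric(stamp))

# Keep the first row, drop every row within 1 second of the row before it
deduped <- df[c(TRUE, gap > 1), ]
deduped$Row
```

On the sample data this should leave rows 1, 3, 4, 5, 8, 9, 12, 13 and 14, i.e. it drops the six rows listed in the question.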

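Answer 1 (score: 1)

There is also a dplyr option. Convert Time to a date-time object, arrange by it, and filter to keep only the rows that are more than 1 second after the previous row.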

library(dplyr)

df %>%
  mutate(Time1 = as.POSIXct(Time, format = "%T")) %>%
  arrange(Time1) %>%
  filter(c(TRUE, diff(Time1) > 1)) %>%
  select(-Time1)

#  Row ID Date     Time
#1   1  A 1-13 12:03:11
#2   3  A 1-13 12:06:06
#3   4  B 1-13 12:16:25
#4   5  A 1-13 12:16:52
#5   8  B 1-13 12:39:46
#6   9  B 1-13 12:41:20
#7  12  B 1-13 12:42:20
#8  13  B 1-13 12:42:24
#9  14  A 1-13 12:51:37
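
Parsing `Time` on its own attaches the current date to every row, which is fine only while all records fall on one day. If the data spans several days, a variant along these lines might be safer. It is a sketch, assuming `df` has the columns shown in the question; the year 2019, the UTC time zone and the helper columns `Stamp` and `Gap` are placeholders introduced here.

```r
library(dplyr)

# Sample data from the question, rebuilt so the sketch is self-contained
df <- data.frame(
  Row  = 1:15,
  ID   = c("A", "B", "A", "B", "A", "A", "A", "B",
           "B", "B", "B", "B", "B", "A", "A"),
  Date = "1-13",
  Time = c("12:03:11", "12:03:12", "12:06:06", "12:16:25", "12:16:52",
           "12:16:53", "12:16:54", "12:39:46", "12:41:20", "12:41:20",
           "12:41:21", "12:42:20", "12:42:24", "12:51:37", "12:51:38"),
  stringsAsFactors = FALSE
)

deduped <- df %>%
  # Build a full timestamp from both columns (year and zone are assumptions)
  mutate(Stamp = as.POSIXct(paste("2019", Date, Time),
                            format = "%Y %m-%d %H:%M:%S", tz = "UTC")) %>%
  arrange(Stamp) %>%
  # Gap to the previous row, always measured in seconds
  mutate(Gap = as.numeric(difftime(Stamp, lag(Stamp), units = "secs"))) %>%
  # The first row has no predecessor (Gap is NA) and is kept
  filter(is.na(Gap) | Gap > 1) %>%
  select(-Stamp, -Gap)

deduped
```

Asking difftime() for `units = "secs"` explicitly also avoids depending on the unit that diff() would choose by itself.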

Answer 2 (score: 0)

A slightly "hacky" solution:

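library(dplyr)

# Extract the seconds field (third ":"-separated piece) of each Time value
new_df <- df %>%
  mutate(To_Split = as.numeric(sapply(strsplit(as.character(Time), ":"), `[`, 3)))

# Flag rows whose seconds value equals the previous row's or is one greater,
# then keep only unflagged rows (the first row has no predecessor, hence NA).
# Caveat: only the seconds field is compared, so minutes and hours are ignored.
new_df %>%
  mutate(To_Split = To_Split == lag(To_Split) | To_Split == lag(To_Split) + 1) %>%
  filter(To_Split == FALSE | is.na(To_Split)) %>%
  select(-To_Split)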
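
Comparing only the seconds field treats 12:03:12 and 12:06:12 as duplicates and misses a rollover from :59 to :00. Here is a sketch that keeps the string-splitting idea but compares whole times, expressed as seconds since midnight. `to_seconds` is a hypothetical helper, the `times` vector is made-up illustration data, and all records are assumed to fall on a single day.

```r
# Hypothetical helper: convert "HH:MM:SS" strings to seconds since midnight
to_seconds <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

# Made-up times covering the two problem cases (same seconds field, rollover)
times <- c("12:03:12", "12:06:12", "12:06:59", "12:07:00", "12:07:05")
secs <- to_seconds(times)

# Keep the first value and any value more than 1 second after its predecessor
keep <- c(TRUE, diff(secs) > 1)
times[keep]
```

Here only "12:07:00" is dropped, because it comes one second after "12:06:59"; "12:06:12" is kept even though its seconds field matches the row before it.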