I have a sequence of RFID data. Each row contains a timestamp and an ID. I want to remove duplicate records. The data looks like this:
Row ID Date Time
1 A 1-13 12:03:11
2 B 1-13 12:03:12
3 A 1-13 12:06:06
4 B 1-13 12:16:25
5 A 1-13 12:16:52
6 A 1-13 12:16:53
7 A 1-13 12:16:54
8 B 1-13 12:39:46
9 B 1-13 12:41:20
10 B 1-13 12:41:20
11 B 1-13 12:41:21
12 B 1-13 12:42:20
13 B 1-13 12:42:24
14 A 1-13 12:51:37
15 A 1-13 12:51:38
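I want to remove every row whose record was made within the same second as, or one second after, the record in the previous row. In this case, that means removing rows 2, 6, 7, 10, 11 and 15.
Can someone help me with code that does this automatically across the whole dataset?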
Answer 0: (score: 1)
You can convert the date column with as.POSIXct and then apply diff to get the time differences, e.g.
v <- c("1-13 12:03:11", "1-13 12:03:12", "1-13 12:06:06", "1-13 12:16:25",
"1-13 12:16:52", "1-13 12:16:53", "1-13 12:16:54", "1-13 12:39:46",
"1-13 12:41:20", "1-13 12:41:20", "1-13 12:41:21", "1-13 12:42:20",
"1-13 12:42:24", "1-13 12:51:37", "1-13 12:51:38")
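# diff() on date-times picks its units automatically, so force seconds before comparing
ind <- as.numeric(diff(as.POSIXct(v, format = "%m-%d %T")), units = "secs") <= 1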
ind
# [1] TRUE FALSE FALSE FALSE TRUE TRUE FALSE FALSE TRUE TRUE FALSE FALSE FALSE TRUE
You can then remove the rows by subsetting:
# suppose your data frame is labelled df
df[!c(FALSE, ind),] # first row should be kept
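To try this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same filter. The names df, stamp, gap and kept are assumptions for illustration, as is the choice of tz = "UTC"; the gap is converted to seconds explicitly rather than relying on the units diff chooses.

```r
# Sample data from the question (column names follow the question's header)
df <- data.frame(
  Row  = 1:15,
  ID   = c("A", "B", "A", "B", "A", "A", "A", "B", "B", "B", "B", "B", "B", "A", "A"),
  Date = "1-13",
  Time = c("12:03:11", "12:03:12", "12:06:06", "12:16:25", "12:16:52",
           "12:16:53", "12:16:54", "12:39:46", "12:41:20", "12:41:20",
           "12:41:21", "12:42:20", "12:42:24", "12:51:37", "12:51:38"),
  stringsAsFactors = FALSE
)

# Combine Date and Time into one timestamp; the data has no year,
# so as.POSIXct fills in the current one
stamp <- as.POSIXct(paste(df$Date, df$Time), format = "%m-%d %T", tz = "UTC")

# Gap to the previous row, in seconds
gap <- as.numeric(diff(stamp), units = "secs")

# Always keep the first row; drop rows at most 1 second after the previous row
kept <- df[c(TRUE, gap > 1), ]
kept$Row
# [1]  1  3  4  5  8  9 12 13 14
```

The rows dropped are 2, 6, 7, 10, 11 and 15, which matches the list in the question.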
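Answer 1: (score: 1)
There is also a dplyr option. Convert Time to a date-time object, arrange the rows by that time, and filter to keep only the rows that are more than 1 second after the previous row.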
library(dplyr)
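df %>%
  # include Date so that data spanning several days sorts correctly
  mutate(Time1 = as.POSIXct(paste(Date, Time), format = "%m-%d %T")) %>%
  arrange(Time1) %>%
  # compare in seconds explicitly; diff() otherwise picks its own units
  filter(c(TRUE, as.numeric(diff(Time1), units = "secs") > 1)) %>%
  select(-Time1)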
# Row ID Date Time
#1 1 A 1-13 12:03:11
#2 3 A 1-13 12:06:06
#3 4 B 1-13 12:16:25
#4 5 A 1-13 12:16:52
#5 8 B 1-13 12:39:46
#6 9 B 1-13 12:41:20
#7 12 B 1-13 12:42:20
#8 13 B 1-13 12:42:24
#9 14 A 1-13 12:51:37
Answer 2: (score: 0)
A slightly "hacky" solution:
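library(dplyr)
# Extract the seconds field of Time as a number
new_df <- df %>%
  mutate(To_Split = as.numeric(sapply(strsplit(as.character(Time), ":"), `[`, 3)))
# Flag rows whose seconds value equals, or is one more than, the previous row's,
# then keep the unflagged rows (the first row is NA and is kept).
# Only the seconds field is compared; hours and minutes are ignored.
new_df %>%
  mutate(To_Split = To_Split == lag(To_Split) | To_Split == lag(To_Split) + 1) %>%
  filter(!To_Split | is.na(To_Split)) %>%
  select(-To_Split)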
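Because this approach looks only at the seconds field, it can disagree with a comparison of full timestamps whenever the minute changes. The sketch below uses made-up times (hypothetical, not from the question) to show both kinds of mismatch in base R; the names times, flag_secs_only and flag_full are assumptions for illustration.

```r
# Hypothetical times chosen to expose the edge cases (not from the question)
times <- c("12:03:59", "12:04:00", "12:09:00", "12:15:01")

# Seconds-only comparison, as in the answer above
secs <- as.numeric(sapply(strsplit(times, ":"), `[`, 3))
prev <- c(NA, head(secs, -1))
flag_secs_only <- secs == prev | secs == prev + 1

# Full timestamp comparison, for reference
stamp <- as.POSIXct(times, format = "%T", tz = "UTC")
flag_full <- c(NA, as.numeric(diff(stamp), units = "secs") <= 1)

data.frame(times, flag_secs_only, flag_full)
#      times flag_secs_only flag_full
# 1 12:03:59             NA        NA
# 2 12:04:00          FALSE      TRUE
# 3 12:09:00           TRUE     FALSE
# 4 12:15:01           TRUE     FALSE
```

Row 2 is a genuine one-second repeat that the seconds-only check misses, while rows 3 and 4 are minutes apart yet get flagged. On the sample data in the question neither case occurs, so the result there is the same.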