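This question is very similar to this one, but there are some fundamental differences.
I have a dataset with a timestamp, 4 measurement columns, and 4 status columns: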
structure(list(Timestamp = structure(c(1409544002, 1409544006,
1409544010, 1409544014, 1409544018, 1409544022), class = c("POSIXct",
"POSIXt"), tzone = ""), A = c(0, 0, 0, 0, 0, 0), B = c(20.77579,
21.05727, 21.81632, 21.36299, 21.18629, 21.34721), C = c(16.25537,
16.45496, 16.70933, 16.1526, 16.60963, 16.76558), D = c(0, 0,
0, 0, 0, 0), SA = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("1",
"0"), class = "factor"), SB = structure(c(1L, 1L, 1L, 1L, 1L,
1L), .Label = c("1", "0"), class = "factor"), SC = structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Label = c("1", "0"), class = "factor"),
SD = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("1",
"0"), class = "factor")), .Names = c("Timestamp", "A", "B",
"C", "D", "SA", "SB", "SC", "SD"), row.names = c(NA, 6L), class = "data.frame")
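I want to compute the median of the columns that are switched on, as indicated by a 1 in the S* columns.
So far, I can find which measurement columns are on, row by row, using: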
foo[i, c(which(x = foo[i, 6:9] == 1, arr.ind = FALSE) + 1)]
where i is the row number.
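That is as far as I can get without my code becoming overly complicated. I figured I could build a new data frame by binding the output of the line above (inside a row-by-row for loop) after the timestamp, fill the gaps with NAs, compute the median over that data frame, and finally bind the medians to the original data frame. But there must be a better way!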
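For reference, a minimal sketch of that row-by-row idea, using the lookup line from above inside a loop over rows. The data is rebuilt from the dput output; the `UTC` time zone and the helper name `row_median` are assumptions made only for this illustration.

```r
# Example data rebuilt from the question (time zone "UTC" is an assumption).
foo <- data.frame(
  Timestamp = as.POSIXct(1409544002 + 4 * (0:5), origin = "1970-01-01", tz = "UTC"),
  A = rep(0, 6),
  B = c(20.77579, 21.05727, 21.81632, 21.36299, 21.18629, 21.34721),
  C = c(16.25537, 16.45496, 16.70933, 16.1526, 16.60963, 16.76558),
  D = rep(0, 6),
  SA = factor(rep("0", 6), levels = c("1", "0")),
  SB = factor(rep("1", 6), levels = c("1", "0")),
  SC = factor(rep("1", 6), levels = c("1", "0")),
  SD = factor(rep("0", 6), levels = c("1", "0"))
)

# Hypothetical helper: median of the measurement columns that are on in row i.
row_median <- function(i) {
  # Status columns are 6:9; adding 1 to the position maps SA..SD onto A..D (2:5).
  on <- which(foo[i, 6:9] == 1) + 1
  if (length(on) == 0) return(NA_real_)  # nothing switched on in this row
  median(unlist(foo[i, on]))
}

foo$Median <- vapply(seq_len(nrow(foo)), row_median, numeric(1))
```

This works on the sample data, but it indexes the data frame once per row, which is the part that feels clumsy.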
Any ideas?
Edit:
The output should look like this:
structure(list(Timestamp = structure(c(1409544002, 1409544006,
1409544010, 1409544014, 1409544018, 1409544022), class = c("POSIXct",
"POSIXt"), tzone = ""), A = c(0, 0, 0, 0, 0, 0), B = c(20.77579,
21.05727, 21.81632, 21.36299, 21.18629, 21.34721), C = c(16.25537,
16.45496, 16.70933, 16.1526, 16.60963, 16.76558), D = c(0, 0,
0, 0, 0, 0), SA = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("1",
"0"), class = "factor"), SB = structure(c(1L, 1L, 1L, 1L, 1L,
1L), .Label = c("1", "0"), class = "factor"), SC = structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Label = c("1", "0"), class = "factor"),
SD = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("1",
"0"), class = "factor"), Median = c(18.51558, 18.756115,
19.262825, 18.757795, 18.89796, 19.056395)), .Names = c("Timestamp",
"A", "B", "C", "D", "SA", "SB", "SC", "SD", "Median"), row.names = c(NA,
6L), class = "data.frame")
Answer 0 (score: 1)
This is a bit messy because your S* columns are factors. If you convert them to numeric or logical, you can skip the second line of code below:
w <- grepl("^S", names(foo))
m <- matrix(as.logical(as.numeric(as.matrix(foo[, w]))), ncol = sum(w))
foo$Median <- apply(`[<-`(as.matrix(foo[,LETTERS[1:4]]), !m, NA), 1, median, na.rm=TRUE)
foo
# Timestamp A B C D SA SB SC SD Median
# 1 2014-09-01 06:00:02 0 20.77579 16.25537 0 0 1 1 0 18.51558
# 2 2014-09-01 06:00:06 0 21.05727 16.45496 0 0 1 1 0 18.75612
# 3 2014-09-01 06:00:10 0 21.81632 16.70933 0 0 1 1 0 19.26282
# 4 2014-09-01 06:00:14 0 21.36299 16.15260 0 0 1 1 0 18.75780
# 5 2014-09-01 06:00:18 0 21.18629 16.60963 0 0 1 1 0 18.89796
# 6 2014-09-01 06:00:22 0 21.34721 16.76558 0 0 1 1 0 19.05640
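The factor columns need care because `as.numeric()` applied directly to a factor returns its internal level codes, not its labels; here the levels are ordered `c("1", "0")`, so the label "0" is stored as code 2. A minimal sketch of converting the S* columns to logical up front, so the mask can be used directly. The data is rebuilt from the question without the Timestamp column, and `status_cols`, `value_cols`, and `vals` are names introduced only for this illustration.

```r
# Example data rebuilt from the question (Timestamp omitted for brevity).
foo <- data.frame(
  A = rep(0, 6),
  B = c(20.77579, 21.05727, 21.81632, 21.36299, 21.18629, 21.34721),
  C = c(16.25537, 16.45496, 16.70933, 16.1526, 16.60963, 16.76558),
  D = rep(0, 6),
  SA = factor(rep("0", 6), levels = c("1", "0")),
  SB = factor(rep("1", 6), levels = c("1", "0")),
  SC = factor(rep("1", 6), levels = c("1", "0")),
  SD = factor(rep("0", 6), levels = c("1", "0"))
)

status_cols <- c("SA", "SB", "SC", "SD")
value_cols  <- c("A", "B", "C", "D")

# Pitfall: the level code differs from the label.
as.numeric(foo$SA[1])                # 2 (level code of "0")
as.numeric(as.character(foo$SA[1]))  # 0 (the label itself)

# Convert via the labels, so TRUE means "switched on".
foo[status_cols] <- lapply(foo[status_cols], function(s) as.character(s) == "1")

# Blank out measurements whose status is off, then take row medians.
vals <- as.matrix(foo[value_cols])
vals[!as.matrix(foo[status_cols])] <- NA
foo$Median <- apply(vals, 1, median, na.rm = TRUE)
```

Note that this variant overwrites the S* columns with TRUE/FALSE, so the result differs from the requested output in those columns; keep the logical mask in a separate object if the factors must be preserved. A row with every status off yields NA.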