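I am trying to remove the rows where volume equals 0, along with the row immediately below each of them. So for the df below, I want to drop rows 3 and 4: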
head(data1)
open high low close volume adj.
2013-12-23 6.32 6.36 6.21 6.22 329400 6.22
2013-12-24 6.27 6.36 6.22 6.30 126500 6.30
2013-12-25 6.30 6.30 6.30 6.30 0 6.30
2013-12-26 6.30 6.36 6.23 6.23 126600 6.23
2013-12-27 6.26 6.28 6.20 6.24 54000 6.24
2013-12-30 6.24 6.50 6.24 6.44 61000 6.44
I have a working solution, but it is embarrassingly long and sloppy:
if.zero.or.not <- as.data.frame(data1$volume == 0)
combined.data = bind_cols(data1, if.zero.or.not )
colnames(combined.data) = c('open', 'high', 'low', 'close', 'volume', 'adj.', 'ifzero')
combined.data.shifted = transform(combined.data, ifzero = lag(ifzero))
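# Drop zero-volume rows and rows whose predecessor had zero volume (an NA lag in the first row counts as "keep")
zeros.and.trues.removed = subset(combined.data.shifted, volume != 0 & !(ifzero %in% TRUE))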
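How can I do this in one or two lines?
Answer 0 (score: 3)
I'll write this in data.table because I prefer the syntax; the translation to base is straightforward.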
library(xts) #Needed to get the following "xts" "zoo" object
data1 <- structure(c(6.32, 6.27, 6.3, 6.3, 6.26, 6.24, 6.36, 6.36, 6.3,
6.36, 6.28, 6.5, 6.21, 6.22, 6.3, 6.23, 6.2, 6.24, 6.22, 6.3,
6.3, 6.23, 6.24, 6.44, 329400, 126500, 0, 126600, 54000, 61000,
6.22, 6.3, 6.3, 6.23, 6.24, 6.44), .Dim = c(6L, 6L), .Dimnames = list(
NULL, c("open", "high", "low", "close", "volume", "adj.")), index = structure(c(1387756800,
1387843200, 1387929600, 1388016000, 1388102400, 1388361600), tzone = "UTC", tclass = "Date"), .indexCLASS = "Date", tclass = "Date", .indexTZ = "UTC", tzone = "UTC", class = c("xts",
"zoo"))
library(data.table)
#setDT fails on "xts" "zoo" object. We need as.data.table
#setDT(data1) #convert to native 'data.table' class _by reference_
data1 <- as.data.table(data1)
data1[if (!length(rows <- -c(idx <- which(volume == 0), (if (volume[.N] == 0) idx[-length(idx)] else idx) + 1L))) TRUE else rows]
If your table is very large and has many clustered zeros, wrapping the c(...) in unique should be more efficient.
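To illustrate why unique helps with clustered zeros, here is a small sketch on a plain vector. The volume values are made up for the illustration and are not the question's data; the last element is deliberately non-zero so that idx + 1L stays in range.

```r
# Hypothetical volume vector with a run of clustered zeros
volume <- c(5, 0, 0, 0, 7, 3, 0, 9)

idx <- which(volume == 0)         # positions of the zeros
rows_dup <- c(idx, idx + 1L)      # positions 3 and 4 appear twice here
rows_unique <- unique(rows_dup)   # each position listed once

# Both index vectors drop the same elements; the unique one is just shorter
kept_dup <- volume[-rows_dup]
kept_unique <- volume[-rows_unique]
```

With long runs of zeros, almost every index is repeated, so deduplicating roughly halves the vector handed to the subsetting step.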
If you have structural reasons to know that the last row will never be zero, this version is easier on the eyes:
data1[if (!length(rows <- -c(idx <- which(volume == 0), idx + 1L))) TRUE else rows]
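For readers who want the base translation mentioned at the top of this answer, one possible sketch follows. It swaps the which()-based negative index for a logical mask, which is a different technique from the data.table one-liner; it assumes a plain data.frame (rebuilt here from the question's rows, with only some columns) and no NA values in volume.

```r
# The question's six rows as a plain data.frame (subset of columns, dates as a column)
df <- data.frame(
  date   = as.Date(c("2013-12-23", "2013-12-24", "2013-12-25",
                     "2013-12-26", "2013-12-27", "2013-12-30")),
  close  = c(6.22, 6.30, 6.30, 6.23, 6.24, 6.44),
  volume = c(329400, 126500, 0, 126600, 54000, 61000)
)

zero <- df$volume == 0
# Shift the flag down one row: TRUE where the row above had zero volume
after_zero <- c(FALSE, head(zero, -1))
# Keep rows that are neither zero-volume nor directly below a zero-volume row
result <- df[!(zero | after_zero), ]
```

Because the mask always has one entry per row, this form needs no special case for a table without zeros or for a zero in the last row.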
Answer 1 (score: 0)
This may help. Here is an example with sample data.
a <- c(1,0,9,7,5,0,7,0)
b <- c(1,9,6,7,4,5,7,8)
dc <- data.frame(a, b)
dc_removed_zero_and_the_next_row <- dc[-c(which(dc$a==0),which(dc$a==0)+1),]
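One caveat with the negative-index approach: if no row has a zero, which() returns an empty vector, and indexing with a negated empty vector selects zero rows instead of all of them. A minimal sketch of the problem and one possible guard, using a made-up data frame (dc_no_zero is hypothetical, not part of the answer):

```r
# Hypothetical data with no zeros in column a
dc_no_zero <- data.frame(a = c(1, 2, 9), b = c(4, 5, 6))

idx <- which(dc_no_zero$a == 0)              # empty: nothing matches
dropped <- dc_no_zero[-c(idx, idx + 1), ]    # empty index selects no rows at all

# Guard: only apply the negative index when there is something to remove
safe <- if (length(idx)) dc_no_zero[-c(idx, idx + 1), ] else dc_no_zero
```

The data.table answer above handles the same case with its `if (!length(rows <- ...)) TRUE else rows` construct.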