Question

I have the following data:

library(xts)
values<-c(2,2,2,4,2,3,0,0,0,0,0,1,2,3,2)
time1<-seq(from=as.POSIXct("2013-01-01 00:00"),to=as.POSIXct("2013-01-01 14:00"),by="hour")
data<-xts(values,order.by=time1)
data

                    [,1]
2013-01-01 00:00:00    2
2013-01-01 01:00:00    2
2013-01-01 02:00:00    2
2013-01-01 03:00:00    4
2013-01-01 04:00:00    2
2013-01-01 05:00:00    3
2013-01-01 06:00:00    0
2013-01-01 07:00:00    0
2013-01-01 08:00:00    0
2013-01-01 09:00:00    0
2013-01-01 10:00:00    0
2013-01-01 11:00:00    1
2013-01-01 12:00:00    2
2013-01-01 13:00:00    3
2013-01-01 14:00:00    2

Now I want to remove all the zeros, which is easily done with:

remove_zerro = apply(data, 1, function(row) all(row !=0 ))
data[remove_zerro,]

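The problem is that, after working with the zero-free data and making some modifications, I want to put the zeros back into my data at the same dates and times. Any ideas would be appreciated.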

Answer 1

Here are two possible approaches:

# re-create your data set
library(xts)
values<-c(2,2,2,4,2,3,0,0,0,0,0,1,2,3,2)
time1<-seq(from=as.POSIXct("2013-01-01 00:00"),to=as.POSIXct("2013-01-01 14:00"),by="hour")
data<-xts(values,order.by=time1)
data

###############################################
# SOLUTION 1 : 
# make a union of the "zero" series and the "zero-free" series

# create a copy of data with no zero
isNotZero = apply(data, 1, function(row) all(row != 0 ))
zeroFreeSeries <- data[isNotZero,]
zeroSeries <- data[!isNotZero,]

# do your calculations on the "zero-free" series (e.g. add 10 to all values)
zeroFreeSeries <- zeroFreeSeries + 10

# union
unionSeries <- rbind(zeroSeries,zeroFreeSeries)

# now unionSeries contains what you desire
unionSeries

###############################################
# SOLUTION 2 : 
# keep a copy of the original series and, after doing your operations
# on the "zero-free" series, update the original copy with the new
# values (this doesn't work well if you remove some dates from the
# zero-free series)

# create a copy of data with no zero
isNotZero = apply(data, 1, function(row) all(row != 0 ))
zeroFreeSeries <- data[isNotZero,]

# do your operations on the "zero-free" series (e.g. add 10 to all values)
zeroFreeSeries <- zeroFreeSeries + 10

# modify the original data by setting the new values
data[time(zeroFreeSeries),] <- zeroFreeSeries

# now data contains what you desire
data
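
As a minimal sketch of Solution 1 (not part of the original answer), the row-wise apply() test can be replaced by a vectorised mask, and the result can be checked to confirm that rbind on xts objects returns the rows in time order. The use of rowSums and the explicit tz = "UTC" are assumptions made here for reproducibility.

```r
library(xts)

# Re-create the data from the question (tz fixed to UTC for reproducibility)
values <- c(2, 2, 2, 4, 2, 3, 0, 0, 0, 0, 0, 1, 2, 3, 2)
time1 <- seq(from = as.POSIXct("2013-01-01 00:00", tz = "UTC"),
             by = "hour", length.out = length(values))
data <- xts(values, order.by = time1)

# Vectorised equivalent of the apply() test:
# TRUE for rows in which no column equals zero
isNotZero <- rowSums(coredata(data) == 0) == 0

zeroFreeSeries <- data[isNotZero, ] + 10   # example modification
zeroSeries <- data[!isNotZero, ]

# rbind on xts objects orders the combined rows by the time index
unionSeries <- rbind(zeroSeries, zeroFreeSeries)

# Check that the timestamps are strictly increasing
inTimeOrder <- all(diff(as.numeric(index(unionSeries))) > 0)
```

For a single-column series the mask is the same as the one produced by apply(); for several columns it keeps only the rows where every column is non-zero.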

Answer 2

It looks like you may want to use sparse vectors/matrices:

install.packages("spam")
library(spam)
sx <- c(0,0,3, 3.2, 0,0,0,-3:1,0,0,2,0,0,5,0,0)
apply.spam(spam(sx), NULL, function(x){1 / x})
           [,1]
 [1,]  0.0000000
 [2,]  0.0000000
 [3,]  0.3333333
 [4,]  0.3125000
 [5,]  0.0000000
 [6,]  0.0000000
 [7,]  0.0000000
 [8,] -0.3333333
 [9,] -0.5000000
[10,] -1.0000000
[11,]  0.0000000
[12,]  1.0000000
[13,]  0.0000000
[14,]  0.0000000
[15,]  0.5000000
[16,]  0.0000000
[17,]  0.0000000
[18,]  0.2000000
[19,]  0.0000000
[20,]  0.0000000

If you do the same on a regular matrix that contains the zeros:

> apply(matrix(sx), 1, function(x){1 / x})
 [1]        Inf        Inf  0.3333333  0.3125000        Inf        Inf
 [7]        Inf -0.3333333 -0.5000000 -1.0000000        Inf  1.0000000
[13]        Inf        Inf  0.5000000        Inf        Inf  0.2000000
[19]        Inf        Inf

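So you can see that apply.spam skips the zeros but automatically puts them back.

The drawback is that you have to re-attach the time labels after processing.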
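
Re-attaching the time labels could look roughly like the sketch below. To keep it self-contained, a plain numeric vector stands in for the output of the sparse computation (an assumption; the spam package is not used here), and the processed values are wrapped back into an xts object using the original index.

```r
library(xts)

# Data from the question (tz fixed to UTC for reproducibility)
values <- c(2, 2, 2, 4, 2, 3, 0, 0, 0, 0, 0, 1, 2, 3, 2)
time1 <- seq(from = as.POSIXct("2013-01-01 00:00", tz = "UTC"),
             by = "hour", length.out = length(values))
data <- xts(values, order.by = time1)

# Stand-in for the sparse result: values in the original row order,
# with 1/x applied to the non-zero entries and zeros left as zeros
vals <- as.numeric(coredata(data))
processed <- ifelse(vals == 0, 0, 1 / vals)

# Re-attach the original timestamps to the processed values
result <- xts(processed, order.by = index(data))
```

This only works because the processed vector has the same length and order as the original series, so each value lines up with its original timestamp.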

Answer 3

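I'm building on @zx8754's comment.

One way is to split the data frame. If you are worried about messing up the index or joining the data frames back together, here is another approach.

Create a T/F index.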

idx <- df[, "col"] != 0   # TRUE for rows whose value is non-zero
df[idx, "col"] <- 2007    # or whatever operation
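
Applied to the values from the question, the T/F index idea might look like the following sketch. The data frame layout and the column names time and value are hypothetical, chosen only for illustration. The zeros never leave the data frame, so nothing has to be re-inserted afterwards.

```r
# Hypothetical data frame holding the question's values (base R only)
df <- data.frame(
  time  = seq(from = as.POSIXct("2013-01-01 00:00", tz = "UTC"),
              by = "hour", length.out = 15),
  value = c(2, 2, 2, 4, 2, 3, 0, 0, 0, 0, 0, 1, 2, 3, 2)
)

# Logical index: TRUE where the value is non-zero
idx <- df$value != 0

# Modify only the non-zero rows (e.g. add 10); zero rows stay in place
df$value[idx] <- df$value[idx] + 10
```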

Answer 4

Apparently this is the solution:

no<-data[ data[,1] != 0, ] #data without zeros
yes<-data[ data[,1] == 0, ]# data with only zeros

together<-c(no, yes)# both data combined together

This removes the zeros and then adds them back in chronological order.
