I have the following data:
library(xts)
values<-c(2,2,2,4,2,3,0,0,0,0,0,1,2,3,2)
time1<-seq(from=as.POSIXct("2013-01-01 00:00"),to=as.POSIXct("2013-01-1 14:00"),by="hour")
data<-xts(values,order.by=time1)
data
[,1]
2013-01-01 00:00:00 2
2013-01-01 01:00:00 2
2013-01-01 02:00:00 2
2013-01-01 03:00:00 4
2013-01-01 04:00:00 2
2013-01-01 05:00:00 3
2013-01-01 06:00:00 0
2013-01-01 07:00:00 0
2013-01-01 08:00:00 0
2013-01-01 09:00:00 0
2013-01-01 10:00:00 0
2013-01-01 11:00:00 1
2013-01-01 12:00:00 2
2013-01-01 13:00:00 3
2013-01-01 14:00:00 2
Now I want to remove all the zeros, which is easy to do with:
remove_zerro = apply(data, 1, function(row) all(row !=0 ))
data[remove_zerro,]
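The problem is that, after working with the zero-free data and making some modifications, I want to put the zeros back into my data at the same dates and times. Any ideas would be appreciated.
Answer 0: (score: 1)
Here are two possible approaches: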
# re-create your data set
library(xts)
values<-c(2,2,2,4,2,3,0,0,0,0,0,1,2,3,2)
time1<-seq(from=as.POSIXct("2013-01-01 00:00"),to=as.POSIXct("2013-01-1 14:00"),by="hour")
data<-xts(values,order.by=time1)
data
###############################################
# SOLUTION 1 :
# make a union of the "zero" series and the "zero-free" series
# create a copy of data with no zero
isNotZero = apply(data, 1, function(row) all(row != 0 ))
zeroFreeSeries <- data[isNotZero,]
zeroSeries <- data[!isNotZero,]
# do your calculations on the "zero-free" series (e.g. add 10 to all values)
zeroFreeSeries <- zeroFreeSeries + 10
# union
unionSeries <- rbind(zeroSeries,zeroFreeSeries)
# now unionSeries contains what you desire
unionSeries
###############################################
# SOLUTION 2 :
# keep a copy of the original series and, after doing your operations
# on the "zero-free" series, update the original copy with the new
# values (this doesn't work well if you remove some dates from the
# zero-free series)
# create a copy of data with no zero
isNotZero = apply(data, 1, function(row) all(row != 0 ))
zeroFreeSeries <- data[isNotZero,]
# do your operations on the "zero-free" series (e.g. add 10 to all values)
zeroFreeSeries <- zeroFreeSeries + 10
# modify the original data by setting the new values
data[time(zeroFreeSeries),] <- zeroFreeSeries
# now data contains what you desire
data
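Both solutions above come down to "modify only the non-zero positions and leave the timestamps alone". As a minimal sketch of that idea, assuming a single-column series and an element-wise modification, the values can be pulled out with coredata(), changed through a logical mask, and rebuilt on the original index. The names vals, keep and result are illustrative and not part of the original answer.

```r
library(xts)

# Re-create the series from the question
values <- c(2,2,2,4,2,3,0,0,0,0,0,1,2,3,2)
time1 <- seq(from = as.POSIXct("2013-01-01 00:00"),
             to = as.POSIXct("2013-01-01 14:00"), by = "hour")
data <- xts(values, order.by = time1)

# Pull out the raw values and flag the non-zero positions
vals <- as.numeric(coredata(data))
keep <- vals != 0

# Example modification (add 10), applied only where the value is non-zero
vals[keep] <- vals[keep] + 10

# Rebuild the series on the original timestamps; the zeros never moved
result <- xts(vals, order.by = index(data))
result
```

This avoids the row-wise apply() call, but it only fits when the modification does not depend on the zero-free series being contiguous (a rolling calculation over the zero-free values, for instance, would still need the split shown in Solution 1).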
Answer 1: (score: 1)
It looks like you might want to use a sparse vector/matrix:
install.packages("spam")
library(spam)
sx <- c(0,0,3, 3.2, 0,0,0,-3:1,0,0,2,0,0,5,0,0)
apply.spam(spam(sx), NULL, function(x){1 / x})
[,1]
[1,] 0.0000000
[2,] 0.0000000
[3,] 0.3333333
[4,] 0.3125000
[5,] 0.0000000
[6,] 0.0000000
[7,] 0.0000000
[8,] -0.3333333
[9,] -0.5000000
[10,] -1.0000000
[11,] 0.0000000
[12,] 1.0000000
[13,] 0.0000000
[14,] 0.0000000
[15,] 0.5000000
[16,] 0.0000000
[17,] 0.0000000
[18,] 0.2000000
[19,] 0.0000000
[20,] 0.0000000
If you do the same thing with the zeros included:
> apply(matrix(sx), 1, function(x){1 / x})
[1] Inf Inf 0.3333333 0.3125000 Inf Inf
[7] Inf -0.3333333 -0.5000000 -1.0000000 Inf 1.0000000
[13] Inf Inf 0.5000000 Inf Inf 0.2000000
[19] Inf Inf
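So you can see that apply.spam ignores the zeros but automatically puts them back. The downside is that you have to reattach the time labels after processing.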
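Reattaching the time labels could look roughly like the sketch below. It assumes the processed values are available as an ordinary one-column matrix in the original row order (a sparse result would first have to be converted to one); here a plain matrix named processed stands in for that result so the example runs without the spam package. The names processed and relabelled are illustrative.

```r
library(xts)

# Re-create the series from the question
values <- c(2,2,2,4,2,3,0,0,0,0,0,1,2,3,2)
time1 <- seq(from = as.POSIXct("2013-01-01 00:00"),
             to = as.POSIXct("2013-01-01 14:00"), by = "hour")
data <- xts(values, order.by = time1)

# Stand-in for the result of the sparse computation: reciprocal of the
# non-zero values, zeros left in place, same row order as the input
processed <- matrix(ifelse(values == 0, 0, 1 / values), ncol = 1)

# Reattach the time labels by reusing the index of the original series
relabelled <- xts(processed, order.by = index(data))
relabelled
```

This only works because the row order and row count are unchanged by the processing step; if rows were dropped or reordered, the original index would no longer line up.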
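Answer 2: (score: 0)
I am building on @zx8754's comment.
One way is to split the data frame. If you are worried about messing up the index or about joining the data frames back together, here is another approach.
Create a TRUE/FALSE index.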
idx <- df$col != 0      # TRUE where the value is non-zero
df$col[idx] <- 2007     # or whatever operation.
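A small runnable version of the logical-index idea, assuming a hypothetical data frame df with a time column and a value column named col (neither is defined in the original answer), and using "add 10" as a placeholder operation:

```r
# Hypothetical data frame; the column name `col` follows the answer above
df <- data.frame(
  time = as.POSIXct("2013-01-01 00:00", tz = "UTC") + 3600 * 0:4,
  col  = c(2, 0, 3, 0, 1)
)

# TRUE where the value is non-zero
idx <- df$col != 0

# Modify only the non-zero rows; the zero rows and the time column stay untouched
df$col[idx] <- df$col[idx] + 10
df
```

Because nothing is removed from the data frame, there is nothing to re-insert afterwards.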
Answer 3: (score: 0)
Apparently this is the solution:
no <- data[data[,1] != 0, ]   # data without zeros
yes <- data[data[,1] == 0, ]  # data with only zeros
together <- c(no, yes)        # both pieces combined back together