Question

I have a data frame df and a list of indices L; at those positions I need to put 0 in place of the current df values.

Example:

DF:

# A tibble: 11 x 3
      A     B     C
    <dbl> <dbl> <dbl>
    1724     4  2013
    1758     4  2013
    1612     3  2013
    1692     3  2013
    1260    33  2014
    1157    22  2014
    1359    63  2014
    1414    27  2014
    387     3  2016
    374     3  2016

L:

[[1]]
[1] 3 4

[[2]]
[1] 1 2 3 4 5

[[3]]
[1] 1

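So in this example, I have to put 0 in rows 3 and 4 of column A, rows 1:5 of column B, and row 1 of column C.

Is there a way to do this in R as a one-liner? A dplyr or base R solution would be great! Also, I would like to avoid apply or loops, because I have to do this very efficiently.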

Answer 1

Another way, using an index matrix:

# DF <- read.table(textConnection('A     B  C
#     1724     4  2013
#     1758     4  2013
#     1612     3  2013
#     1692     3  2013
#     1260    33  2014
#     1157    22  2014
#     1359    63  2014
#     1414    27  2014
#     387     3  2016
#     374     3  2016'), header = T)
# 
# L <- list(c(3, 4), c(1, 2, 3, 4, 5), c(1))


Lcol <- rep(seq_along(L), lengths(L))
DF[cbind(unlist(L), Lcol)] <- 0

# > DF
#       A  B    C
# 1  1724  0    0
# 2  1758  0 2013
# 3     0  0 2013
# 4     0  0 2013
# 5  1260  0 2014
# 6  1157 22 2014
# 7  1359 63 2014
# 8  1414 27 2014
# 9   387  3 2016
# 10  374  3 2016
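
To see why this works, it may help to print the index matrix itself. The following is a minimal sketch that rebuilds the example data with data.frame() (an assumption made here so the snippet is self-contained; the answer itself uses read.table). The name idx is introduced only for illustration. Each row of the two-column matrix is a (row, column) pair, and assigning through it sets exactly those cells.

```r
# Example data from the question, rebuilt as a plain data.frame
DF <- data.frame(
  A = c(1724, 1758, 1612, 1692, 1260, 1157, 1359, 1414, 387, 374),
  B = c(4, 4, 3, 3, 33, 22, 63, 27, 3, 3),
  C = c(2013, 2013, 2013, 2013, 2014, 2014, 2014, 2014, 2016, 2016)
)
L <- list(c(3, 4), c(1, 2, 3, 4, 5), c(1))

# Column number for each index: i is repeated lengths(L)[i] times
Lcol <- rep(seq_along(L), lengths(L))   # 1 1 2 2 2 2 2 3

# Two-column matrix of (row, column) pairs, one row per cell to overwrite
# (idx is a hypothetical name used only in this sketch)
idx <- cbind(row = unlist(L), col = Lcol)
print(idx)

# Matrix indexing on the left-hand side assigns to exactly those cells
DF[idx] <- 0
print(DF)
```

Because all cells are addressed in a single assignment, no explicit loop over the columns is needed.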

Answer 2

A loop seems pretty fast. I haven't done a complexity comparison, but if your indices come as a list and you want to replace the values with 'val', simply do:

df
    a  b  c
1   1  1  1
2   2  2  2
3   3  3  3
4   4  4  4
5   5  5  5
6   6  6  6
7   7  7  7
8   8  8  8
9   9  9  9
10 10 10 10

val <- 0
# Column i of df gets val at the rows listed in L[[i]]
for (i in seq_along(L)) {
  df[L[[i]], i] <- val
}

df
    a  b  c
1   1  0  0
2   2  0  2
3   0  0  3
4   0  0  4
5   5  0  5
6   6  6  6
7   7  7  7
8   8  8  8
9   9  9  9
10 10 10 10

I tested it on x, a df with 10,000 rows and 10,0000 columns:

> b<-Sys.time()
> for(i in 1:length(L)){
+ x[L[[i]],i]<-0
+ }
> Sys.time()-b
Time difference of 0.490464 secs

Seems fast :) I know it's obvious, but I hope it helps!

******** EDIT 1 ********

If we look at @mt1022's method, which uses unlist and cbind:

> b<-Sys.time()
> Lcol <- rep(seq_along(L), lengths(L))
> x[cbind(unlist(L), Lcol)] <- 0
> Sys.time()-b
Time difference of 7.467723 secs

Clearly much slower (because once we unlist, we are effectively going through every single element in L rather than every vector in L). ;)
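
These timings depend heavily on the shape of the data and of L, neither of which is shown in this answer. Below is a rough sketch for rerunning the comparison on your own machine. The sizes, the random construction of x and L, and the helper names zero_loop and zero_matrix are all assumptions made for illustration, and no particular timing is implied.

```r
set.seed(1)
n_row <- 1000   # assumed size, smaller than the test in the answer
n_col <- 200    # assumed size

# Hypothetical test data: a numeric data frame and one index vector per column
x <- as.data.frame(matrix(runif(n_row * n_col), nrow = n_row))
L <- lapply(seq_len(n_col), function(j) sample.int(n_row, 50))

# Loop approach from this answer: one assignment per column
zero_loop <- function(d, L) {
  for (i in seq_along(L)) {
    d[L[[i]], i] <- 0
  }
  d
}

# Index-matrix approach from Answer 1: one assignment for all cells
zero_matrix <- function(d, L) {
  Lcol <- rep(seq_along(L), lengths(L))
  d[cbind(unlist(L), Lcol)] <- 0
  d
}

# Both approaches should produce the same data frame
res_loop <- zero_loop(x, L)
res_matrix <- zero_matrix(x, L)
stopifnot(isTRUE(all.equal(res_loop, res_matrix)))

# Time each approach; results vary with machine and data shape
print(system.time(zero_loop(x, L)))
print(system.time(zero_matrix(x, L)))
```

Checking that both results agree before timing them guards against comparing two approaches that do different work.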

Answer 3

Another option is to use mapply together with do.call.

  do.call(cbind, mapply(function(x,y){
    df[x,y]<-0
    df[y]
  }, mylist, seq_along(mylist)))

  #         A  B    C
  # [1,] 1724  0    0
  # [2,] 1758  0 2013
  # [3,]    0  0 2013
  # [4,]    0  0 2013
  # [5,] 1260  0 2014
  # [6,] 1157 22 2014
  # [7,] 1359 63 2014
  # [8,] 1414 27 2014
  # [9,]  387  3 2016
  # [10,]  374  3 2016

Data:

df <- read.table(text = "A B C
1724 4 2013
1758 4 2013
1612 3 2013
1692 3 2013
1260 33 2014
1157 22 2014
1359 63 2014
1414 27 2014
387 3 2016
374 3 2016", header = TRUE)

mylist <- list(c(3, 4), c(1, 2, 3, 4, 5), c(1))
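
Note that the do.call(cbind, ...) call in this answer returns a matrix, as the [1,] row labels in its output show, not a data frame. If a data frame is wanted, one possible variation (not part of the original answer) is to use Map with replace and assign the result back through result[]. The variable name result is hypothetical, and the data is rebuilt with data.frame() only so the snippet is self-contained.

```r
# Same example data as above, rebuilt as a plain data.frame
df <- data.frame(
  A = c(1724, 1758, 1612, 1692, 1260, 1157, 1359, 1414, 387, 374),
  B = c(4, 4, 3, 3, 33, 22, 63, 27, 3, 3),
  C = c(2013, 2013, 2013, 2013, 2014, 2014, 2014, 2014, 2016, 2016)
)
mylist <- list(c(3, 4), c(1, 2, 3, 4, 5), c(1))

# Map walks the columns of df and the index vectors in parallel;
# replace() returns a copy of each column with the given positions set to 0
result <- df
result[] <- Map(function(col, idx) replace(col, idx, 0), df, mylist)

print(result)
str(result)   # still a data.frame with columns A, B and C
```

Assigning into result[] keeps the data frame structure and column names, while df itself is left unchanged.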

In R, accessing indices stored in a list
