Question

I have two data frames, A and B, built from exported csv data, which can be summarized as follows (heavily simplified):

dataA <- read.csv2("dataA.csv", header = TRUE)

#       Name        DataA_1     DataA_2     DataA_3       DataA_4        
#        1            4            5            6            5                        
#        2            7            5            6            4                       
#        3            6            5            5            4                        
#        4            3            3            3            4                        
#        5            1            2            4            3

dataB <- read.csv2("dataB.csv", header = TRUE)

#     DataB_1  DataB_2  DataB_3  DataB_4 
#      1        8        3        5

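What I want to do is replace every value in data frame A (except the first column) that is above a certain amount (say 4) with the value from the corresponding column of data frame B. For example, since DataA_2 for the 2nd person (element (2,2) of A, not counting the Name column) is 5, I want to replace it with DataB_2 from data frame B, which is 8. The final result should look like this: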

#       Name        DataA_1     DataA_2     DataA_3       DataA_4        
#        1            4            8            3            5                        
#        2            1            8            3            4                       
#        3            1            8            3            4                        
#        4            3            3            3            4                        
#        5            1            2            4            3

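I have already found a way by building an algorithm with loops, but I am not satisfied with that solution because I want something shorter and faster. I am fairly sure a function such as transmute from library(dplyr) could do it, but I cannot find the solution. If anyone knows how to do this with transmute or another function, please let me know!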

Answer 1

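We can create a logical index and use it to subset the replacement values

# TRUE wherever a value (Name column excluded) is above 4
i1 <- dataA[-1] > 4
# col() gives each cell's column position, which picks the matching dataB value
dataA[-1][i1] <- dataB[col(dataA[-1])][i1]

-output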

dataA
#  Name DataA_1 DataA_2 DataA_3 DataA_4
#1    1       4       8       3       5
#2    2       1       8       3       4
#3    3       1       8       3       4
#4    4       3       3       3       4
#5    5       1       2       4       3
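
To make the indexing step easier to follow, here is a minimal sketch that spells it out on a plain matrix. It rebuilds the question's example data; `threshold`, `vals`, and `repl` are names introduced here for illustration and are not part of the answer above.

```r
# Rebuild the example data from the question
dataA <- data.frame(
  Name    = 1:5,
  DataA_1 = c(4L, 7L, 6L, 3L, 1L),
  DataA_2 = c(5L, 5L, 5L, 3L, 2L),
  DataA_3 = c(6L, 6L, 5L, 3L, 4L),
  DataA_4 = c(5L, 4L, 4L, 4L, 3L)
)
dataB <- data.frame(DataB_1 = 1L, DataB_2 = 8L, DataB_3 = 3L, DataB_4 = 5L)

threshold <- 4                     # hypothetical name for the cutoff

vals <- as.matrix(dataA[-1])       # 5 x 4 matrix of the value columns
i1   <- vals > threshold           # logical matrix: TRUE where a cell must be replaced
repl <- unlist(dataB)[col(vals)]   # candidate replacement for every cell, by column position

vals[i1]  <- repl[i1]              # overwrite only the flagged cells
dataA[-1] <- vals                  # write the value columns back
dataA
```

Because `col(vals)` has the same shape as `vals`, the same logical index `i1` selects matching positions on both sides of the assignment.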

Or with dplyr

library(dplyr)
library(stringr)
dataA %>% 
     mutate(across(-Name, ~ replace(., . > 4, 
          dataB[[str_replace(cur_column(), 'A', 'B')]])))
#  Name DataA_1 DataA_2 DataA_3 DataA_4
#1    1       4       8       3       5
#2    2       1       8       3       4
#3    3       1       8       3       4
#4    4       3       3       3       4
#5    5       1       2       4       3
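
The name lookup above depends on the two data frames using column names that differ only by A and B. If that were not the case, one possible alternative is to pair the columns by position with base R's Map. This is a sketch, not part of the answer above, and it assumes dataB has exactly one column per value column of dataA, in the same order.

```r
# Rebuild the example data from the question
dataA <- data.frame(
  Name    = 1:5,
  DataA_1 = c(4L, 7L, 6L, 3L, 1L),
  DataA_2 = c(5L, 5L, 5L, 3L, 2L),
  DataA_3 = c(6L, 6L, 5L, 3L, 4L),
  DataA_4 = c(5L, 4L, 4L, 4L, 3L)
)
dataB <- data.frame(DataB_1 = 1L, DataB_2 = 8L, DataB_3 = 3L, DataB_4 = 5L)

# Walk the value columns of dataA and the columns of dataB in parallel;
# a is one column of dataA, b is the single replacement value for it
dataA[-1] <- Map(function(a, b) replace(a, a > 4, b), dataA[-1], dataB)
dataA
```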

data

dataA <- structure(list(Name = 1:5, DataA_1 = c(4L, 7L, 6L, 3L, 1L), DataA_2 = c(5L, 
5L, 5L, 3L, 2L), DataA_3 = c(6L, 6L, 5L, 3L, 4L), DataA_4 = c(5L, 
4L, 4L, 4L, 3L)), class = "data.frame", row.names = c(NA, -5L
))

dataB <- structure(list(DataB_1 = 1L, DataB_2 = 8L, DataB_3 = 3L, DataB_4 = 5L), class = "data.frame", row.names = c(NA, 
-1L))

Answer 2

A base R option

dataA[-1] <- (dataA[-1] <= 4) * dataA[-1] + (dataA[-1] > 4) * dataB[rep(1, nrow(dataA)), ]

gives

> dataA
  Name DataA_1 DataA_2 DataA_3 DataA_4
1    1       4       8       3       5
2    2       1       8       3       4
3    3       1       8       3       4
4    4       3       3       3       4
5    5       1       2       4       3
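
The expression works because R coerces TRUE and FALSE to 1 and 0 in arithmetic, so each cell keeps either its own value or the replacement, and `dataB[rep(1, nrow(dataA)), ]` repeats the single row of dataB so the shapes line up. A small sketch on one column of the question's data, where `a`, `b`, `keep`, `swap`, and `result` are illustrative names only:

```r
a <- c(4L, 7L, 6L, 3L, 1L)   # DataA_1 from the question
b <- 1L                      # DataB_1 from the question

keep <- (a <= 4) * a         # values at or below 4 survive, the rest become 0
swap <- (a > 4) * b          # b where the value is above 4, 0 elsewhere

result <- keep + swap
result
```

Note that this arithmetic trick only applies to numeric columns.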
