
Question

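I'm hoping someone out there can help me find the root of a frustrating problem I'm having with my code in R. I have a list of data frames, and I want to left join each element to one of two OTHER data frames (call them A and B). Which of the auxiliary data frames to join depends on the element's position in the list. For my purposes, I want every odd element to be left-joined to A and every even element to be left-joined to B.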

library(dplyr)
DF <- data.frame(Num = c("1","2"), Let = c("a","b"), stringsAsFactors = FALSE)
A <- data.frame(Let = c("a","b"), Col = c("Yellow","Red"), stringsAsFactors = FALSE)
B <- data.frame(Let = c("a","b"), Col = c("Green","Blue"), stringsAsFactors = FALSE)
LIST <- list(DF, DF)

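So far I have tried to do this in two different ways. The first approach involves an if-else statement. If I apply such a statement to assign an integer value based on position, I get the expected result. Likewise, when I leave out the if-else statement and simply perform a series of left joins on the list elements, everything works as expected.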

lapply(seq_along(LIST), function(x, {ifelse((x %% 2)==0, y[[x]] <- 1, y[[x]] <- 2)}, y = LIST)
lapply(seq_along(LIST), function(x, {left_join(y[[x]], A, by = c("Let"))}, y = LIST)

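Where I run into trouble is when I try to combine the if-else statement and the left join. In particular, I end up with a list of lists in which only the first column of each corresponding original data frame is retained.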

lapply(seq_along(LIST), function(x, y) {ifelse((x %% 2)==0, left_join(y[[x]], A, by = c("Let")), left_join(y[[x]], B, by = c("Let")))}, y = LIST)

This is the output I would like to obtain:

[[1]]
  Let Num    Col
1   a   1 Yellow
2   b   2    Red

[[2]]
  Let Num   Col
1   a   1 Green
2   b   2  Blue

I'm sure this problem has a ridiculously simple solution. Can anyone see it?

Thanks in advance! Matthew

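P.S.: I also tried a second approach, using subsetting instead of an if-else statement. However, I got stuck again. The first line below works as expected, but the second line returns an error, as if R did not recognize the list index: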

lapply(seq_along(LIST), function(x, y) {left_join(y[[x > 0]], A, by = c("Let"))}, y = LIST)
lapply(seq_along(LIST), function(x, y) {left_join(y[[x == 1]], A, by = c("Let"))}, y = LIST)

Error in y[[x == 1]] : attempt to select less than one element in integerOneIndex
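
For readers reproducing the postscript, here is a minimal base R sketch of what appears to be happening. It assumes a hypothetical two-element list `y` in place of LIST. Inside `[[ ]]`, a logical value is coerced to a number, so `TRUE` selects element 1 and `FALSE` asks for element 0, which does not exist.

```r
# Hypothetical two-element list standing in for LIST
y <- list("first", "second")

# `x > 0` is TRUE for every position, and TRUE is coerced to the index 1,
# so every iteration picks the first element
y[[TRUE]]

# `x == 1` is FALSE when x is 2, and FALSE is coerced to the index 0
res <- tryCatch(y[[FALSE]], error = function(e) e)
inherits(res, "error")

# Indexing by the position itself avoids the coercion
picked <- lapply(seq_along(y), function(x) y[[x]])
```

This would also explain why the `x > 0` line only looks correct: both elements of LIST are the same data frame, so always selecting the first one goes unnoticed.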

Answer 1

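I'm not entirely sure I understand your question.

The following solution is based on reproducing the output of lapply(seq_along(LIST), function(x, y) {left_join(y[[x > 0]], A, by = c("Let"))}, y = LIST) from your postscript. Note that the other lapply lines throw errors.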

library(tidyverse);
map(list(A, B), function(x) left_join(DF, x))
#Joining, by = "Let"
#Joining, by = "Let"
#[[1]]
#  Num Let    Col
#1   1   a Yellow
#2   2   b    Red
#
#[[2]]
#  Num Let   Col
#1   1   a Green
#2   2   b  Blue

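We use purrr::map with dplyr::left_join to join each of A and B to DF.

The same can be achieved in base R using mapply (equivalent to Map when SIMPLIFY = FALSE) and merge: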

mapply(function(x) merge(DF, x, by = "Let"), list(A, B), SIMPLIFY = F)
#[[1]]
#  Let Num    Col
#1   a   1 Yellow
#2   b   2    Red
#
#[[2]]
#  Let Num   Col
#1   a   1 Green
#2   b   2  Blue

Answer 2

Overview

Use base::mapply() to return a list of conditionally merged data frames. Here I supply two inputs:

seq_along( along.with = LIST ) to get the number of elements in LIST; and
LIST itself.

The FUN parameter is an anonymous function that takes two inputs - i and j - and tests whether the current element in LIST is even or odd numbered before performing a left join with base::merge().

If the result of the modulus operator for the i-th element of seq_along( along.with = LIST ) equals zero, B is left-joined to the j-th element of LIST; if it does not equal zero, A is left-joined to the j-th element of LIST.

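Tidyverse approach

The result can also be replicated with functions from the purrr and dplyr packages. The script below is the base::mapply() version described in the overview.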

# load data
DF <- data.frame(Num = c("1","2"), Let = c("a","b"), stringsAsFactors = FALSE)
A <- data.frame(Let = c("a","b"), Col = c("Yellow","Red"), stringsAsFactors = FALSE)
B <- data.frame(Let = c("a","b"), Col = c("Green","Blue"), stringsAsFactors = FALSE)
LIST <- list(DF, DF)

# goal: left join all odd elements in LIST[[j]]
#       to `A` and all even elements to `B`
merged.list <- 
  mapply( FUN = function( i, j )
          if( i %% 2 == 0 ){
            merge( x = j
                   , y = B
                   , by = "Let"
                   , all.x = TRUE )
          } else{
            merge( x = j
                   , y = A
                   , by = "Let"
                   , all.x = TRUE )
          }
        , seq_along( along.with = LIST )
        , LIST
        , SIMPLIFY = FALSE )

# view results
merged.list
# [[1]]
# Let Num    Col
# 1   a   1 Yellow
# 2   b   2    Red
# 
# [[2]]
# Let Num   Col
# 1   a   1 Green
# 2   b   2  Blue

# end of script #
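
As a rough illustration of the purrr and dplyr variant mentioned above, here is a sketch that pairs each element of LIST with its position using `purrr::map2()`. The name `merged.list.tidy` is hypothetical, and this is only one way such a version could be written, not necessarily the code originally posted.

```r
library(dplyr)
library(purrr)

# same sample data as above
DF <- data.frame(Num = c("1","2"), Let = c("a","b"), stringsAsFactors = FALSE)
A <- data.frame(Let = c("a","b"), Col = c("Yellow","Red"), stringsAsFactors = FALSE)
B <- data.frame(Let = c("a","b"), Col = c("Green","Blue"), stringsAsFactors = FALSE)
LIST <- list(DF, DF)

# map2() walks over LIST and its positions in parallel;
# even positions are joined to B, odd positions to A
merged.list.tidy <- map2(LIST, seq_along(LIST), function(df, i) {
  if (i %% 2 == 0) {
    left_join(df, B, by = "Let")
  } else {
    left_join(df, A, by = "Let")
  }
})
```

Unlike merge(), left_join() keeps the columns of the left-hand data frame in their original order, so Num comes before Let in each result.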

Answer 3

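MauritsEvers has already answered your question, but I thought I would address the apparent errors in R syntax and programming logic. Focusing on the first lapply call: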

lapply(seq_along(LIST), function(x, {ifelse((x %% 2)==0, y[[x]] <- 1, y[[x]] <- 2)}, y = LIST)

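First, and perhaps trivially, there is the missing closing ) of the function's argument list in the first lapply call. Next, and more fundamentally, is the erroneous use of ifelse as a programming construct. The ifelse function is not designed for serial testing of data objects. It is only designed to be applied along a single vector. If you want serial selection, the if(.){.}else{.} function should probably be used inside the lapply call.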
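
To make the difference concrete, here is a minimal base R sketch. It assumes the sample data from the question and uses a hypothetical helper `join_to()`, built on merge(..., all.x = TRUE), as a stand-in for left_join().

```r
DF <- data.frame(Num = c("1", "2"), Let = c("a", "b"), stringsAsFactors = FALSE)
A  <- data.frame(Let = c("a", "b"), Col = c("Yellow", "Red"), stringsAsFactors = FALSE)
B  <- data.frame(Let = c("a", "b"), Col = c("Green", "Blue"), stringsAsFactors = FALSE)
LIST <- list(DF, DF)

# hypothetical helper: base R stand-in for left_join(df, lookup, by = "Let")
join_to <- function(df, lookup) merge(df, lookup, by = "Let", all.x = TRUE)

# ifelse() shapes its result like the length-one test, so the joined
# data frame is cut down to a one-element list
bad <- ifelse(1 %% 2 == 0, join_to(DF, B), join_to(DF, A))
is.data.frame(bad)

# if/else returns whichever branch is chosen, unchanged
good <- lapply(seq_along(LIST), function(x) {
  if (x %% 2 == 0) join_to(LIST[[x]], B) else join_to(LIST[[x]], A)
})
```

The truncated `bad` result matches the symptom described in the question, where only the first column of each data frame survives.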

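However (now trying to implement the first paragraph rather than continuing to correct the code), I think it would be simpler to use logical indexing on the LIST object (using R's implicit recycling) rather than any looping procedure. (This is not an integer solution.) This code splits LIST into "odd" and "even" components: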

  oddList <- LIST[ c(TRUE,FALSE) ]  # implicit seq-along by virtue of recycling
  evenList <- LIST[ c(FALSE,TRUE) ]
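
A small sketch of how the recycling plays out on a longer list, assuming a hypothetical four-element list `LIST4` whose names exist only to make the selection visible:

```r
# Hypothetical four-element list; names are for readability only
LIST4 <- list(d1 = 1, d2 = 2, d3 = 3, d4 = 4)

# c(TRUE, FALSE) is recycled to TRUE, FALSE, TRUE, FALSE
oddList4  <- LIST4[c(TRUE, FALSE)]
evenList4 <- LIST4[c(FALSE, TRUE)]

names(oddList4)   # "d1" "d3"
names(evenList4)  # "d2" "d4"
```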

We can use results of this type to build two one-liners that accomplish the stated goal. I made the LIST object four-wide rather than two-wide.

Abig <- Reduce( function(x,y) {merge(x,y,by="Let")}, LIST, init=A)
Warning message:
In merge.data.frame(x, y, by = "Let") :
  column names ‘Num.x’, ‘Num.y’ are duplicated in the result
Bbig <- Reduce( function(x,y) {merge(x,y,by="Let")}, LIST, init=B)
Warning message:
In merge.data.frame(x, y, by = "Let") :
  column names ‘Num.x’, ‘Num.y’ are duplicated in the result

That is only a warning, and here you can see what the warning is about:

> Abig
  Let    Col Num.x Num.y Num.x Num.y
1   a Yellow     1     1     1     1
2   b    Red     2     2     2     2

If you need those duplicated column names labelled uniquely (and I think that would be a good idea), then:

names(Abig)[ grep("Num", names(Abig)) ] <- 
                    paste0("Num.", seq_along( grep("Num", names(Abig)) ) )
Abig
  Let    Col Num.1 Num.2 Num.3 Num.4
1   a Yellow     1     1     1     1
2   b    Red     2     2     2     2

Answer 4

This solution is very similar to the mapply solutions already posted here (by @MauritsEvers & @aspiringurbandatascientist), but it uses a different approach to join the data.frames: dplyr::left_join is used for that purpose.

library(dplyr)
# Using mapply and left_join
mapply(function(x,y){
  if(y %% 2 == 1){
    left_join(x, A, by="Let")
  }else {
    left_join(x, B, by="Let")
  }
}, LIST, seq_along(LIST), SIMPLIFY = FALSE)

# [[1]]
#   Num Let    Col
# 1   1   a Yellow
# 2   2   b    Red
# 
# [[2]]
#   Num Let   Col
# 1   1   a Green
# 2   2   b  Blue

Answer 5

For clarity, I restate the sample data, slightly modified:

Data

DF1 <- data.frame(Num1 = c("1","2"), Let = c("a","b"), stringsAsFactors = FALSE)
DF2 <- data.frame(Num2 = c("3","4"), Let = c("a","b"), stringsAsFactors = FALSE)
DF3 <- data.frame(Num3 = c("5","6"), Let = c("a","b"), stringsAsFactors = FALSE)
DF4 <- data.frame(Num4 = c("7","8"), Let = c("a","b"), stringsAsFactors = FALSE)
A <- data.frame(Let = c("a","b"), Col = c("Yellow","Red"), stringsAsFactors = FALSE)
B <- data.frame(Let = c("a","b"), Col = c("Green","Blue"), stringsAsFactors = FALSE)
LIST <- list(DF1, DF2, DF3, DF4)

Solution

library(dplyr)
library(purrr)
LIST_odd <- LIST[as.logical(seq_along(LIST)%%2)]
LIST_even <- LIST[!as.logical(seq_along(LIST)%%2)]
merge_odd <- reduce(LIST_odd,left_join,.init=A)
#   Let    Col Num1 Num3
# 1   a Yellow    1    5
# 2   b    Red    2    6
merge_even <- reduce(LIST_even,left_join,.init=B)
#   Let   Col Num2 Num4
# 1   a Green    3    7
# 2   b  Blue    4    8

If you don't want to use purrr, dplyr and base alone give the same result:

Reduce(left_join,LIST_odd,A)
Reduce(left_join,LIST_even,B)

Or 100% base:

Reduce(function(x,y) merge(x,y,all.x=TRUE),LIST_odd,A)
Reduce(function(x,y) merge(x,y,all.x=TRUE),LIST_even,B)

R: Combining lapply and left_join to conditionally merge data frames
