Question

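I have a tab-delimited text file with 12 columns that I load into my program. I then create another data frame with the same structure as the loaded one and add 2 columns to it.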

excelfile = read.delim(ExcelPath)
matchedPictures<- excelfile[0,]
matchedPictures$beforeName <- character()
matchedPictures$afterName <- character()

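Now I have a function in which I do the following:

Based on a condition, I get the row number (pictureMatchNum) of the row that needs to be copied from excelfile to matchedPictures.

I then need to copy that row from excelfile to matchedPictures. So far I have tried a couple of different approaches.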

2a.

rowNumber = nrow(matchedPictures) + 1
matchedPictures[rowNumber,1:12] <<- excelfile[pictureMatchNum,1:12]

2b.

matchedPictures[rowNumber,1:12] <<- rbind(matchedPictures, excelfile[pictureWordMatches,1:12], make.row.names = FALSE)

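2a. does not seem to work, because it copies the row indexes from excelfile and uses them as row names in matchedPictures, which is why I decided to use rbind.

2b. does not seem to work, because rbind needs the columns to be the same, and matchedPictures has 2 extra columns.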

EDIT START - including a reproducible example.

Here is some reproducible code (with fewer columns and fake data):

library(stringr)  # provides str_detect(), regex(), and the 'words' and 'fruit' vectors

excelfile <- data.frame(x = letters, y = words[length(letters)], z = fruit[length(letters)])
matchedPictures <- excelfile[0,]
matchedPictures$beforeName <- character()
matchedPictures$afterName <- character()

pictureMatchNum1 = match(1, str_detect("A", regex(excelfile$x, ignore_case = TRUE)))
rowNumber1 = nrow(matchedPictures) + 1

pictureMatchNum2 = match(1, str_detect("D", regex(excelfile$x, ignore_case = TRUE)))
rowNumber2 = nrow(matchedPictures) + 1

The two options I tried are:

2a.

matchedPictures[rowNumber1,1:3] <<- excelfile[pictureMatchNum1,1:3]
matchedPictures[rowNumber1,"beforeName"] <<- "xxx"
matchedPictures[rowNumber1,"afterName"] <<- "yyy"

matchedPictures[rowNumber2,1:3] <<- excelfile[pictureMatchNum2,1:3]
matchedPictures[rowNumber2,"beforeName"] <<- "uuu"
matchedPictures[rowNumber2,"afterName"] <<- "www"

OR

2b.

matchedPictures[rowNumber1,1:3] <<- rbind(matchedPictures, excelfile[pictureMatchNum1,1:3], make.row.names = FALSE)
matchedPictures[rowNumber1,"beforeName"] <<- "xxx"
matchedPictures[rowNumber1,"afterName"] <<- "yyy"

matchedPictures[rowNumber2,1:3] <<- rbind(matchedPictures, excelfile[pictureMatchNum2,1:3], make.row.names = FALSE)
matchedPictures[rowNumber2,"beforeName"] <<- "uuu"
matchedPictures[rowNumber2,"afterName"] <<- "www"

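EDIT END

Also, I have seen suggestions in many places that, instead of starting from an empty data frame, one should keep vectors, append the data to those vectors, and then combine them into a data frame. Is that suggestion still valid when I have this many columns and would need 14 separate vectors, each copied individually?

What can I do to make this work?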

Answer 1

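You can:

first determine the row indices of excelfile that match your criteria,
extract these rows,
then generate the data to fill your columns beforeName and afterName,
then attach these columns to the new data frame.

Example: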

library(stringr)  # str_detect(), regex(), and the 'words' and 'fruit' vectors

excelfile <- data.frame(x = letters, y = words[length(letters)],
                        z = fruit[length(letters)])
## Vector of patterns:
patternVec <- c("A", "D", "M")
## Look for appropriate rows in file 'excelfile':
indexVec <- vapply(patternVec,
                   function(myPattern) which(str_detect(myPattern,
                       regex(excelfile$x, ignore_case = TRUE))),
                   integer(1))
## Extract these rows:
matchedPictures <- excelfile[indexVec,]
## Somehow generate the data for columns 'beforeName' and 'afterName':
## I do not know how this information is generated so I just insert 
## some dummy code here:
beforeNameVec <- c("xxx", "uuu", "mmm")
afterNameVec <- c("yyy", "www", "nnn")
## Then assign these variables:
matchedPictures$beforeName <- beforeNameVec
matchedPictures$afterName <- afterNameVec

matchedPictures
# x   y           z beforeName afterName
# a air dragonfruit        xxx       yyy
# d air dragonfruit        uuu       www
# m air dragonfruit        mmm       nnn
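
If the rows really have to be appended one at a time, as in option 2b of the question, one possible sketch is to add the two extra columns to the copied row before calling rbind, so that both data frames have the same columns. The data below is made up and uses base R only, and appendMatch is a hypothetical helper name, not something from the question.

```r
# Toy stand-in for the uploaded file (made-up data, base R only)
excelfile <- data.frame(x = letters[1:5], y = 1:5, z = LETTERS[1:5],
                        stringsAsFactors = FALSE)

# Empty data frame with the same columns plus the two extra ones
matchedPictures <- excelfile[0, ]
matchedPictures$beforeName <- character()
matchedPictures$afterName <- character()

# Hypothetical helper: copy one row and fill in the extra columns first,
# so that rbind() sees identical column sets on both sides
appendMatch <- function(target, source, rowNum, before, after) {
  newRow <- source[rowNum, , drop = FALSE]
  newRow$beforeName <- before
  newRow$afterName <- after
  # make.row.names = FALSE discards the source row names (e.g. "4")
  rbind(target, newRow, make.row.names = FALSE)
}

matchedPictures <- appendMatch(matchedPictures, excelfile, 1, "xxx", "yyy")
matchedPictures <- appendMatch(matchedPictures, excelfile, 4, "uuu", "www")
```

Because the helper returns the enlarged data frame and the caller reassigns it, no `<<-` is needed. Growing a data frame row by row gets slow for many rows, so the vectorized approach above is usually preferable when all matches are known up front.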

Answer 2

Using dplyr can make this simpler:

library(dplyr)
library(stringr)

excelfile <- data.frame(x = letters, y = words[length(letters)], z= fruit[length(letters)],
stringsAsFactors = FALSE ) #add stringsAsFactors to have character columns

pictureMatch <- excelfile %>%
  #create a match column
  mutate(match = ifelse(str_detect(x,"a") | str_detect(x,'d'),1,0)) %>% 
  #filter to only the rows that match your condition
  filter(match ==1)

pictureMatch <- pictureMatch[['x']] #convert to a vector

matchedPictures <- excelfile %>%
  filter(x %in% pictureMatch) %>% #grab the rows that match your condition
  mutate(beforeName = c('xxx','uuu'), #add your names
     afterName = c('yyy','www'))
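
On the side question about keeping 14 separate vectors: a commonly used alternative is to collect one-row data frames in a list and bind them once at the end. The following is only a sketch in base R with made-up data; matchRows, beforeNames and afterNames are hypothetical names standing in for whatever the matching function produces.

```r
# Toy stand-in for the uploaded file (made-up data, base R only)
excelfile <- data.frame(x = letters[1:5], y = 1:5, z = LETTERS[1:5],
                        stringsAsFactors = FALSE)

# Hypothetical inputs: matched row numbers and the two new values per match
matchRows   <- c(1, 4)
beforeNames <- c("xxx", "uuu")
afterNames  <- c("yyy", "www")

# Build one small data frame per match and keep them in a list
pieces <- lapply(seq_along(matchRows), function(i) {
  oneRow <- excelfile[matchRows[i], , drop = FALSE]
  oneRow$beforeName <- beforeNames[i]
  oneRow$afterName <- afterNames[i]
  oneRow
})

# Bind everything in a single call, then reset the row names
matchedPictures <- do.call(rbind, pieces)
rownames(matchedPictures) <- NULL
```

This keeps all columns together in each piece, so the number of columns in the file does not matter, and setting the row names to NULL replaces the indexes carried over from excelfile with 1, 2, and so on.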

In R, how do I copy rows from one data frame to another when the destination data frame has 2 extra columns?