我有一个带有12个列的制表符分隔文本文件,我将其上传到我的程序中。我继续创建另一个数据框,其结构类似于上传的数据框,并为其添加2个列。
excelfile = read.delim(ExcelPath)
matchedPictures<- excelfile[0,]
matchedPictures$beforeName <- character()
matchedPictures$afterName <- character()
现在我有一个功能,我可以在其中执行以下操作:
pictureMatchNum
复制到excelfile
的行的行号matchedPictures
。然后我应该将行从excelfile
复制到matchedPictures
。到目前为止,我尝试了几种不同的方式。
一个。
rowNumber = nrow(matchedPictures) + 1
matchedPictures[rowNumber,1:12] <<- excelfile[pictureMatchNum,1:12]
湾
matchedPictures[rowNumber,1:12] <<- rbind(matchedPictures, excelfile[pictureWordMatches,1:12], make.row.names = FALSE)
2a上。似乎没有用,因为它复制了excelfile
中的索引并将它们用作matchedPictures
中的行名称 - 这就是为什么我决定使用rbind
2B。似乎不起作用,因为rbind需要列相同,matchedPictures
有2个额外的列。
编辑开始 - 包括可重复的示例。
这是一些可重现的代码(具有较少的列和假数据)
excelfile <- data.frame(x = letters, y = words[length(letters)], z= fruit[length(letters)] )
matchedPictures <- excelfile[0,]
matchedPictures$beforeName <- character()
matchedPictures$afterName <- character()
pictureMatchNum1 = match(1, str_detect("A", regex(excelfile$x, ignore_case = TRUE)))
rowNumber1 = nrow(matchedPictures) + 1
pictureMatchNum2 = match(1, str_detect("D", regex(excelfile$x, ignore_case = TRUE)))
rowNumber2 = nrow(matchedPictures) + 1
我尝试的两个选项是
2a上。
matchedPictures[rowNumber1,1:3] <<- excelfile[pictureMatchNum1,1:3]
matchedPictures[rowNumber1,"beforeName"] <<- "xxx"
matchedPictures[rowNumber1,"afterName"] <<- "yyy"
matchedPictures[rowNumber2,1:3] <<- excelfile[pictureMatchNum2,1:3]
matchedPictures[rowNumber2,"beforeName"] <<- "uuu"
matchedPictures[rowNumber2,"afterName"] <<- "www"
OR
2B。
matchedPictures[rowNumber1,1:3] <<- rbind(matchedPictures, excelfile[pictureMatchNum1,1:3], make.row.names = FALSE)
matchedPictures[rowNumber1,"beforeName"] <<- "xxx"
matchedPictures[rowNumber1,"afterName"] <<- "yyy"
matchedPictures[rowNumber2,1:3] <<- rbind(matchedPictures, excelfile[pictureMatchNum2,1:3], make.row.names = FALSE)
matchedPictures[rowNumber2,"beforeName"] <<- "uuu"
matchedPictures[rowNumber2,"afterName"] <<- "www"
编辑结束
此外,我还看到许多地方的建议,不是使用空数据帧,而是应该有向量并将数据附加到向量,然后将它们组合成数据帧。当我有这么多列并且需要有14个单独的向量并分别复制它们时,这个建议是否有效?
我可以做些什么来完成这项工作?
答案 0 :(得分:0)
你可以
excelfile
的行索引beforeName
和afterName
示例:
excelfile <- data.frame(x = letters, y = words[length(letters)],
z = fruit[length(letters)])
## Vector of patterns:
patternVec <- c("A", "D", "M")
## Look for appropriate rows in file 'excelfile':
indexVec <- vapply(patternVec,
function(myPattern) which(str_detect(myPattern,
regex(excelfile$x, ignore_case = TRUE))), integer(1))
## Extract these rows:
matchedPictures <- excelfile[indexVec,]
## Somehow generate the data for columns 'beforeName' and 'afterName':
## I do not know how this information is generated so I just insert
## some dummy code here:
beforeNameVec <- c("xxx", "uuu", "mmm")
afterNameVec <- c("yyy", "www", "nnn")
## Then assign these variables:
matchedPictures$beforeName <- beforeNameVec
matchedPictures$afterName <- afterNameVec
matchedPictures
# x y z beforeName afterName
# a air dragonfruit xxx yyy
# d air dragonfruit uuu www
# m air dragonfruit mmm nnn
答案 1 :(得分:0)
使用dplyr
可以使这更简单library(dplyr)
library(stringr)
excelfile <- data.frame(x = letters, y = words[length(letters)], z= fruit[length(letters)],
stringsAsFactors = FALSE ) #add stringsAsFactors to have character columns
pictureMatch <- excelfile %>%
#create a match column
mutate(match = ifelse(str_detect(x,"a") | str_detect(x,'d'),1,0)) %>%
#filter to only the columns that match your condition
filter(match ==1)
pictureMatch <- pictureMatch[['x']] #convert to a vector
matchedPictures <- excelfile %>%
filter(x %in% pictureMatch) %>% #grab the rows that match your condition
mutate(beforeName = c('xxx','uuu'), #add your names
afterName = c('yyy','www'))