
Question

I have a data frame that looks like this:

set <- data.frame("id"=c("one", "two","three"), "line_number"=c("1", "2", "3"), 
              "content_type"=c("paragraph", "paragraph","paragraph"), 
              "text"=c("this is a sample","first batch is:", "second batch is:"), 
              "section"=c("introduction","content","summary"))

So it looks like:

  set(view)
  id       line_number      content_type     text                   section
  one           1            paragraph       this is a sample     introduction
  two           2            paragraph       first batch is:        content
  three         3            paragraph       second batch is:       summary

I want to add a row at the top of this data frame that has content in only one column, so that it looks like:

  set(view)
  id       line_number      content_type     text                   section
                                             Sample Report
  one           1            paragraph       this is a sample     introduction
  two           2            paragraph       first batch is:        content
  three         3            paragraph       second batch is:       summary

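R can fill in NA automatically wherever needed.

I tried using rbind, but it won't let me because the number of columns doesn't match. Is there another way? Maybe a loop?

Thanks! I really appreciate it.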

Answer 1

This should do it:

set <- data.frame("id"=c("one", "two","three"), "line_number"=c("1", "2", "3"), 
                  "content_type"=c("paragraph", "paragraph","paragraph"), 
                  "text"=c("this is a sample","first batch is:", "second batch is:"), 
                  "section"=c("introduction","content","summary"), stringsAsFactors = FALSE)
x <- data.frame(text = "Sample Report", stringsAsFactors = FALSE)
dplyr::bind_rows(set,x )
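
Note that bind_rows(set, x) appends the new row at the bottom. To get it on top, as the question asks, the order of the arguments can be swapped. The following is a minimal sketch, assuming dplyr is installed; the name `result` is only illustrative, and the final `[names(set)]` restores the original column order.

```r
library(dplyr)

set <- data.frame(id = c("one", "two", "three"),
                  line_number = c("1", "2", "3"),
                  content_type = c("paragraph", "paragraph", "paragraph"),
                  text = c("this is a sample", "first batch is:", "second batch is:"),
                  section = c("introduction", "content", "summary"),
                  stringsAsFactors = FALSE)
x <- data.frame(text = "Sample Report", stringsAsFactors = FALSE)

# Put the one-column data frame first so its row ends up on top;
# columns missing from x are filled with NA by bind_rows()
result <- bind_rows(x, set)

# bind_rows() takes the column order from the first input,
# so reorder the columns to match the original data frame
result <- result[names(set)]
print(result)
```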

Answer 2

base R

set2[setdiff(names(set),names(set2))] <- NA
rbind(set2,set)
#               text    id line_number content_type      section
# 1    Sample Report  <NA>        <NA>         <NA>         <NA>
# 2 this is a sample   one           1    paragraph introduction
# 3  first batch is:   two           2    paragraph      content
# 4 second batch is: three           3    paragraph      summary

Or, as a one-liner that does not modify set2:

rbind('[<-'(set2,setdiff(names(set),names(set2)),value= NA),set)
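
The padding idea above can be wrapped in a small reusable function. This is only a sketch under some assumptions: `rbind_fill` is a hypothetical name (it is not a base R function), both inputs are assumed to be plain data frames with at least one row, and the result follows the column order of the second argument.

```r
# Hypothetical helper (not part of base R): row-bind two data frames,
# filling columns that are missing on either side with NA.
rbind_fill <- function(top, bottom) {
  all_cols <- union(names(bottom), names(top))
  # Add each missing column as NA (recycled to the number of rows)
  for (col in setdiff(all_cols, names(top))) top[[col]] <- NA
  for (col in setdiff(all_cols, names(bottom))) bottom[[col]] <- NA
  # Align the column order of both pieces before binding
  rbind(top[all_cols], bottom[all_cols])
}

set <- data.frame(id = c("one", "two", "three"),
                  line_number = c("1", "2", "3"),
                  content_type = c("paragraph", "paragraph", "paragraph"),
                  text = c("this is a sample", "first batch is:", "second batch is:"),
                  section = c("introduction", "content", "summary"),
                  stringsAsFactors = FALSE)
set2 <- data.frame(text = "Sample Report", stringsAsFactors = FALSE)

result <- rbind_fill(set2, set)
print(result)
```

Because the function works on copies of its arguments, neither set nor set2 is modified.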

dplyr

dplyr::bind_rows(set2,set)
#               text    id line_number content_type      section
# 1    Sample Report  <NA>        <NA>         <NA>         <NA>
# 2 this is a sample   one           1    paragraph introduction
# 3  first batch is:   two           2    paragraph      content
# 4 second batch is: three           3    paragraph      summary

data.table

data.table::rbindlist(list(set2,set),fill=TRUE)
#                text    id line_number content_type      section
# 1:    Sample Report    NA          NA           NA           NA
# 2: this is a sample   one           1    paragraph introduction
# 3:  first batch is:   two           2    paragraph      content
# 4: second batch is: three           3    paragraph      summary

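Note on column order

The column order is taken from the first data.frame, which is why the text column moved to the left. Append [names(set)] to any of the answers to get the original order back.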
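
As an illustration of the reordering step, here is a short sketch applied to the base R result. The variable name `combined` is made up for the example; everything else comes from the answer.

```r
set <- data.frame(id = c("one", "two", "three"),
                  line_number = c("1", "2", "3"),
                  content_type = c("paragraph", "paragraph", "paragraph"),
                  text = c("this is a sample", "first batch is:", "second batch is:"),
                  section = c("introduction", "content", "summary"),
                  stringsAsFactors = FALSE)
set2 <- data.frame(text = "Sample Report", stringsAsFactors = FALSE)

# Pad set2 with the columns it lacks, then bind it on top of set
set2[setdiff(names(set), names(set2))] <- NA
combined <- rbind(set2, set)

# At this point "text" is the first column, because set2 came first
order_before <- names(combined)

# Index by the original column names to restore the original order
combined <- combined[names(set)]
print(combined)
```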

Data

set <- data.frame("id"=c("one", "two","three"), "line_number"=c("1", "2", "3"),
                  "content_type"=c("paragraph", "paragraph","paragraph"),
                  "text"=c("this is a sample","first batch is:", "second batch is:"),
                  "section"=c("introduction","content","summary"))
set2 <- data.frame(text ="Sample Report")

Answer 3

Some other alternatives to those already given:

set <- data.frame("id"=c("one", "two","three"), "line_number"=c("1", "2", "3"), 
              "content_type"=c("paragraph", "paragraph","paragraph"), 
              "text"=c("this is a sample","first batch is:", "second batch is:"), 
              "section"=c("introduction","content","summary"), stringsAsFactors = FALSE)
x <- data.frame(text = "Sample Report", stringsAsFactors = FALSE)

Using dplyr

library(dplyr)
d1 <- full_join(set,x)
d1 <- d1 %>% arrange(!is.na(line_number),line_number)

The second step ensures that "Sample Report" ends up in the first row.

Using base R

d2 <- merge(set, x, all = TRUE)
d2 <- d2[order(d2$line_number, na.last = FALSE), ]

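Again, the second line of code above ensures that "Sample Report" ends up in the first row. In both cases the merge variable is not stated explicitly (by default R uses the variables common to both datasets, which here is the "text" variable).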
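
To make the implicit join key visible, the base R version can be written with an explicit `by` argument. This is a sketch of the same full outer join; the variable `common` is only there to show which column merge() would pick by default.

```r
set <- data.frame(id = c("one", "two", "three"),
                  line_number = c("1", "2", "3"),
                  content_type = c("paragraph", "paragraph", "paragraph"),
                  text = c("this is a sample", "first batch is:", "second batch is:"),
                  section = c("introduction", "content", "summary"),
                  stringsAsFactors = FALSE)
x <- data.frame(text = "Sample Report", stringsAsFactors = FALSE)

# Columns shared by both data frames; merge() joins on these by default
common <- intersect(names(set), names(x))

# Same full outer join as above, with the key stated explicitly
d2 <- merge(set, x, by = "text", all = TRUE)

# Rows with a missing line_number (the new row) are sorted first
d2 <- d2[order(d2$line_number, na.last = FALSE), ]
print(d2)
```

merge() places the key column first in its result, so `d2[names(set)]` can be used afterwards to restore the original column order. The dplyr counterpart of the explicit key is `full_join(set, x, by = "text")`.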

R: How do I add a row that has a different number of columns than the rest of the data frame?
