Question

I feel there must be a simple way to collapse the input table into the desired output table, but I'm drawing a blank.

library(tidyverse)
input <- tribble(
  ~name, ~value,
  "animal", "pig",
  "animal", "dog",
  "animal", "cat",
  "plant", "tree",
  "plant", "bush",
  "plant", "flower"
)

output <- tribble(
  ~animal, ~plant,
  "pig", "tree",
  "dog", "bush",
  "cat", "flower"
)

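In input, col1 (name) holds the variable label for each value in col2 (value). In output, the table is reshaped so that the values of input$value appear in columns named after the corresponding elements of input$name.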

Answer 1

We can use unstack from base R (no packages needed)

unstack(input, value ~ name)
#   animal  plant
#1    pig   tree
#2    dog   bush
#3    cat flower
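
As a rough illustration of what the unstack call does here, the sketch below splits value by name and binds the pieces as columns. It rebuilds input as a plain data frame, and the names pieces and wide are chosen only for this example. It assumes every group has the same number of values; with unequal group sizes the pieces cannot be bound into a data frame this way.

```r
# Rebuild the question's input as a plain data frame (no packages needed)
input <- data.frame(
  name  = rep(c("animal", "plant"), each = 3),
  value = c("pig", "dog", "cat", "tree", "bush", "flower"),
  stringsAsFactors = FALSE
)

# Split the values into one vector per name: a named list with
# elements "animal" and "plant"
pieces <- split(input$value, input$name)

# Bind the equal-length vectors side by side as columns
wide <- as.data.frame(pieces, stringsAsFactors = FALSE)
wide
```

The order of values within each column follows their order in input, which is why no explicit row index is needed in this approach.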

Or use dcast from data.table

library(data.table)
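# input is a tibble, so convert it first: data.table's dcast expects a data.table
dcast(as.data.table(input), rowid(name) ~ name, value.var = "value")[, -1]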
#    animal  plant
#1    pig   tree
#2    dog   bush
#3    cat flower

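Or use dplyr

library(dplyr)
# `.keep` replaces the `keep` argument, which is deprecated in dplyr >= 1.0.0
input %>% 
    group_split(name, .keep = FALSE) %>% 
    bind_cols()
# Note: every piece has a single column called `value`, so bind_cols()
# auto-renames the columns; rename them if animal/plant are needed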

Or use split

split(input$value, input$name) %>% 
         bind_cols

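Or another option with spread

library(tidyr)
# rowid() comes from data.table, loaded above
input %>%
   mutate(rn = rowid(name)) %>% 
   spread(name, value) %>%
   select(-rn)   # drop the helper index column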

Answer 2

We can create a row_number() within each name and then spread

library(dplyr)
library(tidyr)

input %>%
  group_by(name) %>%
  mutate(row = row_number()) %>%
  spread(name, value) %>%
  select(-row)

#   animal plant 
#  <chr>  <chr> 
#1  pig    tree  
#2  dog    bush  
#3  cat    flower
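
In current tidyr, spread is superseded by pivot_wider, so the same idea can also be written as below. This is a minimal sketch rather than part of the original answer: it recreates input so that it runs on its own, and the name result is used only for this example.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
input <- tibble::tribble(
  ~name, ~value,
  "animal", "pig",
  "animal", "dog",
  "animal", "cat",
  "plant", "tree",
  "plant", "bush",
  "plant", "flower"
)

result <- input %>%
  group_by(name) %>%
  mutate(row = row_number()) %>%   # position of each value within its name group
  ungroup() %>%
  pivot_wider(names_from = name, values_from = value) %>%
  select(-row)                     # drop the helper index column

result
```

The helper column row is what lets pivot_wider pair the first animal with the first plant, the second with the second, and so on. Without it, each name would have several values for a single output row.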

Collapse a 2-column data frame where col1 contains names and col2 contains values
