pivot_wider only a subset of levels/values from a column

Time: 2021-08-02 00:19:23

Tags: r dataframe dplyr reshape

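I want to reshape only a subset of the levels/values in a single column to wide format, while keeping the selected levels in the original column.

In this example data, the values 'rice' and 'beans' in the food column have no "type" attribute. I want to keep the original column "food" with its levels "rice" and "beans", while pivoting the other values to wide.

Data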

library(dplyr)
library(tidyr)

set.seed(1)
df <- tibble(index = sample(1:5, 10, replace = TRUE),
             food = c(rep('fruit', 4), rep('meat', 4), 'rice', 'beans'),
             type = c('apple', 'apple', 'banana', 'banana', 'steak', 'steak', 't-bone', 't-bone', NA, NA))

# A tibble: 10 x 3
   index food  type  
   <int> <chr> <chr> 
 1     1 fruit apple 
 2     4 fruit apple 
 3     1 fruit banana
 4     2 fruit banana
 5     5 meat  steak 
 6     3 meat  steak 
 7     2 meat  t-bone
 8     3 meat  t-bone
 9     3 rice  NA    
10     1 beans NA    

The desired output is:

output<-structure(list(index = c(1L, 1L, 4L, 2L, 5L, 3L, 3L), fruit = c("apple", 
"banana", "apple", "banana", NA, NA, NA), meat = c(NA, NA, NA, 
"t-bone", "steak", "steak", "t-bone"), food = c("beans", NA, 
NA, NA, NA, "rice", NA)), row.names = c(NA, -7L), class = c("tbl_df", 
"tbl", "data.frame"))

output
# A tibble: 7 x 4
  index fruit  meat   food 
  <int> <chr>  <chr>  <chr>
1     1 apple  NA     beans
2     1 banana NA     NA   
3     4 apple  NA     NA   
4     2 banana t-bone NA   
5     5 NA     steak  NA   
6     3 NA     steak  rice 
7     3 NA     t-bone NA  

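I can do this manually by moving the 'rice' and 'beans' values into the 'type' column and creating a corresponding 'food' level in the 'food' column. Besides being a laborious and unsystematic transformation, it gives me unexpected output with duplicated 'beans' and 'rice' values: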

df %>%
  mutate(type = coalesce(type, food),
         food = replace(food, type %in% c('rice', 'beans'), 'food')) %>%
  pivot_wider(id_cols = index, names_from = c(food), values_from = c(type)) %>%
  unnest
# A tibble: 7 x 4
  index fruit  meat   food 
  <int> <chr>  <chr>  <chr>
1     1 apple  NA     beans
2     1 banana NA     beans ###<-
3     4 apple  NA     NA   
4     2 banana t-bone NA   
5     5 NA     steak  NA   
6     3 NA     steak  rice 
7     3 NA     t-bone rice ###<-

I wonder whether there is a simpler, safer way to do this on the fly with pivot_wider.

2 answers:

Answer 0 (score: 2)

We could use coalesce together with replace:

library(dplyr)
library(tidyr)
library(data.table)
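df %>% 
    # move 'rice'/'beans' into type, then relabel their food value as 'food'
    mutate(type = coalesce(type, food), 
           food = replace(food, food == type, 'food'),
           # rowid() comes from data.table: running id within each index/food pair
           rn = rowid(index, food)) %>%
    pivot_wider(names_from = food, values_from = type) %>% 
    select(-rn)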
# A tibble: 7 x 4
  index fruit  meat   food 
  <int> <chr>  <chr>  <chr>
1     1 apple  <NA>   beans
2     4 apple  <NA>   <NA> 
3     1 banana <NA>   <NA> 
4     2 banana t-bone <NA> 
5     5 <NA>   steak  <NA> 
6     3 <NA>   steak  rice 
7     3 <NA>   t-bone <NA> 
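
The `rn` column appears to be what prevents the duplicated 'beans' and 'rice' values seen in the question. Without a row id, `pivot_wider` collapses repeated index/food pairs into list-columns, and unnesting several list-columns together recycles any length-one element to match the longer ones. A minimal sketch of that recycling, using a hand-built tibble (`nested` and `recycled` are illustrative names, not part of the answer) as a stand-in for the index 1 row:

```r
library(tibble)
library(tidyr)

# Hand-built stand-in for what pivot_wider() yields for index 1
# when there is no row id: one list-column cell per food group.
nested <- tibble(
  index = 1L,
  fruit = list(c("apple", "banana")),
  food  = list("beans")
)

# Unnesting both columns together recycles the length-one "beans"
# so that it lines up with the two fruit values.
recycled <- unnest(nested, cols = c(fruit, food))
recycled
```

Both resulting rows carry "beans", which matches the flagged rows in the question. Adding a per-group row id makes every index/row pair unique, so no list-columns are created and nothing needs to be unnested.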

Answer 1 (score: 1)

You can use:

library(dplyr)
library(tidyr)

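df %>%
  # move 'rice'/'beans' into type, then relabel their food value as 'food'
  mutate(type = if_else(food %in% c('rice', 'beans'), food, type), 
         food = replace(food, food %in% c('rice', 'beans'), 'food')) %>%
  # number repeated index/food pairs so each pivoted cell holds one value
  group_by(index, food) %>%
  mutate(row  = row_number()) %>%
  ungroup() %>%
  pivot_wider(names_from = food, values_from = type) %>%
  select(-row)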
  
#  index fruit  meat   food 
#  <int> <chr>  <chr>  <chr>
#1     1 apple  NA     beans
#2     4 apple  NA     NA   
#3     1 banana NA     NA   
#4     2 banana t-bone NA   
#5     5 NA     steak  NA   
#6     3 NA     steak  rice 
#7     3 NA     t-bone NA
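
If hard-coding 'rice' and 'beans' is undesirable, the same idea can be wrapped in a small function that infers the levels to keep from the rows where `type` is missing. This is only a sketch under that assumption; `pivot_partial` and its `keep_name` argument are hypothetical names introduced here, and the index values are typed in from the printed table rather than regenerated with `sample()`.

```r
library(dplyr)
library(tidyr)

# Example data, with index copied from the printed tibble in the question
df <- tibble(
  index = c(1L, 4L, 1L, 2L, 5L, 3L, 2L, 3L, 3L, 1L),
  food  = c(rep("fruit", 4), rep("meat", 4), "rice", "beans"),
  type  = c("apple", "apple", "banana", "banana",
            "steak", "steak", "t-bone", "t-bone", NA, NA)
)

# Hypothetical helper: rows with a missing `type` keep their food value,
# which ends up in a column called `keep_name`; all other rows are pivoted.
pivot_partial <- function(data, keep_name = "food") {
  data %>%
    mutate(
      keep = is.na(type),                     # levels without a type
      type = if_else(keep, food, type),       # move their label into type
      food = if_else(keep, keep_name, food)   # and relabel their food value
    ) %>%
    select(-keep) %>%
    group_by(index, food) %>%
    mutate(row = row_number()) %>%            # make index/row pairs unique
    ungroup() %>%
    pivot_wider(names_from = food, values_from = type) %>%
    select(-row)
}

res <- pivot_partial(df)
res
```

The helper assumes the columns are named `index`, `food` and `type` as in the question; other data would need those names adjusted.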