Suggestions for reshaping a table in R

Time: 2019-12-18 12:34:37

Tags: r

I have this table:

sample  tomato  zucchini  broccoli
a       x                 x
b               x
c       x                 x

I want this:

a   tomato
a   broccoli
b   zucchini
c   tomato
c   broccoli

Do you have any suggestions for doing this in R? Thanks in advance.

2 answers:

Answer 0 (score: 2)

I would use pivot_longer() from the tidyr package (or, perhaps more easily, load the whole tidyverse).

Load some libraries:

library(tidyverse)

Your data:

my_df <- tribble(
  ~sample,  ~tomato,  ~zucchini,    ~broccoli,
  "a",   "x",    NA,   "x",
  "b",    NA,   "x",   NA,
  "c",   "x",    NA,   "x",
)

Code (updated following a comment from @Ronak Shah):

my_df <- my_df %>% 
  # make table long format
  pivot_longer(cols = -sample,
               names_to = "vegy",
               values_to = "value",
               values_drop_na = TRUE) %>% 
  # get rid of value column
  select(-value)


my_df
# A tibble: 5 x 2
  sample vegy    
  <chr>  <chr>   
1 a      tomato  
2 a      broccoli
3 b      zucchini
4 c      tomato  
5 c      broccoli
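
Note that values_drop_na = TRUE only helps when the empty cells are real NA values. The question does not say how the table is stored, so the following is a minimal sketch under the assumption that blanks were read in as empty strings; the names raw_df and long_df are hypothetical. Filtering on the marker value handles both "" and NA:

```r
library(dplyr)
library(tidyr)

# Hypothetical input: empty cells stored as "" rather than NA (an assumption)
raw_df <- tibble(
  sample   = c("a", "b", "c"),
  tomato   = c("x", "",  "x"),
  zucchini = c("",  "x", ""),
  broccoli = c("x", "",  "x")
)

long_df <- raw_df %>%
  # one row per sample/vegetable pair
  pivot_longer(cols = -sample,
               names_to = "vegy",
               values_to = "value") %>%
  # keep only the marked cells; filter() drops both "" and NA here
  filter(value == "x") %>%
  # the marker column is no longer needed
  select(-value)

long_df
```

This should give the same five sample/vegy rows as above, whichever way the blanks are encoded.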

Answer 1 (score: 0)

Here is a base R solution, in which apply() and rep() are the key pieces:

r <- apply(df, 1, function(v) names(v[-1])[which(v[-1] =="x")])
dfout <- data.frame(sample = rep(df$sample,lengths(r)),veg = unlist(r))

such that

> dfout
  sample      veg
1      a   tomato
2      a broccoli
3      b zucchini
4      c   tomato
5      c broccoli

Data

df <- structure(list(sample = structure(1:3, .Label = c("a", "b", "c"
), class = "factor"), tomato = structure(c(1L, NA, 1L), .Label = "x", class = "factor"), 
    zucchini = structure(c(NA, 1L, NA), .Label = "x", class = "factor"), 
    broccoli = structure(c(1L, NA, 1L), .Label = "x", class = "factor")), class = "data.frame", row.names = c(NA, 
-3L))

> df
  sample tomato zucchini broccoli
1      a      x     <NA>        x
2      b   <NA>        x     <NA>
3      c      x     <NA>        x
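
One caveat with the apply() approach: apply() simplifies its result to a vector or matrix when every row yields the same number of matches, in which case lengths(r) no longer lines up with df$sample. As a hedged alternative (a sketch, not part of the original answer), which() with arr.ind = TRUE avoids that simplification. The names veg_df, m, hits and dfout2 are hypothetical:

```r
# Same data as above, built with character columns for brevity
veg_df <- data.frame(
  sample   = c("a", "b", "c"),
  tomato   = c("x", NA, "x"),
  zucchini = c(NA, "x", NA),
  broccoli = c("x", NA, "x"),
  stringsAsFactors = FALSE
)

# Character matrix of the vegetable columns only
m <- as.matrix(veg_df[-1])

# Row/column positions of every "x"; which() treats NA comparisons as FALSE
hits <- which(m == "x", arr.ind = TRUE)

# which() reports positions column by column, so reorder by row, then column
hits <- hits[order(hits[, "row"], hits[, "col"]), , drop = FALSE]

dfout2 <- data.frame(
  sample = veg_df$sample[hits[, "row"]],
  veg    = colnames(m)[hits[, "col"]],
  stringsAsFactors = FALSE
)

dfout2
```

This variant always works on a matrix of positions, so it behaves the same whether rows have equal or differing numbers of marked cells.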