How do I split a list-valued column into multiple columns?

Asked: 2018-04-26 16:09:22

Tags: r list split dplyr

I have the following situation, where the column power_dbm0 is a list-column. Every element is a numeric vector of length 11.

# A tibble: 10 x 2
   real_pat power_dbm0
   <chr>    <list>    
 1 am       <dbl [11]>
 2 fax      <dbl [11]>
 3 fp       <dbl [11]>
 4 fpw      <dbl [11]>

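I would like to know how to split these values so that each one becomes its own column. Preferably, I would like a dplyr-like solution. I have tried some solutions using the unnest and separate functions from tidyr, but without success.

Thanks in advance,

The data follows: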

structure(list(real_pat = c("am", "fax", "fp", "fpw"), power_dbm0 = list(
    structure(c(0.0142857142857143, 0.0742857142857143, 0.111428571428571, 
    0.138571428571429, 0.208571428571429, 0.278571428571429, 
    0.368571428571429, 0.508571428571429, 0.648571428571429, 
    0.771428571428571, 0.871428571428571), .Names = c("0%", "10%", 
    "20%", "30%", "40%", "50%", "60%", "70%", "80%", "90%", "100%"
    )), structure(c(0.342857142857143, 0.342857142857143, 0.342857142857143, 
    0.342857142857143, 0.342857142857143, 0.342857142857143, 
    0.342857142857143, 0.342857142857143, 0.342857142857143, 
    0.342857142857143, 0.342857142857143), .Names = c("0%", "10%", 
    "20%", "30%", "40%", "50%", "60%", "70%", "80%", "90%", "100%"
    )), structure(c(0.0142857142857143, 0.622857142857143, 0.808571428571429, 
    0.851428571428571, 0.857142857142857, 0.871428571428571, 
    0.874285714285714, 0.885714285714286, 0.894285714285714, 
    0.911428571428571, 0.914285714285714), .Names = c("0%", "10%", 
    "20%", "30%", "40%", "50%", "60%", "70%", "80%", "90%", "100%"
    )), structure(c(0.514285714285714, 0.514285714285714, 0.514285714285714, 
    0.514285714285714, 0.514285714285714, 0.514285714285714, 
    0.514285714285714, 0.514285714285714, 0.514285714285714, 
    0.514285714285714, 0.514285714285714), .Names = c("0%", "10%", 
    "20%", "30%", "40%", "50%", "60%", "70%", "80%", "90%", "100%"
    )))), .Names = c("real_pat", "power_dbm0"), row.names = c(NA, 
-4L), class = c("tbl_df", "tbl", "data.frame"))

3 Answers:

Answer 0: (score: 6)

1) Here is a one-line base R solution:

with(dd, do.call("rbind", setNames(power_dbm0, real_pat)))

giving:

            0%        10%       20%       30%       40%       50%       60%
am  0.01428571 0.07428571 0.1114286 0.1385714 0.2085714 0.2785714 0.3685714
fax 0.34285714 0.34285714 0.3428571 0.3428571 0.3428571 0.3428571 0.3428571
fp  0.01428571 0.62285714 0.8085714 0.8514286 0.8571429 0.8714286 0.8742857
fpw 0.51428571 0.51428571 0.5142857 0.5142857 0.5142857 0.5142857 0.5142857
          70%       80%       90%      100%
am  0.5085714 0.6485714 0.7714286 0.8714286
fax 0.3428571 0.3428571 0.3428571 0.3428571
fp  0.8857143 0.8942857 0.9114286 0.9142857
fpw 0.5142857 0.5142857 0.5142857 0.5142857
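To see why this works, here is a minimal sketch on toy data. The values and the three-element vectors are hypothetical stand-ins for the question's data; only the shape (a character column plus a list-column of named numeric vectors) is taken from the post.

```r
# Toy stand-in for the question's data (hypothetical values).
dd <- data.frame(real_pat = c("am", "fax"), stringsAsFactors = FALSE)
dd$power_dbm0 <- list(c(`0%` = 0.1, `50%` = 0.2, `100%` = 0.3),
                      c(`0%` = 0.4, `50%` = 0.5, `100%` = 0.6))

# setNames() labels each vector with its real_pat value;
# do.call("rbind", ...) then stacks the vectors as the rows of a matrix.
m <- with(dd, do.call("rbind", setNames(power_dbm0, real_pat)))

dim(m)       # rows = list elements, columns = vector length
rownames(m)  # taken from real_pat
colnames(m)  # taken from the names of the vectors
```

Note that the result is a numeric matrix rather than a data frame, which is why real_pat ends up in the row names.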

2) Or, with real_pat as a column rather than as row names:

with(dd, data.frame(real_pat, do.call("rbind", power_dbm0), check.names = FALSE))

giving:

  real_pat         0%        10%       20%       30%       40%       50%
1       am 0.01428571 0.07428571 0.1114286 0.1385714 0.2085714 0.2785714
2      fax 0.34285714 0.34285714 0.3428571 0.3428571 0.3428571 0.3428571
3       fp 0.01428571 0.62285714 0.8085714 0.8514286 0.8571429 0.8714286
4      fpw 0.51428571 0.51428571 0.5142857 0.5142857 0.5142857 0.5142857
        60%       70%       80%       90%      100%
1 0.3685714 0.5085714 0.6485714 0.7714286 0.8714286
2 0.3428571 0.3428571 0.3428571 0.3428571 0.3428571
3 0.8742857 0.8857143 0.8942857 0.9114286 0.9142857
4 0.5142857 0.5142857 0.5142857 0.5142857 0.5142857

3) Using dplyr we could write this:

library(dplyr)
dd %>% { bind_cols(select(., real_pat), bind_rows(!!!.$power_dbm0)) }

giving:

# A tibble: 4 x 12
  real_pat   `0%`  `10%` `20%` `30%` `40%` `50%` `60%` `70%` `80%` `90%` `100%`
  <chr>     <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
1 am       0.0143 0.0743 0.111 0.139 0.209 0.279 0.369 0.509 0.649 0.771  0.871
2 fax      0.343  0.343  0.343 0.343 0.343 0.343 0.343 0.343 0.343 0.343  0.343
3 fp       0.0143 0.623  0.809 0.851 0.857 0.871 0.874 0.886 0.894 0.911  0.914
4 fpw      0.514  0.514  0.514 0.514 0.514 0.514 0.514 0.514 0.514 0.514  0.514

3a) Or, using the .id= argument of bind_rows together with the magrittr %$% operator:

library(dplyr)
library(magrittr)

dd %$%
   setNames(power_dbm0, real_pat) %$%
   bind_rows(!!!., .id = "real_pat")

giving:

# A tibble: 4 x 12
  real_pat   `0%`  `10%` `20%` `30%` `40%` `50%` `60%` `70%` `80%` `90%` `100%`
  <chr>     <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
1 am       0.0143 0.0743 0.111 0.139 0.209 0.279 0.369 0.509 0.649 0.771  0.871
2 fax      0.343  0.343  0.343 0.343 0.343 0.343 0.343 0.343 0.343 0.343  0.343
3 fp       0.0143 0.623  0.809 0.851 0.857 0.871 0.874 0.886 0.894 0.911  0.914
4 fpw      0.514  0.514  0.514 0.514 0.514 0.514 0.514 0.514 0.514 0.514  0.514

3b) Or, without %$%:

library(dplyr)

dd %>%
   { setNames(.$power_dbm0, .$real_pat) } %>%
   { bind_rows(!!!., .id = "real_pat") }

Answer 1: (score: 3)

1) We can transpose the 'power_dbm0' column, unlist the nested list, and then bind the result with the first column

library(tidyverse)
df1 %>%
   pull(power_dbm0) %>%
   transpose %>%
   map_df(unlist) %>% 
   bind_cols(df1[1], .)
# A tibble: 4 x 12
#   real_pat   `0%`  `10%` `20%` `30%` `40%` `50%` `60%` `70%` `80%` `90%` `100%`
#  <chr>     <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
#1 am       0.0143 0.0743 0.111 0.139 0.209 0.279 0.369 0.509 0.649 0.771  0.871
#2 fax      0.343  0.343  0.343 0.343 0.343 0.343 0.343 0.343 0.343 0.343  0.343
#3 fp       0.0143 0.623  0.809 0.851 0.857 0.871 0.874 0.886 0.894 0.911  0.914
#4 fpw      0.514  0.514  0.514 0.514 0.514 0.514 0.514 0.514 0.514 0.514  0.514

2) Another option is to melt and then spread. Here we also include unnest, since the OP mentioned it in the post

library(tidyverse)
library(reshape2)
df1 %>% 
    mutate(power_dbm0 = map(power_dbm0, ~melt(.x) %>% 
                          rownames_to_column('rn') %>%
                          mutate(rn = factor(rn, levels = rn)))) %>% 
    unnest %>% 
    spread(rn, value)
# A tibble: 4 x 12
#  real_pat   `0%`  `10%` `20%` `30%` `40%` `50%` `60%` `70%` `80%` `90%` `100%`
#  <chr>     <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
#1 am       0.0143 0.0743 0.111 0.139 0.209 0.279 0.369 0.509 0.649 0.771  0.871
#2 fax      0.343  0.343  0.343 0.343 0.343 0.343 0.343 0.343 0.343 0.343  0.343
#3 fp       0.0143 0.623  0.809 0.851 0.857 0.871 0.874 0.886 0.894 0.911  0.914
#4 fpw      0.514  0.514  0.514 0.514 0.514 0.514 0.514 0.514 0.514 0.514  0.514
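As a side note, newer tidyr releases offer a more direct route for this shape of data. The sketch below assumes tidyr 1.0.0 or later, where unnest_wider() is available; it is not part of the original answer, and the toy values are hypothetical.

```r
library(tibble)
library(tidyr)

# Toy stand-in for the question's data (hypothetical values).
df1 <- tibble(
  real_pat   = c("am", "fax"),
  power_dbm0 = list(c(`0%` = 0.1, `50%` = 0.2, `100%` = 0.3),
                    c(`0%` = 0.4, `50%` = 0.5, `100%` = 0.6))
)

# Each named element of the list-column becomes its own column.
wide <- unnest_wider(df1, power_dbm0)
names(wide)
```

Because the column names come straight from the names of each vector, no melt/spread round trip is needed in this case.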

3) Or with pmap and spread

df1 %>%
     pmap_df(~ tibble(real_pat = ..1, nm = names(..2), val = ..2))  %>%
     spread(nm, val)
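One thing to watch with spread: when the key column is a plain character vector, the new columns come out in sorted order, so a name like `100%` can land before `50%`. A possible variation, sketched here on hypothetical toy data, turns nm into a factor whose levels follow the original order (the same trick option 2 uses).

```r
library(dplyr)
library(purrr)
library(tidyr)

# Toy stand-in for the question's data (hypothetical values).
df1 <- tibble(
  real_pat   = c("am", "fax"),
  power_dbm0 = list(c(`0%` = 0.1, `50%` = 0.2, `100%` = 0.3),
                    c(`0%` = 0.4, `50%` = 0.5, `100%` = 0.6))
)

out <- df1 %>%
  # ..1 is the real_pat value, ..2 the named numeric vector of that row
  pmap_df(~ tibble(real_pat = ..1,
                   nm  = factor(names(..2), levels = names(..2)),
                   val = unname(..2))) %>%
  # with a factor key, spread orders the new columns by factor level
  spread(nm, val)

names(out)
```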

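Note: all of the solutions above use dplyr and related packages from the tidyverse.

4) Or we can unlist 'power_dbm0', create a matrix (since all the elements are of equal length), and then bind it with the first column (base R); change the column names if needed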
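The code for this option did not survive in this copy of the answer. A minimal base R sketch of what the description suggests (unlist, reshape into a matrix row by row, bind to the first column) might look like the following; the toy df1, its values, and the exact calls are assumptions rather than the answerer's original code.

```r
# Toy stand-in for the question's data (hypothetical values).
df1 <- data.frame(real_pat = c("am", "fax"), stringsAsFactors = FALSE)
df1$power_dbm0 <- list(c(`0%` = 0.1, `50%` = 0.2, `100%` = 0.3),
                       c(`0%` = 0.4, `50%` = 0.5, `100%` = 0.6))

# Flatten the list-column and refill it row by row; this relies on
# every element having the same length.
m <- matrix(unlist(df1$power_dbm0), nrow = nrow(df1), byrow = TRUE)

# matrix() drops the element names, so restore them as column names.
colnames(m) <- names(df1$power_dbm0[[1]])

out <- cbind(df1[1], m)
names(out)
```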

Answer 2: (score: 2)

One option could be:

cbind(df[1],t(sapply(df$power_dbm0,function(x)x)))

# real_pat         0%        10%       20%       30%       40%       50%       60%       70%       80%       90%      100%
# 1       am 0.01428571 0.07428571 0.1114286 0.1385714 0.2085714 0.2785714 0.3685714 0.5085714 0.6485714 0.7714286 0.8714286
# 2      fax 0.34285714 0.34285714 0.3428571 0.3428571 0.3428571 0.3428571 0.3428571 0.3428571 0.3428571 0.3428571 0.3428571
# 3       fp 0.01428571 0.62285714 0.8085714 0.8514286 0.8571429 0.8714286 0.8742857 0.8857143 0.8942857 0.9114286 0.9142857
# 4      fpw 0.51428571 0.51428571 0.5142857 0.5142857 0.5142857 0.5142857 0.5142857 0.5142857 0.5142857 0.5142857 0.5142857

An additional option using simplify2array (based on feedback from @G.Grothendieck):

cbind(df[1],t(simplify2array(df$power_dbm0)))
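The t() in both variants is needed because sapply and simplify2array place each list element in a column, not a row. A small sketch on hypothetical toy data shows the shapes involved:

```r
# Toy list of equal-length named vectors (hypothetical values).
x <- list(c(`0%` = 0.1, `50%` = 0.2, `100%` = 0.3),
          c(`0%` = 0.4, `50%` = 0.5, `100%` = 0.6))

cols <- simplify2array(x)  # 3 x 2: one column per list element
rows <- t(cols)            # 2 x 3: one row per list element

dim(cols)
dim(rows)
colnames(rows)
```

After transposing, the element names become column names, which is the layout cbind needs alongside the first column.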