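Reshaping multiple rows into multiple columns in an R data frame, with the distinct values of the first column as column headers

Time: 2020-03-16 14:21:13

Tags: r dataframe transform

I am new to R. I would like to reshape multiple rows of a data frame into multiple columns, using the distinct values of the first column as the column headers.

For example: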

> dat <- read.table(text = "Company    Loc
> 100000012,104
> 100000012,105
> 100000012,107
> 100000012,102
> 100000012,166
> 100000012,126
> 100000012,169
> 100000012,42
> 100000012,43
> 100123545,50
> 100123600,21
> 100123600,10",
> header = TRUE)

should be converted into the following:

> 100000012,100123545,100123600 
> 104,50,21 
> 105,,10 
> 107,, 
> 102,, 
> 166,,
> 126,, 
> 169,, 
> 42,, 
> 43,,

Thanks a lot!

3 answers:

Answer 0 (score: 0)

Try this:

  1. Skip the header line, because it lacks the separator.
  2. Set the separator to ",".
  3. Name the columns manually.
dat <- read.table(text = "Company    Loc
100000012,104
100000012,105
100000012,107
100000012,102
100000012,166
100000012,126
100000012,169
100000012,42
100000012,43
100123545,50
100123600,21
100123600,10", skip = 1, sep = ",")  # skip the header line; fields are comma-separated
names(dat) <- c("Company", "Loc")    # name the columns manually
dat
#>      Company Loc
#> 1  100000012 104
#> 2  100000012 105
#> 3  100000012 107
#> 4  100000012 102
#> 5  100000012 166
#> 6  100000012 126
#> 7  100000012 169
#> 8  100000012  42
#> 9  100000012  43
#> 10 100123545  50
#> 11 100123600  21
#> 12 100123600  10

Created on 2020-03-16 by the reprex package (v0.3.0)
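
The three steps above can also be folded into a single call. The following is a minimal sketch, assuming the header sits on a line of its own and every later line holds one `Company,Loc` pair; `txt` is a hypothetical variable name used only for illustration.

```r
# The twelve Company,Loc pairs from the question, one pair per line
txt <- "Company    Loc
100000012,104
100000012,105
100000012,107
100000012,102
100000012,166
100000012,126
100000012,169
100000012,42
100000012,43
100123545,50
100123600,21
100123600,10"

# skip = 1 drops the header line, header = FALSE treats the rest as data,
# and col.names supplies the column names directly
dat <- read.csv(text = txt, header = FALSE, skip = 1,
                col.names = c("Company", "Loc"))
str(dat)
```

Passing `col.names` in the same call removes the separate `names(dat) <- ...` step, at the cost of a slightly longer call.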

Answer 1 (score: 0)

Here is a base R solution using lapply + split:

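# Split by Company and keep the second column (Loc): one vector per company
u <- lapply(split(dat, dat$Company), `[[`, 2)
# Pad every vector with NA up to the longest length, then bind them as columns
datout <- data.frame(t(do.call(rbind, lapply(u, `length<-`, max(lengths(u))))),
                     check.names = FALSE)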

which gives

> datout
   100000012  100123545  100123600
1        104         50         21
2        105         NA         10
3        107         NA         NA
4        102         NA         NA
5        166         NA         NA
6        126         NA         NA
7        169         NA         NA
8         42         NA         NA
9         43         NA         NA

Data

dat <- structure(list(Company = c(100000012L, 100000012L, 100000012L, 
100000012L, 100000012L, 100000012L, 100000012L, 100000012L, 100000012L, 
100123545L, 100123600L, 100123600L), Loc = c(104L, 105L, 107L, 
102L, 166L, 126L, 169L, 42L, 43L, 50L, 21L, 10L)), class = "data.frame", row.names = c(NA, 
-12L))
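
The padding step in this answer relies on `length<-`, the functional form of `length(x) <- n`: extending a vector fills the new positions with NA. A small sketch of that step in isolation, using a toy subset of the data (the names `u`, `n`, `padded` and `wide` are only illustrative):

```r
# Loc values grouped by company (toy subset of the data above)
u <- list(`100000012` = c(104L, 105L, 107L),
          `100123545` = 50L,
          `100123600` = c(21L, 10L))

# Length of the longest group
n <- max(lengths(u))

# Extend every vector to length n; the added positions become NA
padded <- lapply(u, `length<-`, n)

# Equal-length vectors can now be bound as columns;
# check.names = FALSE keeps the numeric company ids as column names
wide <- data.frame(do.call(cbind, padded), check.names = FALSE)
wide
```

Without the padding, `cbind` would recycle the shorter vectors instead of leaving the missing cells as NA.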

Answer 2 (score: 0)

Here is a tidyverse approach:

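library(dplyr)
library(tidyr)

dat %>%
  mutate(rn = row_number()) %>%             # unique row id so every value is kept
  pivot_wider(id_cols = rn, names_from = Company, values_from = Loc) %>%
  as.data.frame() %>%
  select(-rn) %>%
  mutate_all(~(.[order(is.na(.))])) %>%     # move non-NA values to the top of each column
  filter_all(any_vars(!is.na(.))) %>%       # drop rows that are entirely NA
  unite(result, everything(), sep = ',')    # paste each row into one comma-separated string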

Output

     result
1 104,50,21
2 105,NA,10
3 107,NA,NA
4 102,NA,NA
5 166,NA,NA
6 126,NA,NA
7 169,NA,NA
8  42,NA,NA
9  43,NA,NA
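
The output above prints missing cells as NA, while the question asks for empty fields (for example `105,,10`). A minimal base R sketch of one way to get that, assuming a wide data frame shaped like the ones the answers produce; `wide` and `csv_lines` are hypothetical names and the three rows are a toy subset of the data:

```r
# Hypothetical wide result: one column per company, padded with NA
wide <- data.frame(`100000012` = c(104L, 105L, 107L),
                   `100123545` = c(50L, NA, NA),
                   `100123600` = c(21L, 10L, NA),
                   check.names = FALSE)

# Write comma-separated text to the console; na = "" leaves missing cells empty.
# capture.output() collects the printed lines into a character vector.
csv_lines <- capture.output(
  write.table(wide, sep = ",", na = "", quote = FALSE, row.names = FALSE)
)
csv_lines
```

Giving `write.table()` a file path instead of capturing its output would write the same lines to disk.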