I have the following data frame:
df <- tibble::tribble(
  ~Sample_name, ~CRT, ~SR, ~`Bcells,DendriticCells,Macrophage`,
  "S1", 0.079, 0.592, "0.077,0.483,0.555",
  "S2", 0.082, 0.549, "0.075,0.268,0.120"
)
df
#> # A tibble: 2 × 4
#>   Sample_name   CRT    SR `Bcells,DendriticCells,Macrophage`
#>   <chr>       <dbl> <dbl> <chr>
#> 1 S1          0.079 0.592 0.077,0.483,0.555
#> 2 S2          0.082 0.549 0.075,0.268,0.120
Note the last column: both its name and its values are comma-separated. How can I convert df into this tidy form:
Sample_name   CRT    SR  Score  Celltype
S1          0.079 0.592  0.077  Bcells
S1          0.079 0.592  0.483  DendriticCells
S1          0.079 0.592  0.555  Macrophage
S2          0.082 0.549  0.075  Bcells
S2          0.082 0.549  0.268  DendriticCells
S2          0.082 0.549  0.120  Macrophage
Answer 0 (score: 2)
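We can use separate to split the combined column into one column per cell type, then gather to reshape those columns into long form: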
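library(tidyr)  # provides separate() and gather(), and re-exports %>%

df %>%
  # split the combined column into one column per cell type
  separate(col = `Bcells,DendriticCells,Macrophage`,
           into = strsplit('Bcells,DendriticCells,Macrophage', ',')[[1]],
           sep = ',') %>%
  # stack the three new columns into Celltype/score pairs
  gather(Celltype, score, Bcells:Macrophage)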
# # A tibble: 6 × 5
#   Sample_name   CRT    SR Celltype       score
#   <chr>       <dbl> <dbl> <chr>          <chr>
# 1 S1          0.079 0.592 Bcells         0.077
# 2 S2          0.082 0.549 Bcells         0.075
# 3 S1          0.079 0.592 DendriticCells 0.483
# 4 S2          0.082 0.549 DendriticCells 0.268
# 5 S1          0.079 0.592 Macrophage     0.555
# 6 S2          0.082 0.549 Macrophage     0.120
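The output above differs from the requested layout in two ways: score is a character column, and the rows are grouped by cell type rather than by sample. A minimal sketch of one way to close that gap, assuming dplyr is also installed (the original answer only needs tidyr); the names tidy and Score are chosen here for illustration:

```r
library(tidyr)
library(dplyr)  # assumption: not used by the original answer

df <- tibble::tribble(
  ~Sample_name, ~CRT, ~SR, ~`Bcells,DendriticCells,Macrophage`,
  "S1", 0.079, 0.592, "0.077,0.483,0.555",
  "S2", 0.082, 0.549, "0.075,0.268,0.120"
)

tidy <- df %>%
  # convert = TRUE lets separate() turn the split pieces into numbers
  separate(col = `Bcells,DendriticCells,Macrophage`,
           into = c("Bcells", "DendriticCells", "Macrophage"),
           sep = ",", convert = TRUE) %>%
  gather(Celltype, Score, Bcells:Macrophage) %>%
  # group rows by sample and put Score before Celltype, as in the question
  arrange(Sample_name, Celltype) %>%
  select(Sample_name, CRT, SR, Score, Celltype)

tidy
```

Sorting by Celltype happens to reproduce the requested order here only because the three cell-type names are already alphabetical.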
Without hard-coding the column names:
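cn <- colnames(df)[ncol(df)]     # name of the last (combined) column
parts <- strsplit(cn, ',')[[1]]  # the individual cell-type names
# separate_() and gather_() are deprecated in current tidyr;
# separate() and gather() accept the same information directly
df %>%
  separate(col = !!cn, into = parts, sep = ',') %>%
  gather('Celltype', 'score', all_of(parts))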
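In tidyr 1.0.0 and later, gather is superseded by pivot_longer. The following is a sketch of the same non-hard-coded approach using it, not part of the original answer; the names parts, long and Score are chosen here for illustration, and convert = TRUE is added so the scores come out numeric:

```r
library(tidyr)

df <- tibble::tribble(
  ~Sample_name, ~CRT, ~SR, ~`Bcells,DendriticCells,Macrophage`,
  "S1", 0.079, 0.592, "0.077,0.483,0.555",
  "S2", 0.082, 0.549, "0.075,0.268,0.120"
)

cn <- colnames(df)[ncol(df)]      # name of the combined column
parts <- strsplit(cn, ",")[[1]]   # "Bcells" "DendriticCells" "Macrophage"

long <- df %>%
  # !!cn passes the column name stored in cn; convert = TRUE parses numbers
  separate(col = !!cn, into = parts, sep = ",", convert = TRUE) %>%
  # pivot_longer() emits rows sample by sample, matching the requested order
  pivot_longer(cols = all_of(parts),
               names_to = "Celltype", values_to = "Score")

long
```

Because pivot_longer works through each input row in turn, no extra sorting step is needed to keep the three cell types of a sample together.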