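I have this df:
   KEGGnumber               Cor Colors
X1 C00095         -2.623973e-01    RED
X2 C17714, C00044 -2.241113e-01    RED
X3 C00033         -3.066684e-01    RED
and would like to format it into a two-column data frame in which each KEGGnumber is matched with one Color. It would look like this:
KEGGnumber Colors
C00095     RED
C17714     RED
C00044     RED
C00033     RED
Basically, the new data frame would take the rows of the old data frame that hold multiple KEGGnumbers and split them apart, while keeping the same Color for each KEGGnumber.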
Answer 0 (score: 1)
This may or may not be a duplicate, but a very similar question can be found here: Splitting a string into new rows in R.
Simply adapting that example to your case:
library(splitstackshape)
library(data.table)
# df is the data frame from the question; split the comma-separated
# KEGGnumber column into one row per value
df2 <- as.data.frame(cSplit(as.data.frame(df), "KEGGnumber",
                            sep = ",", direction = "long"))
df2
  KEGGnumber        Cor Colors
1     C00095 -0.2623973    RED
2     C17714 -0.2241113    RED
3     C00044 -0.2241113    RED
4     C00033 -0.3066684    RED
Answer 1 (score: 1)
tidyr makes this easy:
library(tidyr)
df %>% separate_rows(KEGGnumber)
## Cor Colors KEGGnumber
## 1 -0.2623973 RED C00095
## 2 -0.2241113 RED C17714
## 3 -0.2241113 RED C00044
## 4 -0.3066684 RED C00033
Drop the Cor column afterwards if you like.
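For completeness, here is a minimal sketch of the tidyr approach with the Cor column dropped. The construction of df is an assumption reconstructed from the table in the question (including the row names X1 to X3), and the explicit sep pattern is a choice made here rather than something the answer specifies:

```r
library(tidyr)

# Assumed reconstruction of the example data from the question.
df <- data.frame(
  KEGGnumber = c("C00095", "C17714, C00044", "C00033"),
  Cor = c(-2.623973e-01, -2.241113e-01, -3.066684e-01),
  Colors = c("RED", "RED", "RED"),
  row.names = c("X1", "X2", "X3"),
  stringsAsFactors = FALSE
)

# Split on a comma followed by optional whitespace: one KEGGnumber per row.
long <- separate_rows(df, KEGGnumber, sep = ",\\s*")

# Keep only the two columns the question asks for.
result <- as.data.frame(long[, c("KEGGnumber", "Colors")])
result
```

Spelling out sep makes the split rule explicit; the default pattern of separate_rows also splits on other non-alphanumeric characters, which may or may not be wanted for other identifiers.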
A less pretty base R option:
do.call(rbind,
Map(function(x, y){data.frame(KEGGnumber = x, Colors = y)},
strsplit(as.character(df$KEGGnumber), ', '),
df$Colors))
## KEGGnumber Colors
## 1 C00095 RED
## 2 C17714 RED
## 3 C00044 RED
## 4 C00033 RED
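If building one small data frame per row and binding them feels heavy, a vectorised base R variant is possible. This is only a sketch, not part of the original answer: it assumes the same df as in the question (rebuilt here so the snippet runs on its own) and repeats each Colors value once per split KEGGnumber using rep() and lengths():

```r
# Assumed reconstruction of the example data from the question.
df <- data.frame(
  KEGGnumber = c("C00095", "C17714, C00044", "C00033"),
  Cor = c(-2.623973e-01, -2.241113e-01, -3.066684e-01),
  Colors = c("RED", "RED", "RED"),
  stringsAsFactors = FALSE
)

# List with one character vector of KEGG numbers per original row.
parts <- strsplit(as.character(df$KEGGnumber), ",\\s*")

# Flatten the list and repeat each colour as many times as its row was split.
result <- data.frame(
  KEGGnumber = unlist(parts),
  Colors = rep(df$Colors, lengths(parts)),
  stringsAsFactors = FALSE
)
result
```

This avoids do.call(rbind, ...) and so tends to scale better on long inputs, at the cost of having to list every column you want to carry over.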