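I have a data frame with a column containing classifications. The classification levels are separated by /. Some observations are classified to a finer level of detail than others: some have no category at all, while others have one, two, three, or four levels. Some dummy data are below: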
test.data <- data.frame(obs = c("obs1", "obs2", "obs3", "obs4", "obs5", "obs6", "obs7",
                                "obs8", "obs9", "obs10"),
                        cat = c("car/ford/escort", "truck/chevy", "car", NA,
                                "car/volkswagon/jetta/turbo",
                                "truck/dodge/ram/superduty", "truck", "car/ford/escort",
                                NA, "truck/ford"))
test.data
     obs                        cat
1   obs1            car/ford/escort
2   obs2                truck/chevy
3   obs3                        car
4   obs4                       <NA>
5   obs5 car/volkswagon/jetta/turbo
6   obs6  truck/dodge/ram/superduty
7   obs7                      truck
8   obs8            car/ford/escort
9   obs9                       <NA>
10 obs10                 truck/ford
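I would like to add four columns to the end of this data frame, one for each classification level. Where an observation is not classified down to a given level, that column should contain NA.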
desired.data
     obs                        cat  cat1       cat2   cat3      cat4
1   obs1            car/ford/escort   car       ford escort      <NA>
2   obs2                truck/chevy truck      chevy   <NA>      <NA>
3   obs3                        car   car       <NA>   <NA>      <NA>
4   obs4                       <NA>  <NA>       <NA>   <NA>      <NA>
5   obs5 car/volkswagon/jetta/turbo   car volkswagon  jetta     turbo
6   obs6  truck/dodge/ram/superduty truck      dodge    ram superduty
7   obs7                      truck truck       <NA>   <NA>      <NA>
8   obs8            car/ford/escort   car       ford escort      <NA>
9   obs9                       <NA>  <NA>       <NA>   <NA>      <NA>
10 obs10                 truck/ford truck       ford   <NA>      <NA>
I tried using separate from tidyr, to no avail.
require(tidyr)
separate(test.data,cat,c("cat1","cat2","cat3","cat4"))
Error: Values not split into 4 pieces at 1, 2, 3, 4, 7, 8, 9, 10
Does anyone know how best to solve a problem like this?
Thanks!
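For reference, the layout shown in desired.data can be sketched in base R as follows. This is only a minimal sketch and not necessarily the best approach: it assumes no observation has more than four levels, and the names n.levels, parts and levels.mat are made up for illustration.

```r
# Dummy data from the question; stringsAsFactors = FALSE keeps cat as character.
test.data <- data.frame(
  obs = paste0("obs", 1:10),
  cat = c("car/ford/escort", "truck/chevy", "car", NA,
          "car/volkswagon/jetta/turbo",
          "truck/dodge/ram/superduty", "truck", "car/ford/escort",
          NA, "truck/ford"),
  stringsAsFactors = FALSE
)

n.levels <- 4  # assumed maximum number of levels (hypothetical name)

# Split each value on a literal "/"; an NA input stays a single NA.
parts <- strsplit(test.data$cat, "/", fixed = TRUE)

# Indexing past the end of a vector yields NA, so short rows are padded with NA.
levels.mat <- t(vapply(parts, function(p) p[seq_len(n.levels)],
                       character(n.levels)))
colnames(levels.mat) <- paste0("cat", seq_len(n.levels))

# Append the new columns while keeping the original cat column.
desired.data <- data.frame(test.data, levels.mat, stringsAsFactors = FALSE)
desired.data
```

If staying with tidyr is preferred, more recent versions of separate() have sep, fill and remove arguments, so something like separate(test.data, cat, c("cat1","cat2","cat3","cat4"), sep = "/", fill = "right", remove = FALSE) should pad the short rows with NA, depending on the installed version.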