Split a single column by a delimiter - varying number of delimiters

Time: 2015-08-03 21:35:47

Tags: r tidyr

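I have a data frame that contains a column of classifications. The classification levels are separated by /. Some observations are classified in finer detail than others: some have no category at all, while others have one, two, three, or four levels. Some dummy data follows: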

test.data<-data.frame(obs=c("obs1","obs2","obs3","obs4","obs5","obs6","obs7",
                            "obs8","obs9","obs10"),
                      cat=c("car/ford/escort","truck/chevy","car",NA,
                            "car/volkswagon/jetta/turbo",
                            "truck/dodge/ram/superduty","truck","car/ford/escort",
                            NA,"truck/ford"))

test.data
     obs                        cat
1   obs1            car/ford/escort
2   obs2                truck/chevy
3   obs3                        car
4   obs4                       <NA>
5   obs5 car/volkswagon/jetta/turbo
6   obs6  truck/dodge/ram/superduty
7   obs7                      truck
8   obs8            car/ford/escort
9   obs9                       <NA>
10 obs10                 truck/ford

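I would like to add four columns to the end of this data frame, one for each classification level. Where an observation has no detail at a given level, that column should be NA.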

desired.data
     obs                        cat  cat1       cat2   cat3      cat4
1   obs1            car/ford/escort   car       ford escort      <NA>
2   obs2                truck/chevy truck      chevy   <NA>      <NA>
3   obs3                        car   car       <NA>   <NA>      <NA>
4   obs4                       <NA>  <NA>       <NA>   <NA>      <NA>
5   obs5 car/volkswagon/jetta/turbo   car volkswagon  jetta     turbo
6   obs6  truck/dodge/ram/superduty truck      dodge    ram superduty
7   obs7                      truck truck       <NA>   <NA>      <NA>
8   obs8            car/ford/escort   car       ford escort      <NA>
9   obs9                       <NA>  <NA>       <NA>   <NA>      <NA>
10 obs10                 truck/ford truck       ford   <NA>      <NA>
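
For reference, the padding rule described above (split on /, then fill the missing levels with NA) can be sketched in base R without any packages. This is only an illustration under assumptions: it rebuilds the example data with `stringsAsFactors = FALSE`, and `n_levels`, `pieces`, `padded`, and `cats` are names introduced here, not part of the original post.

```r
# Rebuild the example data with the same values as test.data above.
# stringsAsFactors = FALSE keeps cat as character so strsplit() accepts it.
test.data <- data.frame(
  obs = paste0("obs", 1:10),
  cat = c("car/ford/escort", "truck/chevy", "car", NA,
          "car/volkswagon/jetta/turbo", "truck/dodge/ram/superduty",
          "truck", "car/ford/escort", NA, "truck/ford"),
  stringsAsFactors = FALSE
)

n_levels <- 4  # assumed maximum classification depth

# Split every string on a literal "/"; an NA input yields a single NA piece.
pieces <- strsplit(test.data$cat, "/", fixed = TRUE)

# Indexing past the end of a vector returns NA, which pads short rows.
padded <- lapply(pieces, function(p) p[seq_len(n_levels)])

# Stack the padded vectors into a 10 x 4 character matrix and name the columns.
cats <- do.call(rbind, padded)
colnames(cats) <- paste0("cat", seq_len(n_levels))

# Append the new columns to the original data frame.
desired.data <- cbind(test.data, as.data.frame(cats, stringsAsFactors = FALSE))
desired.data
```

The sketch relies on the fact that subsetting a vector beyond its length yields NA, so no explicit padding loop is needed.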

I tried using separate, to no avail.

    require(tidyr)
    separate(test.data,cat,c("cat1","cat2","cat3","cat4"))
    Error: Values not split into 4 pieces at 1, 2, 3, 4, 7, 8, 9, 10
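
A note on the error above: the rows it lists (1, 2, 3, 4, 7, 8, 9, 10) are exactly the rows with fewer than four levels, so the separator itself is being matched and the complaint is about the number of pieces. A minimal sketch of one possible fix follows. It assumes a tidyr release in which `separate()` accepts the `fill` argument (as far as I know, 0.3.0 or later, which is newer than the version that produced this error), and it rebuilds the data with `stringsAsFactors = FALSE`.

```r
library(tidyr)

# Same values as test.data in the question, with cat kept as character.
test.data <- data.frame(
  obs = paste0("obs", 1:10),
  cat = c("car/ford/escort", "truck/chevy", "car", NA,
          "car/volkswagon/jetta/turbo", "truck/dodge/ram/superduty",
          "truck", "car/ford/escort", NA, "truck/ford"),
  stringsAsFactors = FALSE
)

desired.data <- separate(
  test.data, cat,
  into   = c("cat1", "cat2", "cat3", "cat4"),
  sep    = "/",      # split on the slash explicitly
  fill   = "right",  # pad short rows with NA on the right
  remove = FALSE     # keep the original cat column
)
desired.data
```

Setting `remove = FALSE` matters here because the desired output keeps the original cat column next to the new ones.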

Does anyone know the best way to solve a problem like this?

Thanks!

0 answers:

No answers