Handling "duplicate identifiers" when reshaping data from long to wide format with tidyr

Date: 2016-04-22 21:22:09

Tags: r dplyr reshape2 tidyr

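I can't figure out how to resolve this tidyr spread error:

"Error: Duplicate identifiers"

First, I'll use the simple data frame below to show the result I'm after. It works the way I want because it contains no duplicate identifiers.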

library(dplyr)  # provides %>%
library(tidyr)  # provides spread()
Names <- c("John", "John", "John", "Chris", "Chris", "Chris", "Sara", "Sara", "Sara")
Category <- c("Accommodation", "Disability", "Living Arrangements", "Accommodation", "Disability", "Living Arrangements", "Accommodation", "Disability", "Living Arrangements")
Description <- c("Apartment", "Vision", "Alone", "House", "SCI", "Family", "Alone", "Vision", "Spouse")
df <- data.frame(Names, Category, Description)

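I then use tidyr's spread to create a wide-format data frame:

dfS <- df %>% spread(Category, Description)

This produces a data frame in which each value of "Category" becomes a column, filled with the corresponding "Description".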

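However, the data I actually work with looks more like the data frame below. When I try to create the same tidy data frame with tidyr's spread, I get the "Duplicate identifiers" error, because each person has two entries in the Equipment category.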

Names <- c("John", "John", "John", "John", "John", "Chris", "Chris", "Chris", "Chris", "Chris", "Sara", "Sara", "Sara", "Sara", "Sara")
Category <- c("Accommodation", "Disability", "Equipment", "Equipment", "Living Arrangements", "Accommodation", "Disability", "Equipment", "Equipment", "Living Arrangements", "Accommodation", "Disability", "Equipment", "Equipment", "Living Arrangements")
Description <- c("Apartment", "Vision", "Scooter", "Walker", "Alone", "House", "SCI", "Walker", "Bed", "Family", "Alone", "Vision", "Cane", "Bed", "Spouse")
df <- data.frame(Names, Category, Description)
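To make the cause of the error concrete, here is a minimal sketch that lists which (Names, Category) pairs occur more than once. It rebuilds the same data frame with rep() for brevity; the name dups is only illustrative. It assumes that spread identifies a cell by the combination of the remaining columns (here Names) and the key column (Category), so any pair with a count above 1 is a duplicate identifier.

```r
library(dplyr)

# Same data as above, built with rep() to keep the example short
Names <- rep(c("John", "Chris", "Sara"), each = 5)
Category <- rep(c("Accommodation", "Disability", "Equipment",
                  "Equipment", "Living Arrangements"), times = 3)
Description <- c("Apartment", "Vision", "Scooter", "Walker", "Alone",
                 "House", "SCI", "Walker", "Bed", "Family",
                 "Alone", "Vision", "Cane", "Bed", "Spouse")
df <- data.frame(Names, Category, Description)

# Count rows per (Names, Category) pair and keep the pairs that repeat
dups <- df %>%
  count(Names, Category) %>%
  filter(n > 1)

print(dups)
```

With this data, only the Equipment rows should show up, once per person.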

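I haven't been able to find a simple way to handle this situation. I've looked through other posts here, and most of the answers are either too complex or don't work. I tried this option: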

df2<-df%>%group_by(Category,Description)%>%
     summarise(Category=toString(unique(Category)))%>%
     spread(Category,Description,fill='')

But the result is:

"Error: cannot modify grouping variable"

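I copied the code from another post, so I admit I'm not sure how it works and I may have made a mistake. I'm still fairly new to R, and especially to tidyr, so I'm hoping someone can offer a simple solution that lets me use tidyr's spread to create a tidy data frame like the one in the first example above, preferably with the help of dplyr or reshape2.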
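For reference, here is one possible approach, offered as a sketch and not a definitive answer. The attempt above groups by Category and then assigns to Category inside summarise, which is what triggers "cannot modify grouping variable". The sketch below instead groups by the identifying columns (Names and Category) and collapses Description into one comma-separated string per group before spreading. Collapsing the two Equipment entries into a single cell is an assumption about the desired output.

```r
library(dplyr)
library(tidyr)

# Same data as in the second example, built with rep() for brevity
df <- data.frame(
  Names = rep(c("John", "Chris", "Sara"), each = 5),
  Category = rep(c("Accommodation", "Disability", "Equipment",
                   "Equipment", "Living Arrangements"), times = 3),
  Description = c("Apartment", "Vision", "Scooter", "Walker", "Alone",
                  "House", "SCI", "Walker", "Bed", "Family",
                  "Alone", "Vision", "Cane", "Bed", "Spouse")
)

dfS <- df %>%
  group_by(Names, Category) %>%
  # One row per (Names, Category): join repeated descriptions with ", "
  summarise(Description = toString(as.character(Description))) %>%
  ungroup() %>%
  # Each pair is now unique, so spread() has no duplicate identifiers
  spread(Category, Description)

print(dfS)
```

Newer tidyr versions also provide pivot_wider(), whose values_fn argument is meant for the same kind of collapsing; whether that is available depends on the installed version.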

0 answers:

No answers