Reshaping a data frame of rankings

Time: 2014-01-31 19:51:03

Tags: r reshape

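I am trying to reshape this ranking data into something easier to plot, maybe something like: ggplot(party2, aes(x = Preference, y = Ranking, color = id)) + geom_line(). First I have to reshape it, though.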

Here is the data so far:

> head(party)
  Theme Music/DJ Drink deals People Location
3     3        4           5      1        2
4     2        3           5      1        4
5     5        4           3      1        2
6     4        1           5      2        3

The goal is to make the data look like this:

id Preference     Ranking
1    Theme           3
1    Music/DJ        4
1    Drink deals     5
1    People          1
1    Location        2
2    Theme           2
2    Music/DJ        3
2    Drink deals     5

To reshape the data, I used Hadley's code from this link: How to reshape this dataframe with the reshape package, but I am still running into problems. I think I am close.

My code so far is:

party.pref<-c("Theme", "Music/DJ", "Drink deals", "People", "Location")
party<-data[,party.pref]
party<-na.omit(party)
party2<-cbind(party, id=seq(1,nrow(party),1)) # Add IDs column
gp<-melt(party2, id="id", measured=party.pref)
dcast(gp, ... ~party.pref)

It comes out like this:

  id    variable   Drink deals Location Music/DJ People Theme
  1       Theme        <NA>     <NA>     <NA>   <NA>     3
  1    Music/DJ        <NA>     <NA>     <NA>   <NA>     4
  1 Drink deals        <NA>     <NA>     <NA>   <NA>     5
  1      People        <NA>     <NA>     <NA>   <NA>     1
  1    Location        <NA>     <NA>     <NA>   <NA>     2
  2       Theme        <NA>     <NA>        2   <NA>  <NA>

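As you can see, if all of these factor columns simply became a single "Ranking" column and I got rid of all the NAs, I would have my answer, but I don't know how to do that. I feel like I am doing something wrong in either "dcast" or "melt", but I am not sure which.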

Any help is much appreciated, thanks!

2 answers:

Answer 0: (score: 1)

You need to use melt, not dcast. dcast converts from long format to wide format, and you are trying to do the opposite.

library(reshape2)                             # provides melt() and dcast()
party <- cbind(id = seq_len(nrow(party)), party)  # add an id column, one id per respondent
melt(party, id.vars = "id")                   # melt; "id" is kept as a column in the result

This produces:

#     id variable value
#  1   1    Theme     3
#  2   2    Theme     4
#  3   3    Theme     5
#  4   4    Theme     6
#  5   1 Music.DJ     3
#  6   2 Music.DJ     2
# ...
# 20  4   People     2
# 21  1 Location     2
# 22  2 Location     4
# 23  3 Location     2
# 24  4 Location     3
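To get the exact id / Preference / Ranking layout from the question, the melt call can also name the output columns and the result can then be sorted by id. The following is only a sketch: it retypes the four rows shown in the question by hand, and the object name long is my own choice. variable.name and value.name are arguments of reshape2's melt for data frames.

```r
library(reshape2)

# The four rows shown in the question, retyped by hand (an assumption:
# the real data has more rows). check.names = FALSE keeps the original
# column names such as "Music/DJ" instead of converting them to "Music.DJ".
party <- data.frame(
  Theme         = c(3, 2, 5, 4),
  `Music/DJ`    = c(4, 3, 4, 1),
  `Drink deals` = c(5, 5, 3, 5),
  People        = c(1, 1, 1, 2),
  Location      = c(2, 4, 2, 3),
  check.names   = FALSE
)
party$id <- seq_len(nrow(party))  # one id per respondent (row)

# Wide -> long: every non-id column becomes a (Preference, Ranking) pair.
long <- melt(party, id.vars = "id",
             variable.name = "Preference", value.name = "Ranking")

# melt() stacks the result column by column; sort by id so that each
# respondent's five rankings sit together, as in the target layout.
long <- long[order(long$id, long$Preference), ]
rownames(long) <- NULL

head(long, 8)
```

With the data in this shape, a plotting call along the lines of ggplot(long, aes(x = Preference, y = Ranking, colour = factor(id), group = id)) + geom_line() should work, assuming ggplot2 is installed; group = id is there because Preference is a discrete axis.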

Answer 1: (score: 1)

Alex, just adding one more piece of information.

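If the rows mean something and you don't want to lose that information, you should add one more column that names them. Then you melt and recast.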

party <- read.table(text=
"Theme Music/DJ Drink/deals People Location
     3        4           5      1        2
     2        3           5      1        4
     5        4           3      1        2
     4        1           5      2        3", header=TRUE)

### Add one more column with the meaning of each line:
party$ranking <- c("ranking1", "ranking2", "ranking3", "ranking4")

party

#This will give you:
     Theme Music.DJ Drink.deals People Location  ranking
1     3        4           5      1        2 ranking1
2     2        3           5      1        4 ranking2
3     5        4           3      1        2 ranking3
4     4        1           5      2        3 ranking4

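#then melt and dcast
library(reshape2)
# "ranking" is the id column; all other columns are stacked into variable/value
ranking <- melt(party, id.vars = "ranking")
# one row per preference, one column per ranking label
ranking <- dcast(ranking, variable ~ ranking, value.var = "value")
ranking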

#this will give you
    variable ranking1 ranking2 ranking3 ranking4
1       Theme        3        2        5        4
2    Music.DJ        4        3        4        1
3 Drink.deals        5        5        3        5
4      People        1        1        1        2
5    Location        2        4        2        3
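To tie this back to the first answer: melt and dcast are inverse operations, so once the row labels are kept in a column the same long table can be cast in either direction. This is a rough sketch that rebuilds the example data with data.frame instead of read.table; the names long, wide and back are made up for illustration.

```r
library(reshape2)

# Same values as the example above, with the row-label column included.
party <- data.frame(
  Theme       = c(3, 2, 5, 4),
  Music.DJ    = c(4, 3, 4, 1),
  Drink.deals = c(5, 5, 3, 5),
  People      = c(1, 1, 1, 2),
  Location    = c(2, 4, 2, 3),
  ranking     = c("ranking1", "ranking2", "ranking3", "ranking4")
)

# Wide -> long, keeping the row label as the id variable.
long <- melt(party, id.vars = "ranking")

# Long -> wide, one row per preference (the table shown above).
wide <- dcast(long, variable ~ ranking, value.var = "value")

# Long -> wide the other way round recovers the original layout,
# with "ranking" as the first column.
back <- dcast(long, ranking ~ variable, value.var = "value")

wide
back
```

Passing value.var explicitly is optional here, since dcast would otherwise guess the value column, but it avoids the guessing message and makes the intent clear.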