How do I split categorical data in one column into new, separate columns in R?

Asked: 2014-03-23 05:24:28

Tags: r reshape

I have data (prop_attack), part of which looks like this:

attack.type    proportion   class
          4     0.8400000    high
          5     0.9733333    high
          6     0.9385151    high
          7     0.9228659    high
          8     0.6187500    high
          9     0.9219331    high
          1     0.8364853     mid 
          2     0.9896870     mid 
          3     0.9529760     mid
          4     0.6666667     mid
          5     0.9965636     mid
          6     0.9687825     mid

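attack.type is actually categorical; the categories have simply been assigned the numbers 1-9. I want to create a table that rearranges the data like this: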

weap.type     high                                     mid  
1             corresponding proportion value           corresponding proportion value
2             corresponding proportion value           corresponding proportion value
3             corresponding proportion value           corresponding proportion value
4             corresponding proportion value           corresponding proportion value
5             corresponding proportion value           corresponding proportion value
6             etc.
7
8
9 

Any suggestions on how to do this?

1 answer:

Answer 0 (score: 2)

This is a straightforward "reshape" problem. In base R, you can do:

reshape(prop_attack, direction = "wide", idvar="attack.type", timevar="class")
#   attack.type proportion.high proportion.mid
# 1           4       0.8400000      0.6666667
# 2           5       0.9733333      0.9965636
# 3           6       0.9385151      0.9687825
# 4           7       0.9228659             NA
# 5           8       0.6187500             NA
# 6           9       0.9219331             NA
# 7           1              NA      0.8364853
# 8           2              NA      0.9896870
# 9           3              NA      0.9529760
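
If you want the base R result to look more like the table in the question, here is a minimal sketch. It assumes prop_attack holds exactly the twelve rows shown in the question (rebuilt here by hand), and the intermediate name wide is just an illustrative choice. It sorts the rows by attack.type and strips the "proportion." prefix that reshape() adds to the new column names.

```r
# Rebuild the sample rows shown in the question (values copied from the post).
prop_attack <- data.frame(
  attack.type = c(4:9, 1:6),
  proportion  = c(0.8400000, 0.9733333, 0.9385151, 0.9228659, 0.6187500, 0.9219331,
                  0.8364853, 0.9896870, 0.9529760, 0.6666667, 0.9965636, 0.9687825),
  class       = rep(c("high", "mid"), each = 6)
)

# Long-to-wide: one row per attack.type, one column per class.
wide <- reshape(prop_attack, direction = "wide",
                idvar = "attack.type", timevar = "class")

# Sort rows by attack.type and drop the "proportion." prefix from the value columns.
wide <- wide[order(wide$attack.type), ]
names(wide) <- sub("^proportion\\.", "", names(wide))
rownames(wide) <- NULL
wide
```

Combinations that do not occur in the data (for example attack.type 1 with class high) stay NA after this cleanup.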

You could even use xtabs (note that combinations missing from the data show up as 0 rather than NA):

xtabs(proportion ~ attack.type + class, prop_attack)
#            class
# attack.type      high       mid
#           1 0.0000000 0.8364853
#           2 0.0000000 0.9896870
#           3 0.0000000 0.9529760
#           4 0.8400000 0.6666667
#           5 0.9733333 0.9965636
#           6 0.9385151 0.9687825
#           7 0.9228659 0.0000000
#           8 0.6187500 0.0000000
#           9 0.9219331 0.0000000
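
If a 0 could be mistaken for a real proportion, one possible workaround is tapply(), which leaves empty cells as NA. This is only a sketch under the assumption that prop_attack contains the twelve rows from the question (rebuilt here by hand); the names xt and tab are illustrative.

```r
# Rebuild the sample rows shown in the question (values copied from the post).
prop_attack <- data.frame(
  attack.type = c(4:9, 1:6),
  proportion  = c(0.8400000, 0.9733333, 0.9385151, 0.9228659, 0.6187500, 0.9219331,
                  0.8364853, 0.9896870, 0.9529760, 0.6666667, 0.9965636, 0.9687825),
  class       = rep(c("high", "mid"), each = 6)
)

# xtabs() sums proportion within each cell, so absent combinations become 0.
xt <- xtabs(proportion ~ attack.type + class, prop_attack)

# tapply() only calls the function on non-empty cells; empty cells are left as NA.
tab <- with(prop_attack, tapply(proportion, list(attack.type, class), sum))
tab
```

Both results are indexed by character labels, so tab["1", "high"] is NA while xt["1", "high"] is 0.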

If you are open to using a package, many people would suggest dcast from "reshape2" for its convenient syntax:

library(reshape2)  # provides dcast()
dcast(prop_attack, attack.type ~ class, value.var="proportion")
#   attack.type      high       mid
# 1           1        NA 0.8364853
# 2           2        NA 0.9896870
# 3           3        NA 0.9529760
# 4           4 0.8400000 0.6666667
# 5           5 0.9733333 0.9965636
# 6           6 0.9385151 0.9687825
# 7           7 0.9228659        NA
# 8           8 0.6187500        NA
# 9           9 0.9219331        NA