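Creating group-based columns from a two-column data table in R, with different group sizes

Date: 2011-09-08 08:42:01

Tags: r

I have a set of motorsport data of the form: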

   car lap laptime
1    1   1 138.523
2    1   2 122.373
3    1   3 121.395
4    2   1 137.871
5    2   2 121.059
6    2   3 125.720
7    2   4 125.620
8    3   1 140.764
9    3   2 123.579
10   3   3 124.799
11   3   4 124.035

I would like to produce something of the form:

  lap   car.1   car.2   car.3
1   1 138.523 137.871 140.764
2   2 122.373 121.059 123.579
3   3 121.395 125.720 124.799
4   4      NA 125.620 124.035

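I could then use this as the basis for a heatmap chart.

I can see how to do the reshaping in something like Python, but I am struggling to find an elegant way to do it in R (I am sure there must be several).

As an extension, how can I generate columns car.1.diff, car.2.diff, and so on, where the values in car.1.diff are car.1.laptime - min(car.1.laptimes), the values in car.2.diff are car.2.laptime - min(car.2.laptimes), etc.?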

1 Answer:

Answer 0: (score: 2)

Here is a solution using the reshape and plyr packages:

# read example data
tmp1 <- read.table(textConnection("  car lap laptime
1  1   1 138.523
2  1   2 122.373
3  1   3 121.395
4  2   1 137.871
5  2   2 121.059
6  2   3 125.720
7  2   4 125.620
8  3   1 140.764
9  3   2 123.579
10  3   3 124.799
11  3   4 124.035"))

# calculate differences
R> library("plyr")
R> tmp2 <- ddply(tmp1, .(car), summarize, lap=lap, diff=laptime-min(laptime))
R> tmp2
   car lap   diff
1    1   1 17.128
2    1   2  0.978
3    1   3  0.000
4    2   1 16.812
5    2   2  0.000
6    2   3  4.661
7    2   4  4.561
8    3   1 17.185
9    3   2  0.000
10   3   3  1.220
11   3   4  0.456

# conversion to wide format
R> library("reshape")
R> cast(tmp1, lap ~ car, value=c("laptime"))
  lap     1     2     3
1   1 138.5 137.9 140.8
2   2 122.4 121.1 123.6
3   3 121.4 125.7 124.8
4   4    NA 125.6 124.0

R> cast(tmp2, lap ~ car, value=c("diff"))
  lap      1      2      3
1   1 17.128 16.812 17.185
2   2  0.978  0.000  0.000
3   3  0.000  4.661  1.220
4   4     NA  4.561  0.456
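
For comparison, the same two steps can probably be done without any add-on packages. The following is only a minimal sketch in base R, not part of the answer above: the names laps and wide are made up for illustration, ave() stands in for the ddply() call, and reshape() stands in for cast().

```r
# Example data from the question (the name `laps` is hypothetical)
laps <- data.frame(
  car     = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  lap     = c(1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4),
  laptime = c(138.523, 122.373, 121.395, 137.871, 121.059, 125.720,
              125.620, 140.764, 123.579, 124.799, 124.035)
)

# Per-car difference from that car's fastest lap;
# ave() applies FUN within each car and keeps the original row order
laps$diff <- ave(laps$laptime, laps$car, FUN = function(x) x - min(x))

# Wide format: one row per lap and, for each car, a laptime.<car> and a
# diff.<car> column; laps a car did not complete are filled with NA
wide <- reshape(laps, idvar = "lap", timevar = "car", direction = "wide")
print(wide)
```

Because no v.names argument is given, reshape() spreads both laptime and diff, so the result has columns named laptime.1, diff.1, laptime.2, and so on, rather than the car.1 style asked for in the question; they can be renamed with names(wide) if that matters.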