I have a set of motorsport data of the form:
car lap laptime
1 1 1 138.523
2 1 2 122.373
3 1 3 121.395
4 2 1 137.871
5 2 2 121.059
6 2 3 125.720
7 2 4 125.620
8 3 1 140.764
9 3 2 123.579
10 3 3 124.799
11 3 4 124.035
I would like to produce something of the form:
lap car.1 car.2 car.3
1 1 138.523 137.871 140.764
2 2 122.373 121.059 123.579
3 3 121.395 125.720 124.799
4 4 NA 125.620 124.035
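which I can then use as the basis of a heatmap chart.
I can see how to do the reshaping in something like Python, but I am struggling to find an elegant way in R (I am sure there must be several).
As an extension, how would I generate columns car.1.diff, car.2.diff, etc., such that the values in car.1.diff correspond to car.1.laptime - min(car.1.laptimes), those in car.2.diff to car.2.laptime - min(car.2.laptimes), and so on?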
Answer 0 (score: 2)
Here is a solution using the reshape and plyr packages:
# read example data
tmp1 <- read.table(textConnection(" car lap laptime
1 1 1 138.523
2 1 2 122.373
3 1 3 121.395
4 2 1 137.871
5 2 2 121.059
6 2 3 125.720
7 2 4 125.620
8 3 1 140.764
9 3 2 123.579
10 3 3 124.799
11 3 4 124.035"))
# calculate differences
R> library("plyr")
R> tmp2 <- ddply(tmp1, .(car), summarize, lap=lap, diff=laptime-min(laptime))
R> tmp2
car lap diff
1 1 1 17.128
2 1 2 0.978
3 1 3 0.000
4 2 1 16.812
5 2 2 0.000
6 2 3 4.661
7 2 4 4.561
8 3 1 17.185
9 3 2 0.000
10 3 3 1.220
11 3 4 0.456
# conversion to wide format
R> library("reshape")
R> cast(tmp1, lap ~ car, value=c("laptime"))
lap 1 2 3
1 1 138.5 137.9 140.8
2 2 122.4 121.1 123.6
3 3 121.4 125.7 124.8
4 4 NA 125.6 124.0
R> cast(tmp2, lap ~ car, value=c("diff"))
lap 1 2 3
1 1 17.128 16.812 17.185
2 2 0.978 0.000 0.000
3 3 0.000 4.661 1.220
4 4 NA 4.561 0.456
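If installing extra packages is not an option, the same two steps can probably be done in base R as well. The following is only a sketch under that assumption, not part of the answer above: it uses ave() for the per-car difference and reshape() for the long-to-wide conversion. The names laps, wide_time and wide_diff are made up for illustration, and the column renaming at the end is just one way to get the car.1 / car.1.diff style names asked for in the question.

```r
# Same example data as above, built directly as a data frame
laps <- data.frame(
  car     = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  lap     = c(1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4),
  laptime = c(138.523, 122.373, 121.395,
              137.871, 121.059, 125.720, 125.620,
              140.764, 123.579, 124.799, 124.035)
)

# Gap to each car's own fastest lap; ave() applies FUN within each car
laps$diff <- ave(laps$laptime, laps$car, FUN = function(x) x - min(x))

# Long to wide: one row per lap, one column per car
wide_time <- reshape(laps[, c("car", "lap", "laptime")],
                     idvar = "lap", timevar = "car", direction = "wide")
wide_diff <- reshape(laps[, c("car", "lap", "diff")],
                     idvar = "lap", timevar = "car", direction = "wide")

# reshape() names the columns laptime.1, diff.1, ...; rename to match the question
names(wide_time) <- sub("^laptime", "car", names(wide_time))
names(wide_diff) <- sub("^diff\\.(.*)$", "car.\\1.diff", names(wide_diff))
```

Laps that a car did not complete (lap 4 for car 1 here) come out as NA, as in the cast() output. For a heatmap, as.matrix(wide_time[, -1]) gives a numeric matrix with one column per car.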