In R, how do I create a transformed subset of a data frame?

Asked: 2016-11-10 03:11:53

Tags: r transform subset

First question here! I am using R 3.3.1 (64-bit) on Windows 10.

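My data is stored in a data frame called lwd. The data is grouped by the factor "wafer"; each wafer has 10 locations (called "points"), and 4 different parameters (v1, v2, v3, v4) are measured at each location. So picture 5 silicon wafers, 10 locations per wafer, and four different measurements per location: 50 rows in total.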

A sample of how the data looks in R (first 20 rows):

> lwd
   data wafer point    v1       v2   v3    v4
1     1    T3     1 0.3450 -1.3423 51.21 15.853
2     2    T3     2 0.3473 -1.5756 45.44 15.667
3     3    T3     3 0.3441 -1.3486 39.57 15.894
4     4    T3     4 0.3478 -1.7150 44.67 15.600
5     5    T3     5 0.3482 -1.4154 42.02 15.683
6     6    T3     6 0.3478 -1.4477 38.66 15.693
7     7    T3     7 0.3430 -1.3210 41.96 15.955
8     8    T3     8 0.3458 -1.6119 43.41 15.721
9     9    T3     9 0.3451 -1.4688 35.19 15.802
10   10    T3    10 0.3446 -1.4078 45.82 15.850
11   11    T1     1 0.3412 -3.2319 37.51 15.381
12   12    T1     2 0.3450 -3.2202 41.69 15.233
13   13    T1     3 0.3415 -3.1850 32.21 15.383
14   14    T1     4 0.3442 -3.2748 40.77 15.248
15   15    T1     5 0.3470 -3.3064 35.06 15.126
16   16    T1     6 0.3453 -3.3552 31.67 15.178
17   17    T1     7 0.3416 -3.4090 35.29 15.310
18   18    T1     8 0.3462 -3.2323 38.30 15.179
19   19    T1     9 0.3428 -3.4104 29.13 15.262
20   20    T1    10 0.3452 -3.5293 40.57 15.129
...
50   50    W2    10 0.3475 -2.8963 42.07 15.231

For each of v1 through v4, I want to create a transformed subset that looks like this (example):

>v1.group
     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]  [,10]
T1 0.3412 0.3450 0.3415 0.3442 0.3470 0.3453 0.3416 0.3462 0.3428 0.3452
T3 0.3450 0.3473 0.3441 0.3478 0.3482 0.3478 0.3430 0.3458 0.3451 0.3446
W1 0.3521 0.3540 0.3555 0.3537 0.3550 0.3551 0.3514 0.3536 0.3547 0.3531
W2 0.3483 0.3503 0.3469 0.3477 0.3518 0.3511 0.3447 0.3485 0.3477 0.3475
W3 0.3430 0.3447 0.3462 0.3444 0.3468 0.3460 0.3425 0.3444 0.3430 0.3437

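Each row corresponds to a wafer, and each column is a measurement location ("point") 1 through 10. I am happy to work on v1 through v4 one at a time, but I imagine there is a way to spit out v1.group, v2.group, and so on with a single command. I saw this done a long time ago, without loading any extra libraries, but I have not been able to track it down.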

Hopefully I am doing this right: here is some code that reproduces the first 20 rows of my data set.

structure(list(data = 1:20, wafer = structure(c(2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), .Label = c("T1", "T3", "W1", "W2", "W3"), class = "factor"), 
    point = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 
    3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L), v1 = c(0.345, 0.3473, 
    0.3441, 0.3478, 0.3482, 0.3478, 0.343, 0.3458, 0.3451, 0.3446, 
    0.3412, 0.345, 0.3415, 0.3442, 0.347, 0.3453, 0.3416, 0.3462, 
    0.3428, 0.3452), v2 = c(-1.3423, -1.5756, -1.3486, -1.715, 
    -1.4154, -1.4477, -1.321, -1.6119, -1.4688, -1.4078, -3.2319, 
    -3.2202, -3.185, -3.2748, -3.3064, -3.3552, -3.409, -3.2323, 
    -3.4104, -3.5293), v3 = c(51.21, 45.44, 39.57, 44.67, 42.02, 
    38.66, 41.96, 43.41, 35.19, 45.82, 37.51, 41.69, 32.21, 40.77, 
    35.06, 31.67, 35.29, 38.3, 29.13, 40.57), v4 = c(15.853, 
    15.667, 15.894, 15.6, 15.683, 15.693, 15.955, 15.721, 15.802, 
    15.85, 15.381, 15.233, 15.383, 15.248, 15.126, 15.178, 15.31, 
    15.179, 15.262, 15.129)), .Names = c("data", "wafer", "point", 
"v1", "v2", "v3", "v4"), row.names = c(NA, 20L), class = "data.frame")

Thanks. I look forward to your help and to being part of the community.

2 answers:

Answer 0 (score: 4)

We can do this in a loop using dcast from data.table (or, if we need a matrix, we can change dcast to acast from reshape2):

library(data.table)

lapply(grep('v\\d+', names(lwd)), function(i) dcast(setnames(setDT(lwd[c(1:3, i)]), 
               4, 'v'), wafer~point, value.var = "v"))

Another option is xtabs from base R:

lapply(grep('v\\d+', names(lwd)), function(i) 
        xtabs(v~wafer+point, transform(lwd[c(2:3)], v = lwd[,i])))
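
To get the names the question asks for (v1.group, v2.group, and so on), the list returned by lapply can simply be named. The following is a minimal sketch, not part of the original answer. It assumes a small excerpt of the question's data (the first three points of wafers T3 and T1, columns v1 and v2 only), and the names `vars` and `groups` are illustrative.

```r
# Excerpt of the question's data: first three points of wafers T3 and T1
lwd <- data.frame(
  wafer = factor(rep(c("T3", "T1"), each = 3)),
  point = rep(1:3, times = 2),
  v1 = c(0.3450, 0.3473, 0.3441, 0.3412, 0.3450, 0.3415),
  v2 = c(-1.3423, -1.5756, -1.3486, -3.2319, -3.2202, -3.1850)
)

# Names of the measurement columns (v1, v2, ...)
vars <- grep("^v[0-9]+$", names(lwd), value = TRUE)

# Build one wafer-by-point table per measurement column;
# reformulate() creates a formula such as v1 ~ wafer + point
groups <- lapply(vars, function(v) {
  xtabs(reformulate(c("wafer", "point"), response = v), data = lwd)
})

# Name the list elements so they can be accessed as groups$v1.group, etc.
names(groups) <- paste0(vars, ".group")

groups$v1.group
```

Because xtabs sums the values in each cell, this only reproduces the raw measurements when every wafer/point combination occurs exactly once, as it does in the question's data.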

If we need a 3D array, as @thelatemail mentioned, we can apply xtabs directly:

xtabs(cbind(v1,v2,v3,v4) ~ wafer + point, data=lwd)
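
Individual wafer-by-point tables can then be pulled out of the 3D array by indexing its third dimension. Again, this is a minimal sketch on an excerpt of the question's data (two wafers, three points, v1 and v2 only); the name `arr` is illustrative.

```r
# Excerpt of the question's data: first three points of wafers T3 and T1
lwd <- data.frame(
  wafer = factor(rep(c("T3", "T1"), each = 3)),
  point = rep(1:3, times = 2),
  v1 = c(0.3450, 0.3473, 0.3441, 0.3412, 0.3450, 0.3415),
  v2 = c(-1.3423, -1.5756, -1.3486, -3.2319, -3.2202, -3.1850)
)

# wafer x point x variable array
arr <- xtabs(cbind(v1, v2) ~ wafer + point, data = lwd)

dim(arr)        # wafers, points, variables

# The slice for a single variable is a wafer-by-point matrix
v1.group <- arr[, , "v1"]
v1.group
```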

Answer 1 (score: 2)

In base R, reshape to wide, then reshape back to long using a different variable.

out <- reshape(lwd[-1], idvar="wafer", timevar="point", direction="wide")
names(out)[-1] <- gsub("(.+?)\\.(.+)", "\\2.\\1", names(out)[-1] )
reshape(out, idvar="wafer", direction="long", sep=".", varying=-1)

#      wafer time       1       2       3       4       5       6       7       8       9      10
#T3.v1    T3   v1  0.3450  0.3473  0.3441  0.3478  0.3482  0.3478  0.3430  0.3458  0.3451  0.3446
#T1.v1    T1   v1  0.3412  0.3450  0.3415  0.3442  0.3470  0.3453  0.3416  0.3462  0.3428  0.3452
#T3.v2    T3   v2 -1.3423 -1.5756 -1.3486 -1.7150 -1.4154 -1.4477 -1.3210 -1.6119 -1.4688 -1.4078
#T1.v2    T1   v2 -3.2319 -3.2202 -3.1850 -3.2748 -3.3064 -3.3552 -3.4090 -3.2323 -3.4104 -3.5293
#T3.v3    T3   v3 51.2100 45.4400 39.5700 44.6700 42.0200 38.6600 41.9600 43.4100 35.1900 45.8200
#T1.v3    T1   v3 37.5100 41.6900 32.2100 40.7700 35.0600 31.6700 35.2900 38.3000 29.1300 40.5700
#T3.v4    T3   v4 15.8530 15.6670 15.8940 15.6000 15.6830 15.6930 15.9550 15.7210 15.8020 15.8500
#T1.v4    T1   v4 15.3810 15.2330 15.3830 15.2480 15.1260 15.1780 15.3100 15.1790 15.2620 15.1290
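
If separate per-variable objects are wanted rather than one stacked data frame, the long result can be split on its `time` column. This is a sketch under assumptions, not part of the original answer: it uses an excerpt of the question's data, a simpler `sub()` pattern for swapping the name parts, and the illustrative names `wide`, `long` and `by_var`.

```r
# Excerpt of the question's data: first three points of wafers T3 and T1
lwd <- data.frame(
  wafer = factor(rep(c("T3", "T1"), each = 3)),
  point = rep(1:3, times = 2),
  v1 = c(0.3450, 0.3473, 0.3441, 0.3412, 0.3450, 0.3415),
  v2 = c(-1.3423, -1.5756, -1.3486, -3.2319, -3.2202, -3.1850)
)

# Wide: one row per wafer, columns v1.1, v2.1, v1.2, ...
wide <- reshape(lwd, idvar = "wafer", timevar = "point", direction = "wide")

# Swap the two parts of each name: "v1.1" becomes "1.v1"
names(wide)[-1] <- sub("^(.*)\\.(.*)$", "\\2.\\1", names(wide)[-1])

# Long again, this time with the variable name (v1, v2) as the "time" column
long <- reshape(wide, idvar = "wafer", direction = "long", sep = ".", varying = -1)

# One data frame per measured variable
by_var <- split(long[names(long) != "time"], long$time)
names(by_var) <- paste0(names(by_var), ".group")

by_var$v1.group
```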