根据重复观察创建新列

时间:2017-05-08 03:22:08

标签: r append data-manipulation replicate

我有一个问题,如何根据另一个创建一个新列 这是我data的一部分:

Category  Brand    Time1          value   Time2        number   
2         HTC      2015-01-01     1724    NA           1      
6         APPLE    2015-10-10     3000    2015-10-30   1
2         APPLE    2016-01-15     430     NA           1
NA        Samsung  2016-10-20     860     2016-12-20   1

我显示4个障碍物。上面data,我更多地解释data
首先,看看结构。

> str(data)
Classes ‘data.table’ and 'data.frame':  105907 obs. of  6 variables:
$ Category     : num  2 2 2 2 2 2 2 2 2 2 ...
$ Brand        : chr  "HTC" "APPLE" "INFOCUS" "APPLE" ...
$ Time1        : POSIXct, format: "2015-01-01" "2015-01-01" "2015-01-01" "2015-01-01" ...
$ value        : num  1724 2946 330 2946 2946 ...
$ Time2        : POSIXct, format: NA NA NA "2015-01-03" ...
$ number         : chr  "1" "1" "1" "1" ...
- attr(*, ".internal.selfref")=<externalptr>  

其次,我想复制每个障碍物。基于Time1
   这是我的代码:

data[,rep:=ifelse(year(Time1)==2016, 12-month(Time1)+1, 13)][rep(1:.N,rep)][]   

现在,data看起来像是:

Category  Brand    Time1          value   Time2        number   rep
2         HTC      2015-01-01     1724    NA           1        13       
2         HTC      2015-01-01     1724    NA           1        13       
2         HTC      2015-01-01     1724    NA           1        13   
2         HTC      2015-01-01     1724    NA           1        13     
2         HTC      2015-01-01     1724    NA           1        13
2         HTC      2015-01-01     1724    NA           1        13
2         HTC      2015-01-01     1724    NA           1        13
2         HTC      2015-01-01     1724    NA           1        13
2         HTC      2015-01-01     1724    NA           1        13
2         HTC      2015-01-01     1724    NA           1        13
2         HTC      2015-01-01     1724    NA           1        13
2         HTC      2015-01-01     1724    NA           1        13
2         HTC      2015-01-01     1724    NA           1        13
6         APPLE    2015-10-10     3000    2015-10-30   1        13
6         APPLE    2015-10-10     3000    2015-10-30   1        13
6         APPLE    2015-10-10     3000    2015-10-30   1        13
6         APPLE    2015-10-10     3000    2015-10-30   1        13
6         APPLE    2015-10-10     3000    2015-10-30   1        13
6         APPLE    2015-10-10     3000    2015-10-30   1        13
6         APPLE    2015-10-10     3000    2015-10-30   1        13
6         APPLE    2015-10-10     3000    2015-10-30   1        13
6         APPLE    2015-10-10     3000    2015-10-30   1        13
6         APPLE    2015-10-10     3000    2015-10-30   1        13
6         APPLE    2015-10-10     3000    2015-10-30   1        13
6         APPLE    2015-10-10     3000    2015-10-30   1        13
6         APPLE    2015-10-10     3000    2015-10-30   1        13
2         APPLE    2016-01-15     430     NA           1        12
2         APPLE    2016-01-15     430     NA           1        12
2         APPLE    2016-01-15     430     NA           1        12
2         APPLE    2016-01-15     430     NA           1        12
2         APPLE    2016-01-15     430     NA           1        12
2         APPLE    2016-01-15     430     NA           1        12
2         APPLE    2016-01-15     430     NA           1        12
2         APPLE    2016-01-15     430     NA           1        12
2         APPLE    2016-01-15     430     NA           1        12
2         APPLE    2016-01-15     430     NA           1        12
2         APPLE    2016-01-15     430     NA           1        12
2         APPLE    2016-01-15     430     NA           1        12
NA        Samsung  2016-10-20     860     2016-12-20   1        3
NA        Samsung  2016-10-20     860     2016-12-20   1        3
NA        Samsung  2016-10-20     860     2016-12-20   1        3

第三,我想创建一个新列Lapse,我想要的结果是:

Category  Brand    Time1          value   Time2        number   rep   Lapse
2         HTC      2015-01-01     1724    NA           1        13    0   
2         HTC      2015-01-01     1724    NA           1        13    1   
2         HTC      2015-01-01     1724    NA           1        13    2
2         HTC      2015-01-01     1724    NA           1        13    3 
2         HTC      2015-01-01     1724    NA           1        13    4
2         HTC      2015-01-01     1724    NA           1        13    5
2         HTC      2015-01-01     1724    NA           1        13    6
2         HTC      2015-01-01     1724    NA           1        13    7 
2         HTC      2015-01-01     1724    NA           1        13    8
2         HTC      2015-01-01     1724    NA           1        13    9 
2         HTC      2015-01-01     1724    NA           1        13    10 
2         HTC      2015-01-01     1724    NA           1        13    11
2         HTC      2015-01-01     1724    NA           1        13    12
6         APPLE    2015-10-10     3000    2015-10-30   1        13    0
6         APPLE    2015-10-10     3000    2015-10-30   1        13    1
6         APPLE    2015-10-10     3000    2015-10-30   1        13    2
6         APPLE    2015-10-10     3000    2015-10-30   1        13    3
6         APPLE    2015-10-10     3000    2015-10-30   1        13    4
6         APPLE    2015-10-10     3000    2015-10-30   1        13    5
6         APPLE    2015-10-10     3000    2015-10-30   1        13    6
6         APPLE    2015-10-10     3000    2015-10-30   1        13    7
6         APPLE    2015-10-10     3000    2015-10-30   1        13    8
6         APPLE    2015-10-10     3000    2015-10-30   1        13    9 
6         APPLE    2015-10-10     3000    2015-10-30   1        13    10
6         APPLE    2015-10-10     3000    2015-10-30   1        13    11
6         APPLE    2015-10-10     3000    2015-10-30   1        13    12 
2         APPLE    2016-01-15     430     NA           1        12    0
2         APPLE    2016-01-15     430     NA           1        12    1
2         APPLE    2016-01-15     430     NA           1        12    2
2         APPLE    2016-01-15     430     NA           1        12    3
2         APPLE    2016-01-15     430     NA           1        12    4
2         APPLE    2016-01-15     430     NA           1        12    5 
2         APPLE    2016-01-15     430     NA           1        12    6 
2         APPLE    2016-01-15     430     NA           1        12    7
2         APPLE    2016-01-15     430     NA           1        12    8
2         APPLE    2016-01-15     430     NA           1        12    9
2         APPLE    2016-01-15     430     NA           1        12    10 
2         APPLE    2016-01-15     430     NA           1        12    11
NA        Samsung  2016-10-20     860     2016-12-20   1        3     0
NA        Samsung  2016-10-20     860     2016-12-20   1        3     1
NA        Samsung  2016-10-20     860     2016-12-20   1        3     2

以上是我想要的结果,我尝试这样的代码:

data[,Lapse := seq_len(.N)-1, by = (Category,Brand,Time1,value,Time2,number)]   

然而,这是错误的。

如果是的话,

uniqie(data$Lapse) 
[1] 0 1 2 3 4 5 6 7 8 9 10 11 12 

但是,我得到了0~999。我认为我的代码是错误的 有什么建议吗? 或者也许有其他好方法可以这样做?

enter image description here

更新

data <- "    Category        Brand Time1 value Time2 number
1:        2          HTC    2015-01-01    1724       NA    1
2:        2        APPLE    2015-01-01    2946       NA    1
3:        2      INFOCUS    2015-01-01     330       NA    1
4:        2        APPLE    2015-01-01    2946 2015-01-03    1
5:        2        APPLE    2015-01-01    2946       NA    1
6:        2      Samsung    2015-01-01    2189       NA    1
7:        2          HTC    2015-01-01     730       NA    1
8:        2      Samsung    2015-01-01    2189       NA    1
9:        2      Samsung    2015-01-01    2189       NA    1
10:        2          HTC    2015-01-01    1296       NA    1
11:        2          HTC    2015-01-01     730       NA    1
12:        2        APPLE    2015-01-01    2189       NA    1
13:        2      INFOCUS    2015-01-01     330 2015-01-02    1
14:        2          HTC    2015-01-01    2189       NA    1
15:        2         SONY    2015-01-01    1296       NA    1
16:        2          HTC    2015-01-01     730       NA    1
17:        2        APPLE    2015-01-01    2946       NA    1
18:        2        APPLE    2015-01-01    2946       NA    1
19:        2          HTC    2015-01-01    1724       NA    1
20:        2      Samsung    2015-01-02    1724       NA    1
21:        2      Samsung    2015-01-02    2189       NA    1
22:        2          HTC    2015-01-02     730       NA    1
23:        2      Samsung    2015-01-02    2189       NA    1
24:        2          HTC    2015-01-02     730       NA    1
25:        2        APPLE    2015-01-02    2946       NA    1
26:        2          HTC    2015-01-02    1724       NA    1
27:        2          HTC    2015-01-02     730       NA    1
28:        2         ASUS    2015-01-02     330       NA    1
29:        2         ASUS    2015-01-02     330       NA    1
30:        2      Samsung    2015-01-02    1724       NA    1
31:        2        APPLE    2015-01-02    2189       NA    1
32:        2          HTC    2015-01-02     730       NA    1
33:        2      Samsung    2015-01-02     730       NA    1
34:        2          HTC    2015-01-02     730       NA    1
35:        2          HTC    2015-01-02     730       NA    1
36:        2          HTC    2015-01-02     730       NA    1
37:        2      Samsung    2015-01-02     730       NA    1
38:        2        APPLE    2015-01-03    2189       NA    1
39:        2        APPLE    2015-01-03    2946       NA    1
40:        2       Benten    2015-01-03     330       NA    1
41:        2        APPLE    2015-01-03    2946       NA    1
42:        2      INFOCUS    2015-01-03     330       NA    1
43:        2      Samsung    2015-01-03    1296       NA    1
44:        2          HTC    2015-01-03     730       NA    1
45:        2      Samsung    2015-01-03    2189       NA    1
46:        2         SONY    2015-01-03    2189       NA    1
47:        2 TaiwanMobile    2015-01-03     730       NA    1
48:        2          HTC    2015-01-03    1296       NA    1
49:        2          HTC    2015-01-03     730       NA    1
50:        2        APPLE    2015-01-03    2189       NA    1
51:        2        APPLE    2015-01-03    2189       NA    1
52:        2          HTC    2015-01-03     730       NA    1
53:        2      Samsung    2015-01-03     330       NA    1
54:        2 TaiwanMobile    2015-01-03     730       NA    1
55:        2          HTC    2015-01-03     730       NA    1
56:        2          HTC    2015-01-03     730       NA    1
57:        2 TaiwanMobile    2015-01-03     330       NA    1
58:        2      Samsung    2015-01-03    1724 2015-01-04    1
59:        2          HTC    2015-01-03     730       NA    1
60:        2      INFOCUS    2015-01-03     330       NA    1
61:        2         SONY    2015-01-03     730       NA    1
62:        2          HTC    2015-01-04     730       NA    1
63:        2          HTC    2015-01-04     730       NA    1
64:        2        APPLE    2015-01-04    2189 2015-01-05    1
65:        2 TaiwanMobile    2015-01-04     730 2015-01-05    1"  

data <- read.table(text=data, header = TRUE)
data <- as.data.table(data)
data <- data[,rep:=ifelse(year(Time1)==2016, 12-month(Time1)+1, 13)][rep(1:.N,rep)][]
data[, Lapse := seq_len(.N)-1 , .(Category, Brand, Time1, value, Time2, number)]

dput(droplevels(head(data,65)))
structure(list(Category = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Brand = c("HTC", "APPLE", 
"INFOCUS", "APPLE", "APPLE", "Samsung", "HTC", "Samsung", "Samsung", 
"HTC", "HTC", "APPLE", "INFOCUS", "HTC", "SONY", "HTC", "APPLE", 
"APPLE", "HTC", "Samsung", "Samsung", "HTC", "Samsung", "HTC", 
"APPLE", "HTC", "HTC", "ASUS", "ASUS", "Samsung", "APPLE", "HTC", 
"Samsung", "HTC", "HTC", "HTC", "Samsung", "APPLE", "APPLE", 
"Benten", "APPLE", "INFOCUS", "Samsung", "HTC", "Samsung", "SONY", 
"TaiwanMobile", "HTC", "HTC", "APPLE", "APPLE", "HTC", "Samsung", 
"TaiwanMobile", "HTC", "HTC", "TaiwanMobile", "Samsung", "HTC", 
"INFOCUS", "SONY", "HTC", "HTC", "APPLE", "TaiwanMobile"), Time1 = structure(c(1420070400, 
1420070400, 1420070400, 1420070400, 1420070400, 1420070400, 1420070400, 
1420070400, 1420070400, 1420070400, 1420070400, 1420070400, 1420070400, 
1420070400, 1420070400, 1420070400, 1420070400, 1420070400, 1420070400, 
1420156800, 1420156800, 1420156800, 1420156800, 1420156800, 1420156800, 
1420156800, 1420156800, 1420156800, 1420156800, 1420156800, 1420156800, 
1420156800, 1420156800, 1420156800, 1420156800, 1420156800, 1420156800, 
1420243200, 1420243200, 1420243200, 1420243200, 1420243200, 1420243200, 
1420243200, 1420243200, 1420243200, 1420243200, 1420243200, 1420243200, 
1420243200, 1420243200, 1420243200, 1420243200, 1420243200, 1420243200, 
1420243200, 1420243200, 1420243200, 1420243200, 1420243200, 1420243200, 
1420329600, 1420329600, 1420329600, 1420329600), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), value = c(1724, 2946, 330, 2946, 
2946, 2189, 730, 2189, 2189, 1296, 730, 2189, 330, 2189, 1296, 
730, 2946, 2946, 1724, 1724, 2189, 730, 2189, 730, 2946, 1724, 
730, 330, 330, 1724, 2189, 730, 730, 730, 730, 730, 730, 2189, 
2946, 330, 2946, 330, 1296, 730, 2189, 2189, 730, 1296, 730, 
2189, 2189, 730, 330, 730, 730, 730, 330, 1724, 730, 330, 730, 
730, 730, 2189, 730), Time2 = structure(c(NA, NA, NA, 1420243200, 
NA, NA, NA, NA, NA, NA, NA, NA, 1420156800, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, 1420329600, NA, NA, NA, NA, NA, 1420416000, 
1420416000), class = c("POSIXct", "POSIXt"), tzone = "UTC"), 
number = c("1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
"1", "1", "1", "1", "1", "1", "1")), .Names = c("Category", 
"Brand", "Time1", "value", "Time2", "number"), row.names = c(NA, 
-65L), .internal.selfref = <pointer: 0x003e24a0>, class = c("data.table", 
"data.frame"))

问题是结果很奇怪。

unique(data$Lapse)
[1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
[38] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73
[75] 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90

1 个答案:

答案 0 :(得分:1)

问题是原始数据没有更新,因为它具有与以前相同的行数。如果我们检查输出

data[,rep:=ifelse(year(Time1)==2016, 12-month(Time1)+1, 13)][rep(1:.N,rep)]

然后

data

很明显。

因此,我们将两步的输出分配回原始对象(&#39;数据&#39;)或另一个对象(如果我们不想更改原始对象)

data <-  data[,rep:=ifelse(year(Time1)==2016, 12-month(Time1)+1, 13)][rep(1:.N,rep)]

并创建&#39; Lapse&#39;基于按组排序的列

data[, Lapse := seq_len(.N)-1 , .(Category, Brand, Time1, value, Time2, number)]
data
    Category   Brand      Time1 value      Time2 number rep Lapse
 1:        2     HTC 2015-01-01  1724       <NA>      1  13     0
 2:        2     HTC 2015-01-01  1724       <NA>      1  13     1
 3:        2     HTC 2015-01-01  1724       <NA>      1  13     2
 4:        2     HTC 2015-01-01  1724       <NA>      1  13     3
 5:        2     HTC 2015-01-01  1724       <NA>      1  13     4
 6:        2     HTC 2015-01-01  1724       <NA>      1  13     5
 7:        2     HTC 2015-01-01  1724       <NA>      1  13     6
 8:        2     HTC 2015-01-01  1724       <NA>      1  13     7
 9:        2     HTC 2015-01-01  1724       <NA>      1  13     8
10:        2     HTC 2015-01-01  1724       <NA>      1  13     9
11:        2     HTC 2015-01-01  1724       <NA>      1  13    10
12:        2     HTC 2015-01-01  1724       <NA>      1  13    11
13:        2     HTC 2015-01-01  1724       <NA>      1  13    12
14:        6   APPLE 2015-10-10  3000 2015-10-30      1  13     0
15:        6   APPLE 2015-10-10  3000 2015-10-30      1  13     1
16:        6   APPLE 2015-10-10  3000 2015-10-30      1  13     2
17:        6   APPLE 2015-10-10  3000 2015-10-30      1  13     3
18:        6   APPLE 2015-10-10  3000 2015-10-30      1  13     4
19:        6   APPLE 2015-10-10  3000 2015-10-30      1  13     5
20:        6   APPLE 2015-10-10  3000 2015-10-30      1  13     6
21:        6   APPLE 2015-10-10  3000 2015-10-30      1  13     7
22:        6   APPLE 2015-10-10  3000 2015-10-30      1  13     8
23:        6   APPLE 2015-10-10  3000 2015-10-30      1  13     9
24:        6   APPLE 2015-10-10  3000 2015-10-30      1  13    10
25:        6   APPLE 2015-10-10  3000 2015-10-30      1  13    11
26:        6   APPLE 2015-10-10  3000 2015-10-30      1  13    12
27:        2   APPLE 2016-01-15   430       <NA>      1  12     0
28:        2   APPLE 2016-01-15   430       <NA>      1  12     1
29:        2   APPLE 2016-01-15   430       <NA>      1  12     2
30:        2   APPLE 2016-01-15   430       <NA>      1  12     3
31:        2   APPLE 2016-01-15   430       <NA>      1  12     4
32:        2   APPLE 2016-01-15   430       <NA>      1  12     5
33:        2   APPLE 2016-01-15   430       <NA>      1  12     6
34:        2   APPLE 2016-01-15   430       <NA>      1  12     7
35:        2   APPLE 2016-01-15   430       <NA>      1  12     8
36:        2   APPLE 2016-01-15   430       <NA>      1  12     9
37:        2   APPLE 2016-01-15   430       <NA>      1  12    10
38:        2   APPLE 2016-01-15   430       <NA>      1  12    11
39:       NA Samsung 2016-10-20   860 2016-12-20      1   3     0
40:       NA Samsung 2016-10-20   860 2016-12-20      1   3     1
41:       NA Samsung 2016-10-20   860 2016-12-20      1   3     2