Changing the digits in a string

Time: 2019-03-21 16:54:45

Tags: r

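I have a huge data frame. It contains repeated IDs that actually belong to different observations. What I want to do is change the last one or two digits in the "ID" column so that the numbering continues within each ID prefix. For example, the ID alnfru_00001 in row 5 should be alnfru_00005, row 34 should be carlin_00005 instead of carlin_00001, and so on. The same pattern recurs throughout the data frame, including for other IDs. Can anyone offer some help?

Should I group the data first?

Note: I do not want the trailing digits to simply follow the row number.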

           ID          PFT        area
    1    alnfru_00001 alnfru Yukon_Delta
    2    alnfru_00002 alnfru Yukon_Delta
    3    alnfru_00003 alnfru Yukon_Delta
    4    alnfru_00004 alnfru Yukon_Delta
    5    alnfru_00001 alnfru Yukon_Delta
    6    alnfru_00002 alnfru Yukon_Delta
    7    alnfru_00003 alnfru Yukon_Delta
    8    alnfru_00004 alnfru Yukon_Delta
    9    alnfru_00005 alnfru Yukon_Delta
    26   calcan_00001 calcan Yukon_Delta
    27   calcan_00002 calcan Yukon_Delta
    28   calcan_00003 calcan Yukon_Delta
    29   calcan_00004 calcan Yukon_Delta
    30   carlin_00001 carlin Yukon_Delta
    31   carlin_00002 carlin Yukon_Delta
    32   carlin_00003 carlin Yukon_Delta
    33   carlin_00004 carlin Yukon_Delta
    34   carlin_00001 carlin Yukon_Delta
    18   alnfru_00001 alnfru Yukon_Delta
    19   alnfru_00002 alnfru Yukon_Delta
    20   alnfru_00003 alnfru Yukon_Delta
    21   alnfru_00004 alnfru Yukon_Delta
    22   alnfru_00001 alnfru Yukon_Delta
    23   alnfru_00002 alnfru Yukon_Delta
    24   alnfru_00003 alnfru Yukon_Delta
    25   alnfru_00004 alnfru Yukon_Delta

The data frame should look like this:

        ID          PFT        area
1    alnfru_00001 alnfru Yukon_Delta
2    alnfru_00002 alnfru Yukon_Delta
3    alnfru_00003 alnfru Yukon_Delta
4    alnfru_00004 alnfru Yukon_Delta
5    alnfru_00005 alnfru Yukon_Delta
6    alnfru_00006 alnfru Yukon_Delta
7    alnfru_00007 alnfru Yukon_Delta
8    alnfru_00008 alnfru Yukon_Delta
9    alnfru_00009 alnfru Yukon_Delta
26   calcan_00001 calcan Yukon_Delta
27   calcan_00002 calcan Yukon_Delta
28   calcan_00003 calcan Yukon_Delta
29   calcan_00004 calcan Yukon_Delta
30   carlin_00001 carlin Yukon_Delta
31   carlin_00002 carlin Yukon_Delta
32   carlin_00003 carlin Yukon_Delta
33   carlin_00004 carlin Yukon_Delta
34   carlin_00005 carlin Yukon_Delta
18   alnfru_00010 alnfru Yukon_Delta
19   alnfru_00011 alnfru Yukon_Delta
20   alnfru_00012 alnfru Yukon_Delta
21   alnfru_00013 alnfru Yukon_Delta
22   alnfru_00014 alnfru Yukon_Delta
23   alnfru_00015 alnfru Yukon_Delta
24   alnfru_00016 alnfru Yukon_Delta
25   alnfru_00017 alnfru Yukon_Delta

1 answer:

Answer 0: (score: 2)

You can do:

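library(dplyr)

df %>%
  # count observations separately within each PFT/area combination
  group_by(PFT, area) %>%
  mutate(
    # convert first in case ID is stored as a factor
    ID = as.character(ID),
    # drop as many trailing characters as the counter has digits,
    # then append the within-group counter
    ID = paste0(substr(ID, 1, nchar(ID) - nchar(row_number())), row_number()))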

Output:

             ID    PFT        area
1  alnfru_00001 alnfru Yukon_Delta
2  alnfru_00002 alnfru Yukon_Delta
3  alnfru_00003 alnfru Yukon_Delta
4  alnfru_00004 alnfru Yukon_Delta
5  alnfru_00005 alnfru Yukon_Delta
6  alnfru_00006 alnfru Yukon_Delta
7  alnfru_00007 alnfru Yukon_Delta
8  alnfru_00008 alnfru Yukon_Delta
9  alnfru_00009 alnfru Yukon_Delta
10 calcan_00001 calcan Yukon_Delta
11 calcan_00002 calcan Yukon_Delta
12 calcan_00003 calcan Yukon_Delta
13 calcan_00004 calcan Yukon_Delta
14 carlin_00001 carlin Yukon_Delta
15 carlin_00002 carlin Yukon_Delta
16 carlin_00003 carlin Yukon_Delta
17 carlin_00004 carlin Yukon_Delta
18 carlin_00005 carlin Yukon_Delta
19 alnfru_00010 alnfru Yukon_Delta
20 alnfru_00011 alnfru Yukon_Delta
21 alnfru_00012 alnfru Yukon_Delta
22 alnfru_00013 alnfru Yukon_Delta
23 alnfru_00014 alnfru Yukon_Delta
24 alnfru_00015 alnfru Yukon_Delta
25 alnfru_00016 alnfru Yukon_Delta
26 alnfru_00017 alnfru Yukon_Delta
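
The same renumbering can also be sketched in base R, without dplyr. This is only an illustration under stated assumptions: every ID has the form prefix, underscore, digits; the counter is always padded to five digits; and the small df below is made-up sample data, not the asker's data frame. It uses ave() to build a running counter within each PFT/area group and sprintf() to rebuild the ID.

```r
# Hypothetical sample data in the same layout as the question
df <- data.frame(
  ID   = c("alnfru_00001", "alnfru_00002", "alnfru_00001",
           "carlin_00001", "carlin_00001", "alnfru_00001"),
  PFT  = c("alnfru", "alnfru", "alnfru", "carlin", "carlin", "alnfru"),
  area = "Yukon_Delta",
  stringsAsFactors = FALSE
)

# Running counter within each PFT/area group, in the existing row order
n <- ave(seq_len(nrow(df)), df$PFT, df$area, FUN = seq_along)

# Strip the trailing "_digits" part (assumed ID format), keeping the prefix
prefix <- sub("_[0-9]+$", "", as.character(df$ID))

# Rebuild the ID as prefix, underscore, five-digit zero-padded counter
df$ID <- sprintf("%s_%05d", prefix, n)

print(df)
```

Because the counter is zero-padded by sprintf() instead of being pasted over the old digits, this variant does not depend on how many digits the original ID happened to carry, only on the assumed five-digit width.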