How to add a column containing the number embedded in each data label

Time: 2017-11-30 15:09:30

Tags: r

I am trying to add, for each row, the number that belongs to that row's data label.

My data looks like this:

df <- structure(list(data = structure(c(1L, 1L, 1L, 1L, 1L, 3L, 3L, 
4L, 4L, 5L, 5L, 6L, 5L, 7L, 7L, 8L, 8L, 2L, 2L, 2L), .Label = c("data1", 
"data10", "data2", "data3", "data4", "data5", "data6", "data7"
), class = "factor"), values = structure(c(3L, 8L, 18L, 1L, 15L, 
17L, 19L, 7L, 2L, 2L, 11L, 10L, 6L, 4L, 9L, 12L, 14L, 5L, 13L, 
16L), .Label = c("112864.443", "11319531", "12874.443", "142983324", 
"1612410048", "16349475.63", "184901841", "2223793.8", "30553282.01", 
"312004.547", "3135868.44", "317403612.9", "3686081.063", "43701608", 
"623793.8", "64959501.42", "67666215", "767666215", "775987137.8"
), class = "factor")), .Names = c("data", "values"), class = "data.frame", row.names = c(NA, 
-20L))

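I want to get the exact number that follows the "data" prefix in the first column. Because the labels do not appear in consecutive order, I don't know how to add those numbers as a separate column. The desired output should look like this: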

data    values  
data1   12874.443   1
data1   2223793.8   1
data1   767666215   1
data1   112864.443  1
data1   623793.8    1
data2   67666215    2
data2   775987137.8 2
data3   184901841   3
data3   11319531    3
data4   11319531    4
data4   3135868.44  4
data5   312004.547  5
data4   16349475.63 4
data6   142983324   6
data6   30553282.01 6
data7   317403612.9 7
data7   43701608    7
data10  1612410048  10
data10  3686081.063 10
data10  64959501.42 10

1 answer:

Answer 0: (score: 0)

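One approach is to use gsub to remove every non-digit character from the label and add the result as another column. Note that gsub returns a character vector, so wrap the call in as.numeric() if you need a numeric column.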

df$label <- gsub("[^[:digit:]]", "", df$data)
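
As a minimal sketch of the gsub approach, the snippet below runs it on a small stand-in vector (`labels` is a made-up example, not the asker's data) and shows the conversion from character to numeric:

```r
# Hypothetical stand-in for df$data: a factor of labels such as "data1", "data10"
labels <- factor(c("data1", "data2", "data10"))

# gsub() coerces the factor to character and deletes every non-digit character
digits_chr <- gsub("[^[:digit:]]", "", labels)

# The result is still text, so convert it before sorting or doing arithmetic
digits_num <- as.numeric(digits_chr)

print(digits_chr)
print(digits_num)
```

Sorting the character version would place "10" before "2", which is why the conversion is usually worthwhile.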

Another approach is to use str_extract, as in this question: R: split character data into numbers and letters

library(stringr)
df$label <- as.numeric(str_extract(df$data, "[0-9]+"))



 > df
   #      data      values label
   # 1   data1   12874.443     1
   # 2   data1   2223793.8     1
   # 3   data1   767666215     1
   # 4   data1  112864.443     1
   # 5   data1    623793.8     1
   # 6   data2    67666215     2
   # 7   data2 775987137.8     2
   # 8   data3   184901841     3
   # 9   data3    11319531     3
   # 10  data4    11319531     4
   # 11  data4  3135868.44     4
   # 12  data5  312004.547     5
   # 13  data4 16349475.63     4
   # 14  data6   142983324     6
   # 15  data6 30553282.01     6
   # 16  data7 317403612.9     7
   # 17  data7    43701608     7
   # 18 data10  1612410048    10
   # 19 data10 3686081.063    10
   # 20 data10 64959501.42    10
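
The two approaches give the same result for this data, but they can differ when a label contains more than one group of digits. The sketch below illustrates this with base R only. The label "data1_v2" is an invented example, and regmatches() with regexpr() is swapped in here as a stand-in for str_extract, keeping only the first run of digits:

```r
# Invented labels; the last one has two separate digit groups
labels <- c("data1", "data10", "data1_v2")

# Approach 1: delete all non-digits, which glues the digit groups together
all_digits <- as.numeric(gsub("[^[:digit:]]", "", labels))

# Approach 2 (base R stand-in for str_extract): keep only the first run of digits
first_run <- as.numeric(regmatches(labels, regexpr("[0-9]+", labels)))

print(data.frame(labels, all_digits, first_run))
```

For "data1_v2" the first approach yields 12 and the second yields 1. One caveat of this stand-in: regmatches() silently drops elements that have no match, whereas str_extract returns NA for them, so the base R version is only safe when every label contains at least one digit.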